Algebraic Methods II: Theory, Tools and Applications

Lecture Notes in Computer Science Edited by G. Goos and J. Hartmanis

490

J.A. Bergstra L.M.G. Feijs (Eds.)

Algebraic Methods II: Theory, Tools and Applications

Springer-Verlag Berlin Heidelberg New York London Paris Tokyo Hong Kong Barcelona Budapest

Editorial Board

D. Barstow, W. Brauer, P. Brinch Hansen, D. Gries, D. Luckham, C. Moler, A. Pnueli, G. Seegmüller, J. Stoer, N. Wirth

Volume Editors

Jan A. Bergstra, Department of Computer Science, University of Amsterdam, P.O. Box 41882, 1009 DB Amsterdam, The Netherlands

Loe M.G. Feijs, Philips Research Laboratories, P.O. Box 80.000, 5600 JA Eindhoven, The Netherlands

CR Subject Classification (1991): C.2.2, D.1-3, F.3

ISBN 3-540-53912-3 Springer-Verlag Berlin Heidelberg New York
ISBN 0-387-53912-3 Springer-Verlag New York Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its current version, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1991 Printed in Germany Printing and binding: Druckhaus Beitz, Hemsbach/Bergstr. 2145/3140-54-3210- Printed on acid-free paper

Preface

This volume originates from a workshop organized by ESPRIT project no. 432 METEOR in Mierlo, The Netherlands, September 1989. The workshop was a successor to an earlier one held in Passau, Germany, June 9-11 1987, the proceedings of which have been published as Lecture Notes in Computer Science Volume 394.

At the workshop, six invited talks were given by A. Finkelstein, C.B. Jones, P. Klint, C.A. Middelburg, E.-R. Olderog and H.A. Partsch.

The program committee consisted of

M. Wirsing, H. Perdrix, J.A. Bergstra, J.C.M. Baeten, L.M.G. Feijs, J. Hagelstein, F. Ponsaert, M.-C. Gaudel, R. Zicari.

This volume contains five invited contributions and ten papers by the METEOR team based on talks that were presented at the workshop. The invited talk of Jones led to a paper by Feijs on modularizing the formal description of a database which has been included as well.

The program committee would like to thank P. Wodon (project leader of METEOR), A. Bradier (ESPRIT project officer) and PRLE (organizer of the workshop). The financial support of the following partners of the METEOR project

Philips Research Laboratories Brussels, Philips Research Laboratories Eindhoven, Compagnie Générale d'Electricité, LRI - Université Paris-Sud, ATT & Philips Telecommunications, Centrum voor Wiskunde en Informatica, TXT, Politecnico di Milano, Universität Passau

is gratefully acknowledged.

Finally, as the editors of the volume we would like to thank R.D. van den Bos for his initiative and help in preparing this volume and Springer-Verlag for their excellent cooperation concerning the publication of this volume.

Eindhoven, January 1991 Jan A. Bergstra, Loe M. G. Feijs

Table of Contents

Introduction

Part I. Invited Contributions

Introduction

Formalizing Informal Requirements. Some Aspects
N.W.P. van Diepen, H.A. Partsch

Viewpoint Oriented Software Development: Methods and Viewpoints in Requirements Engineering
A. Finkelstein, M. Goedicke, J. Kramer, C. Niskier

Using Transformations to Verify Parallel Programs
E.-R. Olderog, K.R. Apt

Experiences with Combining Formalisms in VVSL
C.A. Middelburg

A Meta-environment for Generating Programming Environments
P. Klint

Part II. Requirements and Design

Introduction

Introducing Formal Requirements into Industry
J. Hagelstein, F. Ponsaert

Where can I Get Gas Round Here? - An Application of a Design Methodology for Distributed Systems
R. Weber

Transformations of Designs
L.M.G. Feijs

Part III. COLD

Introduction

Norman's Database Modularised in COLD-K
L.M.G. Feijs

POLAR: A Picture-Oriented Language for Abstract Representations
R.D. van den Bos, L.M.G. Feijs, R.C. van Ommering

Inheritance in COLD
H.B.M. Jonkers

A Process Specification Formalism Based on Static COLD
J.C.M. Baeten, J.A. Bergstra, S. Mauw, G.J. Veltink

Part IV. Algebraic Specification

Introduction

Specification of the Transit Node in PSFd
S. Mauw, F. Wiedijk

Design of a Specification Language by Abstract Syntax Engineering
J.C.M. Baeten, J.A. Bergstra

From an ERAE Requirements Specification to a PLUSS Algebraic Specification: A Case Study
A. Mauboussin, H. Perdrix, M. Bidoit, M.-C. Gaudel, J. Hagelstein

Subject Index

Introduction

This volume is divided into four parts. Part I contains the invited lectures. These lectures cover a variety of topics ranging from requirements engineering and transformational design to the construction of programming environments and the design of wide-spectrum languages. Parts II, III and IV contain papers from the METEOR team.

The rationale for the grouping of papers is the following: as COLD is a major result of the METEOR project all information about COLD has been collected in one part (III). COLD is an algebraic technique because it starts out from sorts, functions and algebras. It extends the conventional algebraic paradigm by incorporating features from sequential imperative programming, dynamic logic and first and second order predicate logic.

Because conventional algebraic specification techniques based on equational logic have played a key role in METEOR, contributions in that area have been collected in a single part as well.

Part II collects papers on topics that were of secondary, but still vital importance to METEOR: requirements engineering, design and transformation.

PART I

Invited Contributions

Introduction

Formalizing Informal Requirements. Some Aspects

1 Introduction
2 Requirements specification
3 Formal specification
4 The process of formalization
5 Conclusions
References

Viewpoint Oriented Software Development: Methods and Viewpoints in Requirements Engineering

1 Introduction
2 Method support for requirements formalisation
3 Incremental development of formal specifications
4 Tool support for requirements engineering
5 Modelling requirements elicitation
6 Concept of viewpoint
7 An outline of a simple example
8 Conclusions
Acknowledgements
References

Using Transformations to Verify Parallel Programs

1 Introduction
2 Preliminaries
3 Transformations
4 Asynchronous fixed point computation
5 Parallel zero search
Appendix
References

Experiences with Combining Formalisms in VVSL

1 Introduction
2 VVSL: the VDM Specification Language
3 VVSL: combining VDM and temporal logic
4 VVSL: the language of temporal logic
5 Transforming VVSL to COLD-K
6 Transforming VVSL to the language of MPLω
7 COLD-K extensions
8 Transforming temporal formulae
9 Transforming definitions of (non-atomic) operations
10 Experiences with the application of VVSL
11 Conclusions and final remarks
Acknowledgements
References
Appendix

A Meta-environment for Generating Programming Environments

1 Introduction
2 ASF+SDF
3 Global organization of a meta-environment for ASF+SDF
4 The representation of logical syntax
5 Looking inside the generic syntax-directed editor
6 Editing in the meta-environment
7 Implementation techniques
8 Concluding remarks
Acknowledgements
Literature

Introduction

We briefly survey the invited papers.

Van Diepen & Partsch discuss the formalisation of informal requirements acquisition. The discussion is based on a case study and leads to requirements on formalisms for requirements definition and a description of formalization as a process.

Finkelstein et al. introduce viewpoint oriented software development. Their method is technically based on FOREST and its underlying mathematical foundation MAL (Modal Action Logic). To these ideas it adds so-called structured common sense (SCS). The method is illustrated by examples.

Klint describes a meta-environment for generating programming environments. His work was done in ESPRIT project no. 348 GIPE. The environment is a part of the computer system that incorporates formalisms and subsystems such as TYPOL, ASF, SDF and METAL. In particular Klint describes the environment generator for ASF and SDF, where ASF constitutes a Spartan syntax for structured algebraic specifications to which SDF adds a significant amount of user-oriented syntactic freedom.

The paper "Using Transformations to Verify Parallel Programs" by K. Apt and E.-R. Olderog addresses the construction of parallel programs that formally satisfy a pre- and postcondition style specification. The approach is to use program transformations, which lead to significant simplifications and which can be used in combination with the proof method of Owicki and Gries.

The paper of C. Middelburg reports on the integration of language concepts into a wide-spectrum language. He combines VDM with a language of temporal logic. There are several links with METEOR here. First, the role of temporal logic has been investigated for requirements engineering in METEOR; in particular, ERAE is also based on temporal logic. Secondly, VVSL reflects strong influences from COLD; from COLD it gets its modularization and parameterization mechanisms. Also, the way of translating VVSL to MPLω is derived from the formal semantics of COLD. Finally, the integration of languages and concepts into a wide-spectrum language is very difficult and will be a research topic for the near future.

Formalizing Informal Requirements Some Aspects

N.W.P. van Diepen H.A. Partsch

University of Nijmegen* Department of Computer Science

Toernooiveld 1 6525 ED Nijmegen The Netherlands

Abstract

Formal specifications are nowadays considered as an important intermediate stage in the software development process. There are various approaches for constructing an efficient program satisfying a given formal specification. The formalization process, however, has not yet been investigated as thoroughly. Thus, it is still one of the main sources for inconsistencies between the wishes of the customer and the program finally delivered. Some problems to be solved during formalization are identified and illustrated with a real-world example.

1 Introduction

In its widest sense, software development means

"given a problem, find a program (or a set of programs) that (efficiently) solves the problem"

where program may be taken as synonymous with software. The major difficulty in software development is caused by the fact that the original problem description usually consists of a bunch of half-baked wishes which are neither precise nor detailed, nor even complete. The program, by nature, has to be precisely defined and fully detailed up to each single instruction. It is obvious that software development done in one large step to bridge the huge gap between these extreme positions is doomed to fail, i.e., the resulting software probably does not work as expected.

There are various reasons why software might not work properly. Very often, the problem given originally was simply misunderstood or misinterpreted. Therefore, it is widely accepted today that the process of software development should be broken into smaller, manageable, steps in the framework of so-called "life cycle models". A minimum requirement is a decomposition into two steps (frequently called "requirements engineering" and "program construction") with a precise, possibly formal, statement of the problem as an intermediate stage (cf., e.g., also [Balzer et al. 83], [Agresti 86], [Bauer et al. 89]).

Such a formal problem specification states precisely and unambiguously the "task" to be fulfilled by the software, i.e., it describes what the problem is without giving a direct solution or even the details about its implementation. Additionally, it entails a "separation of concerns" which allows early checks on whether the informal wishes are properly reflected and thus prevents superfluous implementation work.

There are various approaches focusing on the program construction part of this development paradigm, viz. how to construct an efficient program that satisfies a given formal specification, e.g., by transformations (for overviews, see [Partsch Steinbrüggen 83], [Feather 86]), or assertional techniques ([Dijkstra 76], [Gries 81], [Backhouse 86]). The requirements engineering part, although at least as important, has not yet been investigated as thoroughly in the context of these new approaches and paradigms. However, a lot of work in this area has been done within traditional software engineering. Therefore, in the following section we attempt to shed some light on the problems to be encountered from a somewhat wider viewpoint. In section 3 we then will touch upon the particular problems of formal specification, and in section 4 we introduce some ideas on how to obtain formal specifications from informal problem statements. Section 5 contains some concluding remarks.

*Support has been received from the Netherlands Organization for Scientific Research N.W.O. under grant NF 63/62-518 (the STOP project: Specification and Transformation Of Programs).

2 Requirements specification

In traditional software engineering, a problem specification is usually called a requirements specification. It is defined as ([IEEE 83]):

"A specification that sets forth the requirements for a system or system component; for example, a software configuration item. Typically included are functional requirements, interface requirements, design requirements, and development standards."

where in turn requirement is defined by

1. "A condition or capability needed by a user to solve a problem or achieve an objective."

2. "A condition or capability that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed document. The set of all requirements forms the basis for subsequent development of the system or system component."

From practice it is known that requirements appear as a huge, unstructured and unreflected mass of information that has to be analysed, organized and documented in a suitable way. To deal with this mass, more has to be known about requirements and requirements specification, e.g.,

• what are the contents of the requirements specification, i.e., what different kinds of requirements can be found,

• which general properties should a formalism for requirements specification satisfy, and

• how should one proceed to obtain a (formal) requirements specification that properly reflects the original intentions.

These aspects will be dealt with in sections 2.2 through 2.4. For illustration purposes, section 2.1 introduces a non-trivial example from practice, which will be used throughout this paper.

2.1 A practical example

The nature of informal specifications bears the risk of writing in broad generalizations without any technical depth. Hence we have chosen a real-world example, both to focus our treatment and to illustrate our views. This example, the so-called Swiss system, will be described in some detail in the remainder of this subsection. It has been selected because a real-world problem helps in keeping a fairly unbiased view of the subject. Furthermore, a good informal specification of this problem is known: it is not too lengthy, yet sufficiently complete for the purpose of human application. Also, it is not one of the "insider problems" from computing science, which would cloud the discussion with various standard solutions.

2.1.1 The Swiss system

The Swiss system is a tournament system designed to allow many participants to play a chess tournament in a limited number of rounds. It avoids the drawbacks both of round-robin or "all play all" tournaments, familiar from most national soccer championships (i.e., limited capacity or long duration), and of knock-out tournaments, known from tennis (i.e., fast dropouts). The system was introduced in 1895 by Dr. J. Müller in Zürich. Since then it has been used in many variations at chess tournaments, and (sometimes adapted to the circumstances) at bridge, draughts and go tournaments as well. The basic idea is as follows:

1. in every round, each player is paired with an opponent with an equal score (or as nearly equal as possible);

2. two participants are paired at most once;

3. after a predetermined number of rounds the leading player wins.

So, in round one, some random pairing is made. In round two all the winners play each other. The same holds for the players with a draw, and the losers. If there is an odd number of winners one of them plays a person with a draw (or a loser, but only when there are no people with drawn games). In round 3, players with 2 points play each other, players with 1½ points, etc. Again, if we have an odd number of players then one is selected to play someone of an adjoining group.
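The grouping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `score_groups` and `float_odd_players` are hypothetical, players are plain names, and the choice of which player moves to the adjoining group (here simply the last one in the list) is a simplification that the informal description leaves open.

```python
from collections import defaultdict


def score_groups(scores):
    """Group player names by score, highest score first.

    `scores` maps a player name to the points collected so far.
    """
    groups = defaultdict(list)
    for player, points in scores.items():
        groups[points].append(player)
    return [(points, sorted(groups[points]))
            for points in sorted(groups, reverse=True)]


def float_odd_players(groups):
    """Move one player from every odd-sized group to the adjoining lower group.

    Assumption: the last player in the group is the one who moves down.
    An odd player left in the lowest group stays there.
    """
    result = [list(players) for _, players in groups]
    for i in range(len(result) - 1):
        if len(result[i]) % 2 == 1:
            result[i + 1].insert(0, result[i].pop())
    return result


# Six players after round one, with no drawn games.
after_round_one = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0, "F": 0}
groups = score_groups(after_round_one)
even_groups = float_odd_players(groups)
```

With three winners and no draws, one winner is moved into the group of losers, so that every group can be paired internally.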

Many variations of the Swiss system exist, mainly to accommodate particular circumstances. For instance, in chess tournaments players heavily favour playing with the white pieces. Hence colour allocation is important to ensure fair competition. Or participants may not wish to play their own clubmates since they can do so at home.

Various attempts have been made to implement the pairing algorithm of the Swiss system. However, a really satisfactory solution is, to our knowledge, still non-existent. Van den Herik ([Herik 88]) recently reported on a partially unsuccessful attempt by some students, called ZORBA (for "Zwitsers Op Rating BAsis", i.e., Swiss on rating basis). Rather than being forced to keep to the original problem specification, they were able to influence the description of their version of the Swiss system during implementation. This has been done to allow for an easier implementation on the one hand, and to eliminate ambiguities in the description discovered during the implementation effort on the other. The effort of finding a better description even resulted in a new version of a rulebook for the Swiss system ([Gijssen Haggenburg 88]). Still, even under these rather optimal implementation conditions, some problems remained in the final version, mainly concerning the problem of finding pairings in extreme cases.

Problems similar to the Swiss system pairing problem have been studied in combinatorics (cf. [Polya et al. 83]), e.g., the problem of allocating people to jobs in the most efficient way. Unfortunately, for our problem, which can be seen as a generalization of the job allocation problem, there is no known efficient solution.

2.1.2 A rule set for the Swiss system

To focus our attention we have taken the description of the Ratings controlled Swiss system (U.S. Chess Federation form) from [Kažić 80] pp. 31-39, in a condensed form, leaving out some variants and exceptions. The essential rules for this version are:

0. Initial remarks. This Swiss system version assumes that all participants are given a rating, describing their playing strength. Unrated players are given a guessed rating, so it is assumed that each player is rated before the tournament, with ties decided at random. This rating order remains the same throughout the tournament and is used heavily in making pairings.

1. Pairing cards. A pairing card is made out for each player on which the tournament director records for each game the colour of the player's pieces, the opponent's name and identification number, the player's score in the game, and the player's cumulative tournament score.

2. Identification Numbers. After the entry list is closed, all pairing cards are arranged in the order of the players' ratings. Players with identical ratings are arranged by lot. Then the identification numbers of all players are entered on the pairing cards, starting with the highest-rated player as No. 1.

3. Byes. If the total number of players in any round of a tournament is uneven, one player is given a bye. A player must not be given a bye more than once. In the first round the bye is given to the player with the lowest official rating, in subsequent rounds to the lowest-ranked eligible player, rank in this case being determined first by score, then by official rating.

4. Scoring. The scoring is one point for a win or bye, one-half point for a draw, zero for a loss.

5. Basic Swiss system Laws. All pairings are subject to the following basic Swiss system Laws.

(a) A player must not be paired with any other player more than once.

(b) Players with equal scores must be paired if it is possible to do so.

(c) If it is impossible to pair all players with equal scores, every player who is not paired with an opponent with equal score must be paired with an opponent whose score is as close to his own as possible.

6. Pairing the first round. After the bye, if any, is given, the pairing cards are arranged in rating order and are divided into two equal groups. The players in the top half are paired in consecutive order with those in the bottom half. For example, if there are forty players, No. 1 is paired with No. 21, No. 2 with No. 22, etc.

7. Pairing subsequent rounds - score groups and rank. In these rules the term 'score group', or simply 'group', is used in reference to a group of players having the same score. Sometimes a group may consist of only one player. Individual 'rank' is determined first by score, then by rating order.

8. Order of pairing groups. In general, the order of pairing is from the group with the highest score down to the group with the lowest score. Occasionally the pairing of the lower score groups may have to be adjusted to conform to the basic Swiss system Laws, if many of the players in those groups have met before.

9. Method of pairing each score group. In the second and as many of the subsequent rounds as possible, the players are paired as follows:

(a) Any odd men are paired first as described in rules 10-12.

(b) Within each score group, after the odd man, if any, has been removed, the cards of the remaining players are arranged in rating order and divided into two equal sections. The players in the top half (with the higher ratings) are paired with those in the bottom half (with the lower ratings) in as close to consecutive order as possible. Transpositions in the bottom half of a group are made to make the pairing conform to the basic Swiss system Laws and to give as many players as possible their due colours (rules 15-17). If it is impossible to meet the two requirements just mentioned, one or two players in the top half may be interchanged with one or two players in the bottom half. Every effort should be made, however, to observe the principle of pairing the higher-rated against the lower-rated players.

Note. Directors differ somewhat in their exact methods for implementing this procedure, but any reasonable method, followed consistently, is acceptable.

10. Odd men. If there is an odd number of players in a score group, the lowest-ranked player is ordinarily treated as the odd man. However, the pairings in the group must accord with the basic Swiss system Laws. Sometimes two players who have met in a previous round must be treated as odd men because there is no possible way in which either of them can be paired in their original group.

11. Method of pairing one odd man. The odd man is paired with the highest-ranked player he has not met in the next-lower score group.

12. Method of pairing more than one odd man. If there are two odd men to be paired, the order in which they are paired is determined by their rank according to rule 7. If both cannot be paired, rank determines which is paired and which is moved to another group.

13. Colour allocation - general principles. The primary objective is to give white and black (nearly) the same number of times to as many players as possible. After the first round, as many players as possible should be given their due colours (rules 15-17).

14. First round colours. In the first round the colour assigned to all the odd-numbered players in the top half is chosen by lot, and the opposite colour is given to all the even-numbered players in the top half. Opposite colours are assigned to the opponents in the bottom half of the field as the pairings are made. (Once the first round colours are thus chosen by lot, rules 15-17 preserve equitable colour allocation, and no further lots are necessary.)

15. Due colours in succeeding rounds. As many players as possible are given their due colours as described in rules 16 and 17, so long as the pairings conform to the basic Swiss system Laws. Equalization of colours takes priority over alternation of colours.

16. Equalization of colours. As many players as possible are given the colour that equalizes the number of times they have played with the white and black pieces. When it is necessary to pair any two players who are due to be given the same equalizing colour, the higher-ranked player has priority in getting the equalizing colour, whether white or black.

17. Alternation of colours. After colours have been equalized in a round, as many players as possible should be given, in the next round, the colour each received in the first round of the tournament, the purpose being to continue alternation of colours. When it is necessary to pair any two players who are due to be given the same alternating colour, the higher-ranked player has priority in getting the alternating colour, whether white or black. However, if another pairing can be made in accordance with the basic Swiss system Laws, a player should not be assigned the same colour three times in a row. Interchanges between the top and bottom halves should not be made simply to preserve alternation of colours.
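To make the more mechanical rules concrete, the following Python sketch covers identification numbers (rule 2), byes (rule 3), scoring (rule 4), first-round pairing (rule 6) and rank (rule 7). It is a minimal illustration, not an implementation of the full rule set: the `Player` class and all function names are hypothetical, rating ties are kept in input order instead of being decided by lot, and colours and later rounds are left out.

```python
from dataclasses import dataclass

# Rule 4: one point for a win or bye, half a point for a draw, zero for a loss.
POINTS = {"win": 1.0, "bye": 1.0, "draw": 0.5, "loss": 0.0}


@dataclass
class Player:
    name: str
    rating: int
    number: int = 0        # identification number (rule 2)
    score: float = 0.0
    had_bye: bool = False


def assign_numbers(players):
    """Rule 2: number players by descending rating, No. 1 being the highest rated.

    Assumption: ties keep their input order; the rules decide them by lot.
    """
    ordered = sorted(players, key=lambda p: -p.rating)
    for number, player in enumerate(ordered, start=1):
        player.number = number
    return ordered


def rank_key(player):
    """Rule 7: rank by score first, then by rating order (small key = high rank)."""
    return (-player.score, player.number)


def pick_bye(players):
    """Rule 3: in an odd field the lowest-ranked player without a bye sits out."""
    if len(players) % 2 == 0:
        return None
    eligible = [p for p in players if not p.had_bye]
    return max(eligible, key=rank_key)


def pair_first_round(players):
    """Rule 6: after the bye, pair the top half with the bottom half in order."""
    ordered = assign_numbers(players)
    bye = pick_bye(ordered)
    if bye is not None:
        bye.had_bye = True
        bye.score += POINTS["bye"]
        ordered = [p for p in ordered if p is not bye]
    half = len(ordered) // 2
    pairs = [(ordered[i].number, ordered[half + i].number) for i in range(half)]
    return pairs, (bye.number if bye is not None else None)


# The forty-player example from rule 6: No. 1 meets No. 21, No. 2 meets No. 22.
field = [Player(f"P{i}", 2000 - i) for i in range(40)]
pairs, bye = pair_first_round(field)
```

The sketch deliberately stops where the rules become judgemental (transpositions, odd men, due colours); those parts are exactly where the informal description leaves room for interpretation.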

2.1.3 Further constraints

Some additional constraints are in order for a Swiss system program to be used in practice. Usually tournaments are held at a place where the computing facilities are limited to one microcomputer. Also time between rounds is typically very limited, with the need to process all results, including the pairing, in less than 30 minutes. Preferably, no special skills should be required from the user, both at the level of application of the Swiss system, and application of the program. So without going into much detail we would like to fix the following environment conditions:

1. the program should run on commonly used microcomputers without assuming a particular configuration or special capabilities of the device (e.g., size of the screen);

2. the program should be able to make a pairing within 5 minutes for groups of n players, 12 < n < 1000, for rounds up to ⌈√n⌉;

3. the program should be usable by someone with only basic knowledge of the use of a computer and the Swiss system; and

4. the produced pairings should be of good quality, i.e., a manual check is not likely to improve upon them significantly. (Note: this is not a trivial requirement. ZORBA has been used as a shadow system to manual pairings.)

In the sequel these constraints will be referred to as constraints no. 1 through 4.

2.2 Different Kinds of Requirements

The Swiss system example shows very clearly that in practice, requirements may be of varying nature and quality. Therefore in this subsection we try to elaborate on the various kinds of requirements to be expected in a real-world problem.

Roughly, requirements can be split up into functional requirements and non-functional ones (cf. [Yeh 82]). More detailed and comprehensive characterizations can be found, e.g., in [Roman 85] and [Kühnel et al. 87].

Functional requirements deal with the behaviour of a system and its environment ("conceptual model", [Balzer Goldman 79]). Typically they comprise:

• inputs to the system and their constraints (e.g., the data on the "player cards", cf. rule 1, or the scoring of results, cf. rule 4),

• functions the system is able to perform (e.g., making a new pairing),

• outputs and other reactions of the system (e.g., updating and printing the pairing produced, asking for help in ambiguous situations).

The focus of the remainder of the paper will be on functional requirements, so we will not go into further detail here. Rather, we treat non-functional requirements in this section to a broader extent, to be able to dispense with more than casual reference further on.

Non-functional requirements (sometimes called "constraints") can be divided into different categories:

(a) quality attributes of the desired individual functions:

• performance (time, storage, workload, throughput - cf. constraint 2),

• maintainability (changes in individual functions should be feasible in a local way),

• reliability (failure safety, e.g., a system crash should not destroy all previous input; robustness, integrity of internal information, error recognition, error handling and survivability; cf. constraints 2 and 4),

• portability (cf. constraint 1),

• adaptability (cf. constraint 1, e.g., a possibility for adapting to exploit a full-size screen),

• compatibility with existing systems (e.g., a parser, a file system, a database of game results),

• reusability (i.e., modules should be structured, parameterized, and documented in a proper way),

• flexibility and extensibility (in order to satisfy additional requirements during the system lifetime),

• traceability (i.e., the possibility of recognizing the relationship between the original requirements and the specified functions),

• user comfort (cf. constraint 3);

(b) requirements for the implementation of the system:

• tools or devices to be used (e.g., existing software/hardware, cf. constraint 1; minimum memory requirements),

• interfaces with already existing components (e.g., with a text formatting system to produce camera-ready forms for the tournament results),

• use of existing tools (programming language, operating system, hardware),

• documentation (describing, e.g., details about how to install the system);

(c) requirements for the development process:

• global development strategies (a division of the system into "independent" components),

• methods, languages, tools to be used (often some not explicitly mentioned standard known to all programmers involved),

• available resources (manpower, budget, deadlines; cf. constraint 1),

• quality attributes to be achieved and standards to be obeyed (cf. constraint 4);

(d) requirements for test, installation and maintenance:

• physical constraints (size, weight),

• availability of qualified personnel,

• skill level considerations (cf. constraints 3 and 4),

• spatial distribution of components (e.g., availability of a nearby printer);

(e) economical and political constraints:

• market considerations (cf. constraint 1),

• cost/benefit ratio (e.g., the trade-off between a general purpose and a customized system),

• legal restrictions (use of copyrighted software).

Informal requirements clearly cannot uniquely be characterized. This may cause trouble in practice because usually the different categories of requirements may be of greater or lesser importance to the user. Often requirements will be contradictory (e.g., fast, but on a small machine), which might not be clear at first inspection. Hence, finding these potential conflicts and stating a trade-off of relative importance between the requirements or defining primitives involved is important.

Obviously, there is a fundamental difference between functional and non-functional requirements: in order to be able to formulate non-functional requirements in a precise way, the functionality of a system has to be known. It is particularly for this reason that most of the approaches in requirements engineering mainly concentrate on providing formalisms to describe functional requirements. We form no exception here in concentrating on functional requirements for the remainder of this paper.

2.3 Desirable Properties of Formalisms for Requirements Definitions

Apart from being suited for expressing the various kinds of requirements as discussed in the previous subsection, a formalism for describing requirements has to cope with additional aspects originating in practical considerations, e.g., it has to be able to deal with problems to be encountered in building a requirements definition and in constructing software that satisfies the requirements.

Typical problems when building a requirements definition are:

• uncertainty about what the problem is, due to vague requirements or requirements that cannot be measured (e.g., "if possible...", cf. rule 5b);

• incomplete information about the problem (e.g., the note to rule 9b);

• coordination and consistent integration of different sources of information (customer, user, technical expert, chess player, organizer, software developer);

• mass of information, easily leading to redundancy (e.g., rule 6 is a special case of rule 9b, since before round 1 only one score group exists), and hence risking overspecification and inconsistency;

• different levels of detail in the requirements (e.g., rule 13, describing the general principle of colour allocation, versus rules 15-17, which give a procedure to be followed to implement this same general principle);

• exclusion of feasible solutions (e.g., an additional rule in ZORBA excluded solutions with colour allocation more than 2 games out of balance, causing problems in special cases).

Typical problems in connection with the development process are:

• organization of the software development process in a manageable and reliable way;

• traceability and verification of an implemented system (with respect to its requirements);

• modification, enhancement, and maintenance;

• diversity of problems, making expertise gained in previous projects of uncertain value;

• estimation of the amount of effort needed.

In order to cope with these problems, the following properties of formalisms for requirements definition are desirable (cf. also, e.g., [Balzer Goldman 79], [Fairley 85], [Henderson 81], [Roman 85], [Yeh Zave 80]):

precision and formality. In order to discover flaws at the earliest possible stage of the software development process, a precise and unambiguous statement of the problem to be solved is mandatory. The (implicit) desire for completeness and consistency can only be satisfied by a sufficient level of formality. Formality is also needed for being able to establish a formal correspondence (verification) between requirements definition and implemented system.

abstraction and structuring. Mastering complexity resulting from a mass of information requires abstraction mechanisms and suitable concepts for structuring.

conceptual integrity. A specification formalism has to be an integrated part of an overall software development methodology (cf. [Henderson 81]). Otherwise a smooth, manageable, and consistent development process leading to reliable software cannot be expected.

readability and understandability. A requirements specification is the interface (the "contract", [Bauer 81]) between client and software developer, and a means of communication among clients, users, experts, analysts, and designers. Thus, a formalism should be such that all parties involved will be able to read and understand the formulated requirements with reasonable effort.

modifiability. Software products are subjected to continuous changes ("pressure of change is built-in", [Lehman 80]), due to changing environments (and hence requirements). Obviously this entails a need for easy and consistent modifiability of a requirements definition, and a suitable formalism has to take care of this issue.

liberality. A specification should not enforce a single or a particular solution, but rather allow a variety of implementations ("specification freedom" [London Feather 82], "a family of solutions" [Yeh Zave 80]). Hence, the formalism should provide constructs allowing one to express that kind of freedom.

adequacy. A specification method has to provide means to increase confidence, especially on the customer's side, that the formal description really reflects his original intentions. For any kind of question concerning the problem it should be possible to get answers that are formally justified on the basis of the specification.

wide range of applicability. Using a new formalism for every new problem is not feasible in practice. Therefore, a formalism for requirements definition must be capable of dealing with a wide range of different problems.

support by appropriate tools. Documentation, administration and analysis of the information contained in the requirements definition of a large problem cannot be managed without suitable tools. Thus, computer support for any kind of formalism should be aimed at. This also implies that a formalism should be machine supportable, e.g., by being unambiguously parseable.

2.4 How to Proceed

Assuming the availability of an adequate specification formalism, there is still the problem of methodology, i.e. how to proceed in order to build a requirements specification. A rough guideline is given in [Rzepka Ohno 85]:

"requirements engineering is a systematic approach to the development of requirements through an iterated process of analysing the problem, documenting the resulting requirements insights, and checking the accuracy of the understanding so gained."

Individual activities that are to take place during this iterated process are, e.g.,

investigation of requirements:

• identification of the functional requirements in a dialogue between specifier, customer, and user,

• agreement on quality attributes (maybe including priorities or preferences) and other constraints,

• exploration of the environment for the system and its development;

formulation of requirements:

• precise formulation of all individual requirements,

• description of possible relationships between them,

• systematic structuring and classification;

analysis of requirements:

• formal checks for consistency and completeness,

• adequacy of the formulation,

• investigation of the technical feasibility ("Is the problem solvable at all by an algorithm?", "Are the requirements satisfiable with respect to the constraints on the intended environment?"),

• study of the economic feasibility (overall costs, schedule, required personnel, cost/benefit ratio, risks),

• rapid prototyping and other simulations (to test user acceptance).

We will come back to these issues when dealing with the formalization process in section 4.

3 Formal Specification

As already mentioned, formality is entailed by the demand of a requirements specification being consistent and complete. Furthermore, when asking for formality, there seems to be a consensus in the relevant literature that the level of formality provided by existing programming languages is not the appropriate one. For a requirements specification, the goal is a clear and precise description of the problem, rather than a formulation of a way to solve it.

Formality is a delicate issue, in particular since it cannot be seen independently of other desirable properties such as readability and understandability. On the one hand one would like to have the precision and formal foundation of mathematics, but, on the other hand, one would prefer to have the understandability and wide range of applicability as provided by natural language (cf. [Henderson 81]).

Trying to achieve a compromise, traditional approaches (cf., e.g., [IEEE 7?], [Roman 85]) introduce formal concepts only to an extent that is still manageable by a non-expert user. They provide only simple linguistic means for formulating the different kinds of requirements, mainly relying on an intuitive understanding of the semantics. Additionally, some of them are backed by methodological principles to ensure a systematic conversion of an informal problem statement into the respective formalisms. Nearly all of them, however, do not take subsequent steps in software development into account, i.e. they leave open how to obtain programs that solve the specified problem, and, furthermore, how to verify that these programs indeed meet the specification. Thus the essential drawbacks of these approaches are

• semantic imprecision (remaining ambiguities, no formal checks on consistency and completeness),

• lack of an integrated methodology (no formal verification),

• insufficient support for checking adequacy (no formally derived answers to questions on the problem).

There are a number of new approaches that focus on formalisms and integrated methodological support for (formally) constructing programs from a given formal specification of the problem. All of them assume a rigorous formal basis for an initial problem specification which is, e.g.,

• relational (e.g., Gist [Balzer 81], EREA [Dubois et al. 88]),

• functional (e.g., [Henderson 80], [Bird Wadler 88], VDM [Jones 80]),

• predicative (e.g., [Hehner et al. 86], [Broy 87], Z [Abrial et al. 79]),

• assertional (e.g., [Dijkstra 76], [Gries 81], [Backhouse 86]), or

• algebraic (e.g., ACT ONE [Ehrig et al. 83], ACT TWO [Fey 86], ASF [Bergstra et al. 89], ASL [Wirsing 83], CLEAR [Burstall Goguen 80], COLD [Jonkers et al. 86], [Feijs et al. 8?], LARCH [Guttag Horning 83], OBJ [Goguen Tardo 77], [Goguen Meseguer 81], [Futatsugi et al. 85], PLUSS [Gaudel 85], RAP-2 [Hußmann 87]).

Since these approaches do have a formal semantic basis, most of the above-mentioned drawbacks can be removed. This is at the price, however, of restricted expressiveness, new difficulties caused by the formalization process, and difficulties in reading and understanding.

Each of the approaches mentioned above has its strengths for particular aspects of a requirements specification. But none of them alone is powerful enough to cope with all kinds of requirements mentioned in section 2.2. Therefore, combinations and extensions, or even completely new formalisms, have to be looked for. What such an adequate formalism might look like is still a topic of research.

We are still convinced that an algebraically based approach is appropriate (cf. [Partsch 86], [Partsch 89]), because it meets nearly all the additional properties in connection with requirements definitions given in section 2.3 (cf. [Partsch 87]). However, clearly, extensions are needed to enhance expressiveness, such as higher-order functions (cf. [Möller 87], [Möller et al. 88]), specification-building operations (like those in, e.g., [Wirsing 83]) and relations (to be able to formulate certain non-functional requirements). Furthermore, in order to be able to formulate expressions over algebraic specifications or to express other non-functional requirements, conventional applicative constructs are needed, as well as more advanced concepts, such as non-determinism (for delayed design decisions w.r.t. specification freedom), predicate logic (for all kinds of conditions, properties, and constraints), modal and temporal logic (for real-time and other behavioral aspects), or traces (for parallel and distributed systems). Experiments with these and similar kinds of extensions are under way.

For solving the problem of formalization, almost the same difficulties as in traditional requirements engineering have to be faced. Therefore, we suggest an approach to formalization which basically builds on experiences gained there, but also takes our envisaged enhanced version of an algebraic specification formalism into account. This will be the topic of the following section.

As to the matter of reading and understanding, attempts to provide understandability by translating formal specifications into informal representations such as natural language text (cf. [Bauer 81], [Swartout 82], [Elder 85]) or graphics seem promising, even allowing inspection by people without formal training.

4 The Process of Formalization

Formalization is the process in which an informally given problem is turned into a formal problem specification. As mentioned earlier (cf. section 2.4), this process generally comprises at least three essential sub-activities, viz.

• identification of the problem,

• formal description of the problem, and

• analysis of the formal problem description.

In the following subsections we will focus on each of these subtasks in turn. Some of the aspects mentioned in section 2.4 will be worked out in more detail, with an emphasis on formal specification.

4.1 Problem Identification

Problem identification means finding out what the problem is. The difficulties here mainly originate in the ambiguities and sources of misunderstanding inherent in the communication between different people by means of some informal language. Usually, the person who states the problem is not the one who is to describe it formally; additionally, due to different educational and professional backgrounds, they do not speak the same language. Therefore, problem identification involves a mapping from one universe of discourse onto another, and the essential activity in problem identification concentrates on characterizing the universe of discourse in finding this mapping.

Usually a problem statement (implicitly) assumes basic knowledge about its context, the problem domain. To get a correct evaluation of the problem it is essential to make these implicit assumptions explicit, i.e., to first identify the characteristics of the problem domain (cf. "domain theory" [Smith Lowry 89]). Having done so, further steps in finding the above-mentioned mapping are

• the choice of a concept to describe the problem domain, with a suitable representation, and

• the definition of the problem in terms of the concept.

Following [Webster 74] we will use the notion concept for "an idea or thought, especially a generalized idea of a class of objects; abstract notion". Hence, a concept of a (given) problem domain is an abstract view of the problem domain, free from irrelevant details, but suited to reflect its essential characteristics.

As we are concentrating on software systems, we can further rule out arbitrary technical concepts, needed in integrated technology such as, e.g., process control, and focus our attention on concepts from mathematics.

In order to illustrate our notion of a (mathematical) concept, we consider our Swiss system example again. The problem domain here is, among others, comprised of basic entities such as players and games, and rounds, combining to form a tournament. Thus, in a simplified view, a tournament is a structure consisting of players and games connecting them. One straightforward concept for this structure is an undirected finite graph. A finite directed graph is also a plausible concept, wherein the direction of an edge might be used to encode the colour allocation in the game, e.g., by pointing from white to black. This example can be pursued further by taking the concept of an edge labeled graph. The label associated with each edge could be used to encode the round number or the result of the game encoded by this edge.

Further examples of mathematical concepts are:

• sets, relations, mappings, functions (a round may be considered as a set of pairs of players; the pairing cards as a mapping of the players to various information associated with them),

• orderings and lattice structures (e.g., in the Swiss system example the set of players may be given a partial ordering according to their rating),

• algebraic structures (e.g., groups, rings, fields, sequences, bags, trees),

• relational structures (e.g., different kinds of graphs, Petri nets),

• formal systems (e.g., equational systems, grammars, automata, rewrite systems, deduction systems, systems of concurrent processes),

• differential equations, but also

• stochastic models, or

• topological and geometric structures.

The choice of a suitable concept already entails a tremendous gain with respect to precision, as the possibilities for misunderstandings and misinterpretations are restricted. Frequently, in addition, the choice of a concept even amounts to a solution of the problem, as certain tasks for certain concepts are already generally formalized or solved. Examples of this kind are:

• minima, maxima, (topological) sorting, or totalization in orderings, e.g., rule 2 describes the totalization of the rating ordering,

• construction and modification of particular algebraic structures,

• paths, cycles, or closures in relational structures, e.g., looking at the undirected graph representation of a tournament, the pairing problem can be formulated as: to find a list of pairs containing all nodes (players) once, such that no pair is connected in the current graph (no previous game exists),

• fixed points or zero valued arguments for equational systems,

• languages generated by grammars or accepted by automata,

• confluence and Church-Rosser properties of rewrite systems,

• deadlock or starvation in systems of concurrent processes, or

• congruence, similarity, and translation for geometric objects.
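The graph formulation of the pairing problem given in the list above can be sketched in a few lines. The following is a minimal illustration only, assuming an even number of players and ignoring all quality rules of the Swiss system; the names `find_pairing` and `played` are hypothetical and do not come from ZORBA.

```python
from typing import FrozenSet, List, Optional, Sequence, Set, Tuple

Game = FrozenSet[str]  # an undirected edge: the two players of one game


def find_pairing(players: Sequence[str],
                 played: Set[Game]) -> Optional[List[Tuple[str, str]]]:
    """Return pairs covering every player once, with no pair already connected.

    Naive backtracking search; assumes an even number of players.
    Returns None if no such pairing exists.
    """
    if not players:
        return []
    first, rest = players[0], list(players[1:])
    for i, other in enumerate(rest):
        if frozenset((first, other)) in played:
            continue  # edge already in the graph: these two have met before
        tail = find_pairing(rest[:i] + rest[i + 1:], played)
        if tail is not None:
            return [(first, other)] + tail
    return None


# Hypothetical tournament state after one round: a-b and c-d have been played.
played = {frozenset(("a", "b")), frozenset(("c", "d"))}
pairing = find_pairing(["a", "b", "c", "d"], played)
```

The search is exponential in the worst case, so it only illustrates the formulation, not a practical pairing algorithm.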

There is a lot of freedom in choosing a concept. Only in rare cases is a concept obvious or straightforward because of concrete hints that can be found in the informal problem description. In our example a hint is, e.g., provided in rule 1, where a "pairing card", containing all necessary information, is associated with every player.

However, generally no such hints are available. Therefore, the choice of an adequate concept requires decisions with far-reaching consequences. Thus, not only the level of abstraction and the complexity of the formalization of the problem are affected, but later solutions to the problem are also enormously influenced. As a consequence, choosing an adequate concept is to be considered an art that requires great care, intuition and experience.

In general, a concept consists of:

• objects associated with certain object classes, e.g., "pairing cards",

• operations on the object classes, e.g., scoring, and

• relations between objects and/or object classes, e.g., "games from previous rounds" forms a relation in the domain of "pairing cards".

Since we did not assume any priorities among these constituents, this fairly general characterization of a concept comprises more restricted ones (to be found in various parts of the literature) that reflect certain "views" of a problem such as

• function oriented,

• data structure oriented,

• event oriented,

• control flow oriented, or

• data flow oriented.

E.g., in the Swiss system one could view the concept of making a pairing as a function from a list of rounds played and the player cards to a new round (function oriented). Another view is to consider a tournament as a tree of pairings (data structure oriented). A further point of view treats each player as a process in a concurrent system looking for a next pairing in case of a finished game (event oriented). These simple illustrations may give a rough impression of the problems concerned with choosing the right view.

In representing a concept one has to deal with a more detailed description of its constituents. Since there may be several representations of the same concept, again, a lot of freedom is provided here which involves further decisions.

The concept "finite directed graph", which we used in connection with our sample problem admits several (equivalent) descriptions. A general finite directed graph, such as:

[Figure: a finite directed graph on the nodes 1, 2, 3 and 4]

can be defined as, e.g.,

(a) a set of nodes and a set of edges (represented by pairs of nodes):

({1, 2, 3, 4}, {(1, 2), (2, 2), (2, 3), (3, 2), (3, 4), (4, 2)});

(b) a set of nodes and a pair of incidence functions i and o which associate to each node the set of its predecessors and successors, resp.:

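$$
\begin{array}{lll}
(\{1, 2, 3, 4\}, & i: 1 \to \emptyset, & o: 1 \to \{2\} \\
 & \phantom{i: } 2 \to \{1, 2, 3, 4\} & \phantom{o: } 2 \to \{2, 3\} \\
 & \phantom{i: } 3 \to \{2\} & \phantom{o: } 3 \to \{2, 4\} \\
 & \phantom{i: } 4 \to \{3\} & \phantom{o: } 4 \to \{2\});
\end{array}
$$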

(c) an adjacency matrix where component (i, j) has the value 1, if there is an edge from i to j, and 0 otherwise:

$$
\begin{array}{c|cccc}
  & 1 & 2 & 3 & 4 \\ \hline
1 & 0 & 1 & 0 & 0 \\
2 & 0 & 1 & 1 & 0 \\
3 & 0 & 1 & 0 & 1 \\
4 & 0 & 1 & 0 & 0
\end{array}
$$

Of course, the possibilities are not exhausted. However, it is obvious that the choice here could affect further developments in a significant way.
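That the three descriptions carry the same information can be checked mechanically. The sketch below derives representations (b) and (c) from the edge set of (a) for the sample graph; the function names `incidence` and `adjacency` are chosen here for illustration only.

```python
from typing import Dict, List, Set, Tuple

# (a) a set of nodes and a set of edges, taken from the sample graph
nodes: Set[int] = {1, 2, 3, 4}
edges: Set[Tuple[int, int]] = {(1, 2), (2, 2), (2, 3), (3, 2), (3, 4), (4, 2)}


def incidence(nodes: Set[int], edges: Set[Tuple[int, int]]
              ) -> Tuple[Dict[int, Set[int]], Dict[int, Set[int]]]:
    """(b): map every node to its set of predecessors (i) and successors (o)."""
    i: Dict[int, Set[int]] = {n: set() for n in nodes}
    o: Dict[int, Set[int]] = {n: set() for n in nodes}
    for src, dst in edges:
        o[src].add(dst)
        i[dst].add(src)
    return i, o


def adjacency(nodes: Set[int], edges: Set[Tuple[int, int]]) -> List[List[int]]:
    """(c): matrix whose entry (row r, column c) is 1 iff there is an edge r -> c."""
    order = sorted(nodes)
    return [[1 if (r, c) in edges else 0 for c in order] for r in order]


i, o = incidence(nodes, edges)
matrix = adjacency(nodes, edges)
```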

Having decided on a concept of the problem domain and a representation of the chosen concept, it remains to define the problem in terms of the representation of the concept, which, again, entails decision making.

If, for example, we decided on definition (b) above, we still would have to decide on the association of players and games with nodes and edges (the latter represented by incidence functions). One obvious possibility is to associate players with nodes, and games with edges. However, we also might associate both games and players with nodes, the former having two outgoing edges, one to white and one to black.

Which of several possible representations to choose of course depends on further details of the problem to be solved. Thus, e.g., in the first association (i.e., players as nodes, games as edges), it is easy to check whether two players a and b have met before (in the terms of (b) above: either a ∈ i(b) or b ∈ i(a)). However, a list of all games is difficult to produce. The second representation, on the other hand, gives easy access to individual games, but, for example, checking whether a and b have met is more difficult.
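The trade-off between the two associations can be made concrete with a small sketch. The data and the names `preds`, `games`, `have_met_1`, `have_met_2` and `all_games_1` are hypothetical; only the predecessor function i of representation (b) is used for the first association.

```python
from typing import Dict, List, Set, Tuple

# Association 1: players are nodes, a game is an edge from white to black.
# preds plays the role of the incidence function i (node -> predecessors).
preds: Dict[str, Set[str]] = {"a": {"c"}, "b": {"a"}, "c": set()}


def have_met_1(a: str, b: str) -> bool:
    """Direct lookup: a in i(b) or b in i(a)."""
    return a in preds[b] or b in preds[a]


def all_games_1() -> List[Tuple[str, str]]:
    """Listing the games requires a scan over every node."""
    return [(white, black) for black, whites in preds.items() for white in whites]


# Association 2: games are nodes too, each with an edge to white and to black.
games: Dict[str, Tuple[str, str]] = {"g1": ("a", "b"), "g2": ("c", "a")}


def have_met_2(a: str, b: str) -> bool:
    """Here the check has to inspect every game node."""
    return any({white, black} == {a, b} for white, black in games.values())
```

Both structures describe the same two games; which one is preferable depends on which query dominates in the rest of the specification.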

Other examples that illustrate the choice of possible concepts and their dependence on further details of the problem are:

• concept: text,

representations:

- sequence of characters (e.g., for a scanner),

- sequence of words (e.g., for a parser),

- sequence of sentences (e.g., for a translation program),

- sequence of lines (e.g., for a line-oriented editor),

- tree of chapters, sections, etc. (e.g., for the retrieval of indexed terms);

• concept: mathematical formula,

representations:

- string (e.g., for simple text processing),

- tree (e.g., for evaluation or advanced typesetting).

In section 2.2 we already commented on the distinguished role of functional requirements. This distinction becomes even more obvious with respect to formalization: the set of potential concepts is primarily determined by the functional requirements, whereas the choice among the members of this set, the choice of a representation of the selected member, and the choice of how to formally specify the problem in terms of the representation of the concept also take non-functional requirements into account.

4.2 Problem Description

If a problem has been identified properly, its (formal) description amounts to translating the result of the identification process into constructs available in the formal specification language. In particular, this means

• mapping the representation of the concept of the problem domain onto available constructs, and

• giving an expression in the formal specification language that describes the task to be fulfilled in terms of the representation of the concept.

In the Swiss system example one needs a representation for the players, together with some of their characteristics, like "name" and "rating" (cf. rule 2). The representation of functions and relations like "rating order" of course depends on the choice of the representation. Possibilities for representation are, e.g., a set of players with functions for every property, or an array (list, set) of tuples, one for each player, containing all relevant characteristics. More formally, in the first representation one has a set of players P, and some functions like:

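$$
\begin{aligned}
\mathit{rating} &: P \to \text{NATURAL} \\
\mathit{name} &: P \to \text{STRING of CHAR} \\
\mathit{order\_no} &: P \to \text{NATURAL},
\end{aligned}
$$

while in the second representation one has an ARRAY of PLAYER p, where

$$
\text{PLAYER} = \text{TUPLE}(\mathit{name} : \text{STRING of CHAR},\ \mathit{rating} : \text{NATURAL}).
$$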

The definition of "rating order" now depends on the relevant entry in the tuple, or the relevant rating function, with some auxiliary entry or function to decide ties unambiguously. In a list or array representation these ties could be decided by the order in the list (array), or a new function or entry could be introduced as a totalization of the partial rating ordering. So in the first example one has a function

$$
\mathit{higher\_rated} : P \times P \to \text{BOOLEAN},
$$

defined by:

$$
\mathit{higher\_rated}(a, b) = \mathit{rating}(a) > \mathit{rating}(b) \lor (\mathit{rating}(a) = \mathit{rating}(b) \land \mathit{order\_no}(a) < \mathit{order\_no}(b)),
$$

while in the second example the function looks like:

$$
\mathit{higher\_rated} : \text{NATURAL} \times \text{NATURAL} \times \text{ARRAY of PLAYER} \to \text{BOOLEAN},
$$

now defined (disregarding border conditions on p) by:

$$
\mathit{higher\_rated}(a, b, p) = p[a].\mathit{rating} > p[b].\mathit{rating} \lor (p[a].\mathit{rating} = p[b].\mathit{rating} \land a < b),
$$

or even, if p is sorted on rating: $\mathit{higher\_rated}(a, b, p) = a < b$.
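As an illustration, the two variants of higher_rated can be written down directly in an executable form. This is a sketch under assumptions: the player data are invented, and the array variant is renamed `higher_rated_idx` only because both versions live in one file here.

```python
from typing import Dict, List, NamedTuple

# First representation: a set of players P with one function per property.
rating: Dict[str, int] = {"ann": 2100, "bob": 2100, "cy": 1950}
order_no: Dict[str, int] = {"ann": 1, "bob": 2, "cy": 3}


def higher_rated(a: str, b: str) -> bool:
    """Rating order, with ties decided by the auxiliary function order_no."""
    return rating[a] > rating[b] or (
        rating[a] == rating[b] and order_no[a] < order_no[b])


# Second representation: an array of tuples, one per player.
class Player(NamedTuple):
    name: str
    rating: int


p: List[Player] = [Player("ann", 2100), Player("bob", 2100), Player("cy", 1950)]


def higher_rated_idx(a: int, b: int, p: List[Player]) -> bool:
    """Same order on array indices; ties are decided by position in the array."""
    return p[a].rating > p[b].rating or (p[a].rating == p[b].rating and a < b)
```

Because the sample array is already sorted on rating, the second function coincides with the plain index comparison a < b, as noted in the text.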

Similarly, previously played games could be viewed as a graph, or as a list of pairs of players, or as a list of rounds, each of which in turn is a list of pairs of players, etc. A graph G = (V, E) is described by a set of vertices V and a set of edges E. The graph representation might then look like (P, G) with P again the set of players and G the set of games between players from P. A game can be represented by a directed edge (a, b) for $a, b \in P$ ("a played with the white pieces versus b"). A function $\mathit{have\_met} : P \times P \to \text{BOOLEAN}$ is then easily defined as follows:

$$
\mathit{have\_met}(a, b) = \exists g \in G : (g = (a, b) \lor g = (b, a)).
$$

A more round oriented view of a tournament might contain a set of rounds $\{R_i : 1 \le i \le \mathit{maxround}\}$, wherein every $R_i \subseteq P \times P$. This allows the extension of function have_met with a round number to $\mathit{have\_met} : P \times P \times \text{NATURAL} \to \text{BOOLEAN}$, which could be defined as:

$$
\mathit{have\_met}(a, b, r) = \exists i \in \{1, \ldots, r\} : [\exists g \in R_i : (g = (a, b) \lor g = (b, a))],
$$

or, in a recursive way, as:

$$
\begin{aligned}
\mathit{have\_met}(a, b, 0) &= \text{FALSE} \\
\mathit{have\_met}(a, b, r) &= [\exists g \in R_r : (g = (a, b) \lor g = (b, a))] \lor \mathit{have\_met}(a, b, r - 1).
\end{aligned}
$$
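The three definitions of have_met can be compared on a small example. The sketch below assumes a hypothetical two-round tournament stored as a mapping from round number to a set of (white, black) pairs; the suffixes `_upto` and `_rec` are added only to keep the variants apart.

```python
from typing import Dict, Set, Tuple

Game = Tuple[str, str]  # directed edge (white, black)

# Hypothetical rounds R_1 and R_2 of a four-player tournament.
rounds: Dict[int, Set[Game]] = {
    1: {("a", "b"), ("c", "d")},
    2: {("c", "a"), ("b", "d")},
}


def have_met(a: str, b: str) -> bool:
    """Graph version: G is the union of all rounds, round numbers are lost."""
    G = set().union(*rounds.values())
    return (a, b) in G or (b, a) in G


def have_met_upto(a: str, b: str, r: int) -> bool:
    """Round version: some round i in 1..r contains a game between a and b."""
    return any(g in ((a, b), (b, a))
               for i in range(1, r + 1) for g in rounds[i])


def have_met_rec(a: str, b: str, r: int) -> bool:
    """Recursive version of the same predicate."""
    if r == 0:
        return False
    return (any(g in ((a, b), (b, a)) for g in rounds[r])
            or have_met_rec(a, b, r - 1))
```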

The addition of these two predicates have_met allows the expression of the most important property of a valid new round, i.e., every player gets a new opponent. In our graph version this looks like (R being a set of edges):

$$
\mathit{valid}(R) = [\forall (x, y) \in R : \lnot \mathit{have\_met}(x, y)] \land |R| = \lfloor |P|/2 \rfloor \land |P - \{x \in P : (\exists y \in P : ((x, y) \in R \lor (y, x) \in R))\}| \le 1,
$$

or in words: no pair in R has met yet, R is half the size of the set of players P, and at most 1 player in P is not included in the pairing. The same can be expressed just as easily in the round version as follows:

$$
\mathit{valid}(R_k) = [\forall (x, y) \in R_k : \lnot \mathit{have\_met}(x, y, k - 1)] \land |R_k| = \lfloor |P|/2 \rfloor \land |P - \{x \in P : (\exists y \in P : ((x, y) \in R_k \lor (y, x) \in R_k))\}| \le 1.
$$

Note that the latter version of valid allows us to check whether the previous rounds of the tournament have been entered correctly so far by defining a function valid' as follows:

$$
\mathit{valid}'(R_k) = \forall i \in \{1, \ldots, k\} : \mathit{valid}(R_i).
$$
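The round version of valid and the derived check valid' translate almost literally into executable form. In this sketch the rounds are assumed to be stored in a list, so that R_k is `rounds[k - 1]`; the sample data and the name `valid_all` (standing in for valid') are hypothetical.

```python
from typing import List, Set, Tuple

Game = Tuple[str, str]  # (white, black)


def have_met(a: str, b: str, r: int, rounds: List[Set[Game]]) -> bool:
    """True if a and b played each other in one of the rounds R_1 .. R_r."""
    return any((a, b) in R or (b, a) in R for R in rounds[:r])


def valid(k: int, rounds: List[Set[Game]], players: Set[str]) -> bool:
    """Check round R_k against the three conjuncts of the definition."""
    R = rounds[k - 1]
    no_rematch = all(not have_met(x, y, k - 1, rounds) for x, y in R)
    half_size = len(R) == len(players) // 2
    paired = {x for game in R for x in game}
    at_most_one_left = len(players - paired) <= 1
    return no_rematch and half_size and at_most_one_left


def valid_all(k: int, rounds: List[Set[Game]], players: Set[str]) -> bool:
    """valid': every round entered so far is valid."""
    return all(valid(i, rounds, players) for i in range(1, k + 1))


# Hypothetical five-player tournament; round 3 repeats the game a-b.
players = {"a", "b", "c", "d", "e"}
rounds = [
    {("a", "b"), ("c", "d")},
    {("c", "a"), ("b", "e")},
    {("a", "b"), ("d", "e")},
]
```

With these data, the first two rounds pass the check and the third is rejected because of the repeated game.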

Of course, the graph representation does not provide such a check, since information on the round in which the game is played is lost. This could be solved by marking the edges with round numbers.

While pushing our straightforward formalization further, an ambiguity has been discovered in rule 5 (the Basic laws of the Swiss system). Rule 5b states that the number of players paired with a differently scoring player should be minimal and rule 5c that the difference between all scores in pairs should be minimal. If $d : P \times P \to \text{RATIONAL}$ gives the difference in score for each pair, both requirements can be included in the definition of the predicate good, which should be true for an optimal pairing:

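$$
\begin{aligned}
\mathit{good}(R) = {} & \mathit{valid}(R) \\
& \land \forall R' \subseteq P \times P : (\mathit{valid}(R') \Rightarrow |\{(x, y) \in R : d(x, y) \neq 0\}| \le |\{(x, y) \in R' : d(x, y) \neq 0\}|) \\
& \land \forall R' \subseteq P \times P : (\mathit{valid}(R') \Rightarrow \sum_{(x, y) \in R} d(x, y) \le \sum_{(x, y) \in R'} d(x, y)).
\end{aligned}
$$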
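For very small groups the predicate good can be evaluated by brute force. The sketch below makes several simplifying assumptions that are not part of the text: games are treated as unordered pairs (colour is ignored), d is taken as the absolute score difference, and validity is reduced to "no rematch, at most one player left out". The names `pairings` and `good_pairings` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, Iterable, Iterator, List, Set

Pair = FrozenSet[str]


def pairings(players: Iterable[str]) -> Iterator[FrozenSet[Pair]]:
    """Yield every split of the players into pairs, leaving at most one out."""
    ps = sorted(players)
    if len(ps) < 2:
        yield frozenset()
        return
    if len(ps) % 2 == 1:
        for bye in ps:  # try every player as the one left out
            yield from pairings([q for q in ps if q != bye])
        return
    first, rest = ps[0], ps[1:]
    for other in rest:
        for tail in pairings([q for q in rest if q != other]):
            yield tail | {frozenset((first, other))}


def good_pairings(players: Iterable[str], score: Dict[str, Fraction],
                  met: Set[Pair]) -> List[FrozenSet[Pair]]:
    """All valid pairings minimising both the 5b count and the 5c sum."""
    def d(pair: Pair) -> Fraction:
        x, y = tuple(pair)
        return abs(score[x] - score[y])

    def mixed(R: FrozenSet[Pair]) -> int:      # pairs with differing scores
        return sum(1 for g in R if d(g) != 0)

    def total(R: FrozenSet[Pair]) -> Fraction:  # sum of score differences
        return sum((d(g) for g in R), Fraction(0))

    valid = [R for R in pairings(players) if not any(g in met for g in R)]
    if not valid:
        return []
    fewest, smallest = min(map(mixed, valid)), min(map(total, valid))
    return [R for R in valid if mixed(R) == fewest and total(R) == smallest]


score = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(0), "d": Fraction(0)}
fresh = good_pairings("abcd", score, set())
after_ab = good_pairings("abcd", score, {frozenset("ab")})
```

Since good is a conjunction of two independent minimisations, the result is empty whenever no valid pairing minimises both quantities at once.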

Now, suppose at a certain stage the top score group has three players, a through c, and the next group two players, d and e. Say, c is selected as odd man and a plays b; c has already played d and e, so they play each other, and c plays someone, say f, two groups below. However, if b is selected as odd man, a plays c, b plays d (or e) and e (or d) plays someone from the group containing f. Convention has it that the latter pairing is preferable, but the former has more players playing someone with the same score (rule 5b), while the pairing of e is the best possible according to rule 5c.

Similarly to other sub-activities of formalization, decisions are necessary here, too, depending on the particular specification language. Whereas translation of the representation of the concept into available language constructs in most cases will be straightforward, the formulation of the problem proper as an expression in the specification language usually again leaves a lot of freedom.

None of the decisions to be taken during the formalization process is unique, as we tried to illustrate by the simple examples above. Therefore a prime concern of any formalism for formal specification of problems is the provision of as much flexibility as possible in order to allow the adequate formulation of many possible representations of a variety of different concepts. Ideally, there should be a one-to-one correspondence between concepts and constructs.

At least, however, any formalism for the formal specification of some task has to offer constructs that allow the representation of the constituents of a concept, i.e., objects and object classes, operations, and relations, and the formulation of expressions that reflect that task.

Conventional programming languages allow the definition of objects and object classes (called "modes" in ALGOL and "types" in Pascal), operations and relations (by means of function and procedure declarations), as well as arbitrary expressions. Therefore, programming languages are to be considered as specification languages, too.

However, traditional programming languages only allow the formulation of determinate, operational specifications. Likewise, new object classes can be defined only in a constructive, hence operational, way. Additionally, not all constructs offered by programming languages are really suited for problem specifications, as some of them, such as statements, procedures, loops, or pointers, are too implementation-oriented or even machine-oriented. Hence, their use in problem specification would lead to too "low" a level of abstraction.

Consequently, a suitable specification language will contain only those constructs of traditional programming languages, such as function declarations or expressions, that are appropriate for formulating problem specifications on a rather "high" level of abstraction. Additionally, in order to overcome the above-mentioned restrictions to determinate, operational specifications, further constructs have to be provided for, like:

• formulating indeterminate specifications (e.g., it is not a good idea to replace the random allocation of the rating order (rule 0), by the order of entry, since this puts a premium on entering late because a weaker opponent is to be expected according to rule 6),

• expressing descriptive specifications (e.g., the note to rule 9b, which leaves to the tournament director some tricky and tedious but rather irrelevant details), and

• defining object classes in a non-operational way (e.g., the description of "pairing card" in rule 1).

4.3 Analysis of the Problem Description

A specification is called a formal specification if it is formulated in a formal language, i.e., a language whose syntax and semantics are explicitly established prior to its use. Thus, obviously, formal specifications entail the usual problems of "formal correctness" to be encountered when using a formal language, viz. correctness with respect to syntax and context conditions, which have to be checked before starting semantic analysis or even program development.

The "meaning" of a formal specification is defined by the semantics of the specification language used. Usually this is a partial mapping from syntactic constructs to (sets of) semantic values. On this basis additional practically important semantic properties of formal specifications can be introduced such as

• defined (also consistent or satisfiable)

A formal specification is called defined if it has a "non-empty meaning", i.e., if there is at least one semantic value associated with the specified problem; otherwise it is called undefined (or inconsistent).

• determinate

A formal specification is called determinate if there is at most one semantic value associated with the specified problem; otherwise it is called ambiguous.

• redundant

A formal specification is called redundant if there exists a semantically equivalent specification which is "simpler".
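If the meaning of a specification is modelled, as a simplifying assumption, by a finite set of semantic values, the first two properties reduce to simple cardinality tests. The sketch below illustrates this reading; redundancy is left out because "simpler" has no formal definition here.

```python
from typing import AbstractSet, Hashable


def is_defined(meaning: AbstractSet[Hashable]) -> bool:
    """Defined (consistent, satisfiable): at least one semantic value."""
    return len(meaning) >= 1


def is_determinate(meaning: AbstractSet[Hashable]) -> bool:
    """Determinate: at most one semantic value; otherwise ambiguous."""
    return len(meaning) <= 1
```

Note that, under this reading, an undefined specification (empty meaning) is trivially determinate.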

Except for simplicity, these properties can be formally checked on the basis of the semantics of the specification language. There are, however, additional properties that are not formally verifiable. These properties characterize the relationship between the meaning of the formal specification and the originally intended problem. Examples of such properties are:

• adequate

A formal specification is called adequate if its meaning coincides exactly with the original problem.

• overspecified

A formal specification is overspecified if its meaning does not comprise all of the solutions to the original problem.

• underspecified

A formal specification is underspecified if its meaning comprises all solutions to the original problem and additional ones. Thus, in particular, an ambiguous formal specification is underspecified if the original problem is uniquely solvable.

Obviously, these latter properties are not independent of each other: an adequate specification is neither over- nor underspecified, but inadequacy does not necessarily imply over- or underspecification.
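Although these properties cannot be verified formally, because the original problem is only informally given, their mutual relationship can be illustrated by pretending that both the meaning and the intended solutions are finite sets. The sketch below rests on an interpretation that is not stated in the text: "overspecified" is read as a proper subset (some intended solutions are missing and no extra ones are present), which is the reading under which inadequacy need not imply over- or underspecification.

```python
from typing import AbstractSet, Hashable


def classify(meaning: AbstractSet[Hashable],
             intended: AbstractSet[Hashable]) -> str:
    """Relate the meaning of a specification to the intended solutions."""
    if meaning == intended:
        return "adequate"
    if meaning < intended:   # proper subset: intended solutions are missing
        return "overspecified"
    if meaning > intended:   # proper superset: additional solutions admitted
        return "underspecified"
    # Incomparable sets: solutions are missing and extra ones are admitted.
    return "inadequate, neither over- nor underspecified"
```

For example, a meaning of {1, 3} against intended solutions {1, 2} is inadequate without being over- or underspecified in this sense.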

It is important to be aware of the above-mentioned additional problems, and checking the respective properties of a formal specification is an essential part of the formalization process. The process of formalizing a problem may only be considered finished, when the formal specification is syntactically correct, and its adequacy with respect to the originally given problem is ensured. For practical reasons, an analysis with respect to redundancy seems worthwhile, too.

Obviously, there is a causal relationship between the expressiveness of a specification language and the amount of effort that is to be spent for ensuring the adequacy of a formal specification. The fewer constructs a language offers, the "longer", and thus the more "complex", expressions describing the problem will be. Consequently, the relationship between the formal specification and the originally given problem will be less obvious, and thus, more difficulties will be encountered when reasoning about adequacy.

Adequacy is the ultimate goal to be achieved. In order to reach it, an analysis with respect to the semantic properties seems worthwhile, because it gives valuable information. Thus, for example, recognizing a formal specification to be undefined usually indicates a defect in the formalization process rather than unsolvability of the originally given problem. Likewise, an indeterminate formal specification of a problem which is known to have a unique solution implies inadequacy. Also, an examination of the specification with respect to overspecification and underspecification provides valuable insight with respect to adequacy. Very often, underspecification can be removed by simply adding further conditions. Similarly, overspecification frequently can be eliminated by weakening certain restrictions. However, checking these properties is not sufficient. Further considerations with respect to adequacy are necessary, which, again, may lead to redoing (parts of) the formalization process.


4.4 Structuring

So far we did not pay any attention to the size of the problems to be specified. In fact, we even assumed that the formalization process as introduced in the previous subsections is not affected by problems in managing complexity mainly originating from the size of some task. In practice, however, size is a problem, and mastering the resulting complexity by introducing a suitable structure is an essential part of the formalization process. Also, the specification itself has to be built in a structured way.

In principle, there are two strategies for introducing structure: proceeding top-down or bottom-up. Top-down proceeding is an iterated process that starts with the task as a whole and tries to split it up into smaller sub-tasks which in turn are subject to further decomposition. This process ends when a suitable level of refinement is reached.

Technically, each step in a top-down proceeding consists of two alternating activities: "decomposition" and "elaboration".

Decomposition means "to break up or separate into basic components or parts" [Webster 74]. This includes identification of the parts, a clear statement on their respective interrelation, as well as the formulation of the original task in terms of the newly introduced components.

Elaboration means "to work out carefully; develop in great detail" [Webster 74]. Elaboration aims at providing meaning for the parts introduced in decomposition. This may be done by either referring to existing basic concepts or by initiating another decomposition step.

Within the framework of algebraic specification the combination of decomposition and elaboration just described amounts to introducing a new type. Decomposition roughly corresponds to introducing the signature of a type (i.e. the syntactic part) whereas elaboration aims at providing a semantics in the form of appropriate axioms for the object kinds and operations introduced by the preceding decomposition step.
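The correspondence between decomposition and signature, and between elaboration and axioms, can be illustrated with the standard textbook example of a stack. The example is not taken from the tournament problem, and the Python rendering is only a sketch: the signature is written down as plain data, one possible model is given using tuples, and the axioms are phrased as executable checks on that model.

```python
from typing import Tuple

# Decomposition: the signature, i.e. operation names with argument and
# result sorts, but no meaning yet.
SIGNATURE = {
    "empty": ((), "Stack"),
    "push": (("Stack", "Elem"), "Stack"),
    "pop": (("Stack",), "Stack"),
    "top": (("Stack",), "Elem"),
}

# One possible model of the signature (an assumption of this sketch):
# stacks are tuples of integers.
Stack = Tuple[int, ...]


def empty() -> Stack:
    return ()


def push(s: Stack, x: int) -> Stack:
    return s + (x,)


def pop(s: Stack) -> Stack:
    return s[:-1]


def top(s: Stack) -> int:
    return s[-1]


# Elaboration: axioms that give the operations their meaning.
def axioms_hold(s: Stack, x: int) -> bool:
    """top(push(s, x)) = x and pop(push(s, x)) = s."""
    return top(push(s, x)) == x and pop(push(s, x)) == s
```

In an algebraic specification language the axioms would be stated as equations rather than tested on sample values; the checks here only indicate what the equations say.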

Bottom-up proceeding is also an iterated process that starts from the details of a problem and aims at composing them into larger units (at a higher level of abstraction) until the level of the entire system is reached.

As with top-down proceeding, each step in bottom-up proceeding consists of two alternating sub-activities, viz. "composition" and "specialization".

Composition means "to put together; put in proper order or form" [Webster 74]. Composition comprises the introduction of new entities, as well as a precise statement on the components of this new entity and the way how they are to be combined in order to make up a whole.

Specialization means "1. to make special, specific, or particular; specify. 2. to direct toward or concentrate on a specific end" [Webster 74]. Usually, entities introduced by composition are too general for the particular task at hand. Specialization then tries to "adjust" these entities for the particular needs of the respective problem.

Within the framework of algebraic type specifications, bottom-up proceeding starts with a predefined collection of basic types (e.g., for numbers, truth values, characters, etc.) and basic type schemes (e.g., sequences, sets, bags, maps). Composition then means the definition of new types using suitable operations ("type constructors"). By specialization all those operations that are not needed are removed from the list of visible constituents ("hiding"). Additionally, specialization can also introduce further restrictions on the operations and types ("constraints").
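A rough analogue of these three notions in a programming language is sketched below. The type `BoundedQueue` is a hypothetical example: it is composed from the basic scheme "sequence" (here a Python list), it hides all sequence operations except three, and it adds a capacity constraint. Python classes are of course not algebraic types, so the sketch only illustrates the roles of composition, hiding and constraints.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class BoundedQueue(Generic[T]):
    """Composition: built from the basic type scheme 'sequence'.
    Hiding: only enqueue, dequeue and size are visible.
    Constraint: never more than `capacity` elements."""

    def __init__(self, capacity: int) -> None:
        self._items: List[T] = []   # underlying sequence, not exported
        self._capacity = capacity

    def enqueue(self, item: T) -> None:
        if len(self._items) >= self._capacity:
            raise ValueError("constraint violated: queue is full")
        self._items.append(item)

    def dequeue(self) -> T:
        return self._items.pop(0)

    def size(self) -> int:
        return len(self._items)
```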

Both top-down and bottom-up proceeding as introduced above are idealistic views. In practice, both approaches will be used against the background of previous experience which always influences the proceeding in the opposite direction. Thus, for example, top-down proceeding is usually influenced by the availability of predefined types and type schemes or by certain ideas on the low level representation. Likewise, in bottom-up proceeding, composition, and in particular specialization, always will be done with the ultimate goal, viz. the entire system, in mind.

5 Conclusions

An attempt has been made to highlight some of the aspects and problems encountered in formalizing informal requirements. We favour writing the requirements document directly in a formal language, since this allows early checking of a specification for completeness and consistency, which is lacking in more informal methods. If a mistake is detected early, the cost of repair is known to be relatively low. If it is detected later, all of its consequences have to be corrected too.


We know that it will be difficult to get acceptance of this view from people working in the field, since for practical requirements engineering, aspects such as understandability or non-functional requirements have to be covered by formal approaches too. Therefore, further research in these directions has to be initiated.

The request from industry for a more rigorous approach is growing, but there is still a huge gap between ideas at universities on formal specification techniques and the day-to-day problems encountered in practice to be filled. To this end, further work is necessary in connection with:

• an integrated methodology, which provides sufficient guidelines for the practitioner,

• software support, e.g., in the form of tools to aid the process of formalization, or transformation systems for a safe transition from formal specifications to efficient programs, but above all

• knowledge transfer, in order to make all these beautiful ideas less academic and more usable for the practitioner.

Much can be learned by studying classical software engineering techniques, especially in the field of non-functional requirements. The results there should be applied to provide valuable information about necessary extensions of current formal specification methods, and about methodical guidance and software support needed to aid in practical application of formal specification methods.

Acknowledgements

The authors wish to thank H. Meijer, H.J.M. Meijer, J. Völker and N. Völker for their careful reading and comments.

References

[Abrial et al. 79] Abrial, J.-R., Schuman, S.A., Meyer, B.: Specification language. In: McKeag, R.M., MacNaughten, A.M. (eds.): On the construction of programs, Oxford University Press, 1979.

[Agresti 86] Agresti, W.M. (ed.): New paradigms for software development. Washington, D.C.: IEEE Computer Society Press, 1986.

[Backhouse 86] Backhouse, R.C.: Program construction and verification. London: Prentice-Hall, 1986.

[Balzer 81] Balzer, R.: Final report on GIST. Technical Report USC/ISI, Marina del Rey, 1981.

[Balzer et al. 83] Balzer, R., Cheatham, T.E. Jr., Green, C.: Software technology in the 1990's: using a new paradigm. In: Computer, November 1983, pp. 39-45.

[Balzer Goldman 79] Balzer, R., Goldman, N.: Principles of good software specification and their implications for specification languages. In: Proc. Specifications of Reliable Software, Cambridge, Mass., 1979.

[Bauer 81] Bauer, F.L.: Programming as fulfillment of a contract. In: Henderson, P. (ed.): System design. Infotech State of the Art Report 9:6, pp. 165-174. Maidenhead: Pergamon Infotech Ltd., 1981.

[Bauer et al. 89] Bauer, F.L., Möller, B., Partsch, H., Pepper, P.: Programming by formal reasoning - computer-aided intuition-guided programming. In: IEEE Transactions on Software Engineering, 15:2, 1989.

[Bergstra et al. 89] Bergstra, J.A., Heering, J., Klint, P. (eds.): Algebraic specification. ACM Press Frontier Series. New York: Addison Wesley, 1989.

[Bird Wadler 88] Bird, R.S., Wadler, P.L.: Introduction to functional programming. Hemel Hempstead: Prentice Hall, 1988.


[Broy 87] Broy, M.: Predicative specifications for functional programs describing communicating networks. In: Information Processing Letters 25, pp. 93-101, 1987.

[Burstall Goguen 80] Burstall, R.M., Goguen, J.A.: Semantics of CLEAR, a specification language. In: Bjørner, D. (ed.): Abstract software specification, pp. 292-332, Lecture Notes in Computer Science 86, Berlin: Springer, 1980.

[Dijkstra 76] Dijkstra, E.W.: A discipline of programming. Englewood Cliffs, N.J.: Prentice-Hall, 1976.

[Dubois et al. 88] Dubois, E., Hagelstein, J., Rifaut, A.: Formal requirements engineering with ERAE. Philips Journal of Research 43:3/4, pp. 393-414, 1988.

[Ehler 85] Ehler, H.: Making formal specifications readable. Institut für Informatik, TU München, Report TUM-I8527, 1985.

[Ehrig et al. 83] Ehrig, H., Fey, W., Hansen, H.: ACT ONE - an algebraic specification language with two levels of semantics. TU Berlin, Technical Report 83-03, 1983.

[Fairley 85] Fairley, R.: Software engineering concepts. New York: McGraw-Hill, 1985.

[Feather 86] Feather, M.S.: A survey and classification of some program transformation approaches and techniques. In: Meertens, L.G.L.T. (ed.): Program specification and transformation. Proc. IFIP TC 2 Working Conference, Bad Tölz, April 15-17, 1986. Amsterdam: North-Holland, 1987.

[Feijs et al. 87] Feijs, L.M.G., Jonkers, H.B.M., Obbink, J.H., Koymans, C.P.J., Renardel de Lavalette, G.R., Rodenburg, P.H.: A survey of the design language COLD. In: ESPRIT '86: Results and Achievements, pp. 631-644. Amsterdam: North-Holland, 1987.

[Fey 86] Fey, W.: Introduction to Algebraic Specification in ACT TWO. T.U. Berlin, Technical Report 86-13, 1986.

[Futatsugi et al. 85] Futatsugi, K., Goguen, J.A., Jouannaud, J.P., Meseguer, J.: Principles of OBJ2. In: Proc. 12th Ann. ACM Symp. on Principles of Programming Languages, pp. 52-66. ACM, 1985.

[Gaudel 85] Gaudel, M.C.: Toward structured algebraic specification. In: ESPRIT '85: Status Report of Continuing Work. Part I, pp. 493-510. Amsterdam: North-Holland, 1986.

[Gijssen Haggenburg 88] Gijssen, G., Haggenburg, W.G.: Zwitsers Systeem. Amsterdam: KNSB, 1988 (partially in Dutch).

[Goguen Tardo 77] Goguen, J.A., Tardo, J.: OBJ-0 preliminary users manual. University of California at Los Angeles, Computer Science Department, 1977.

[Goguen Meseguer 81] Goguen, J., Meseguer, J.: OBJ-1, a study in executable algebraic formal specifications. SRI International Technical Report, 1981.

[Gries 81] Gries, D.: The science of programming. Berlin: Springer, 1981.

[Guttag Horning 83] Guttag, J.V., Horning, J.J.: Preliminary Report on the LARCH shared language. Technical Report CSL 83-6, Xerox, Palo Alto, 1983.

[Hehner et al. 86] Hehner, E.C.R., Gupta, L.E., Malton, A.J.: Predicative Methodology. In: Acta Informatica 23, pp. 487-505, 1986.

[Henderson 80] Henderson, P.: Functional programming: application and implementation. Englewood Cliffs, N.J.: Prentice-Hall, 1980.


[Henderson 81] Henderson, P.: System design: analysis. Infotech State of the Art Report 9:6, System design, pp. 5-163. Maidenhead: Pergamon Infotech Ltd., 1981.

[Herik 88] Van den Herik, J.: Computerschaak. In: Schakend Nederland 95:9, pp. 38-39, 1988 (in Dutch).

[Hußmann 87] Hußmann, H.: RAP-2 User Manual. Universität Passau, Fachbereich Mathematik und Informatik, Technical Report, 1987.

[IEEE 77] Special Collection on Requirement Analysis. IEEE Transactions on Software Engineering SE-3:1, pp. 2-84, 1977.

[IEEE 83] IEEE Standard Glossary of software engineering terminology. IEEE Standard 729, 1983.

[Jones 80] Jones, C.B.: Software development: a rigorous approach. Englewood Cliffs, N.J.: Prentice-Hall, 1980.

[Jonkers et al. 86] Jonkers, H.B.M., Koymans, C.P.J., Renardel de Lavalette, G.R.: A semantic framework for the COLD-family of languages. Logic Group Preprint Series No. 9, Department of Philosophy, University of Utrecht, 1986.

[Kažić 80] Kažić, B.M.: The Chess Competitor's Handbook. London: Batsford, 1980.

[Kühnel et al. 87] Kühnel, B., Partsch, H., Reinshagen, K.P.: Requirements Engineering - Versuch einer Begriffsklärung. In: Informatik-Spektrum 10:6, pp. 334-335, 1987 (in German).

[Lehman 80] Lehman, M.M.: Programs, life cycles, and laws of software evolution. In: Proc. IEEE 68:9, 1980.

[London Feather 82] London, P., Feather, M.S.: Implementing specification freedom. In: Science of Computer Programming 2, pp. 91-131, 1982.

[Möller 87] Möller, B.: Higher-order algebraic specifications. Habilitation thesis, Fakultät für Mathematik und Informatik, T.U. München, 1987.

[Möller et al. 88] Möller, B., Tarlecki, A., Wirsing, M.: Algebraic specification with built-in domain constructions. In: Dauchet, M., Nivat, M. (eds.): CAAP '88. Lecture Notes in Computer Science 299, pp. 132-148, Berlin: Springer, 1988.

[Partsch 86] Partsch, H.: Algebraic requirements definition: a case study. Technology and Science of Informatics 5:1, pp. 21-36, 1986.

[Partsch 87] Partsch, H.: Requirements Engineering und Formalisierung - Problematik, Ansatz und erste Erfahrungen. In: Schmitz, P., Timm, M., Windfuhr, M. (eds.): Requirements Engineering '87, pp. 9-31. GMD-Studien 121, 1987 (in German).

[Partsch 89] Partsch, H.: Algebraic specification - A step towards future software engineering. In: Proc. METEOR-Workshop, Passau, May 1987, Lecture Notes in Computer Science 394, Berlin: Springer, 1989.

[Partsch Laut 82] Partsch, H., Laut, A.: From requirements to their formalization - a case study on the stepwise development of algebraic specifications. In: H. Wössner (ed.): Programmiersprachen und Programmentwicklung, 7. Fachtagung, München 1982, pp. 117-132. Informatik-Fachberichte 53, Berlin: Springer, 1982.

[Partsch Steinbrüggen 83] Partsch, H., Steinbrüggen, R.: Program transformation systems. In: ACM Computing Surveys 15, pp. 199-236, 1983.

[Polya et al. 83] Polya, G., Tarjan, R.E., Woods, D.R.: Notes on Introductory Combinatorics. Basel, 1983.


[Roman 85] Roman, G.-C.: A taxonomy of current issues in requirements engineering. In: IEEE Computer 18:4, pp. 14-23, 1985.

[Rzepka Ohno 85] Rzepka, W.E., Ohno, Y.: Requirements engineering environments: software tools for modeling user needs. In: IEEE Computer, 18:4, pp. 9-12, 1985.

[Smith Lowry 89] Smith, D.R., Lowry, M.J.: Algorithm theories and design tactics. In: Van de Snepscheut, J.L.A. (ed.): Proc. Mathematics of Program Construction, pp. 379-398, Lecture Notes in Computer Science 375, Berlin: Springer, 1989.

[Swartout 82] Swartout, W.: GIST English generator. In: Proc. AAAI 82, August 1982.

[Webster 74] Webster's New World Dictionary. Second College Edition. Cleveland: William Collins & World Publishing, 1974.

[Wirsing 83] Wirsing, M.: A Specification Language. Habilitation thesis, Fachbereich Mathematik und Informatik, T.U. München, 1983.

[Yeh 82] Yeh, R.T.: Requirements analysis - a management perspective. In: Proc. COMPSAC '82, pp. 410-416, November 1982.

[Yeh Zave 80] Yeh, R.T., Zave, P.: Specifying software requirements. In: Proc. IEEE 68:9, pp. 1077-1085, 1980.

ViewPoint Oriented Software Development: Methods and Viewpoints in Requirements Engineering

Anthony Finkelstein, Michael Goedicke, Jeff Kramer, Celso Niskier
Imperial College of Science, Technology & Medicine

(University of London)

Abstract

This paper outlines progress on: developing methods to support requirements formalisation; incremental development of formal specifications; tool support for requirements expression; modelling requirements elicitation. A central thread in this work, the concept of "ViewPoint", is examined, motivated and systematically characterised. The implications for methods to support the construction of formal specifications are considered. A framework for further work is outlined.

1 Introduction

This paper outlines four distinct strands of research on software development methods in the context of requirements engineering. It attempts to show how these apparently diverse research approaches can be brought together using "ViewPoints". We examine this concept in more detail and develop its implications.

Before doing this we must justify our initial focus on both requirements engineering and methods. We must also explain precisely what we mean by them.

Requirements engineering is the development of requirements specifications of substantial complexity and scale. It covers the activities by which goals, needs and concepts are acquired and documented. It includes the tasks often called elicitation and validation.

Software development methods are an integrated collection of work plan, representations and heuristics whose purpose is to guide and organise complex software development activities. The work plan is essentially a model of the underlying software process; it can be used to answer the question "what should I do next?". The representations are the means by which knowledge about the application domain is captured and documented. The heuristics are hints, tips or expertise about what to do in particular situations.

Our focus on requirements engineering is not difficult to justify. It is well known that the cost of eliminating a requirements specification error increases rapidly as we move towards implementation. In addition requirements engineering covers some of the least well understood (and supported) parts of the software development process.

Our focus on methods, as against a more traditional computer science focus on specification languages, may require more justification: methods are where software process and representation meet; methods bring together complementary representation schemes; methods are a source of packaged software development expertise; methods largely drive the selection of development tools; methods are a primary vehicle for training and technology transfer in software development organisations. We hope that this paper will give more substance to this justification.

Our work is oriented towards the use of formal techniques in requirements engineering, that is, the use of techniques with precise, unambiguous and analysable semantics to represent and reason about the process of requirements engineering and to represent and reason within the products of requirements engineering. Rigour is a goal which, we believe, is realistically achievable but which may, on occasion, have to be subsumed within the larger goal of improved practice.

2 Method Support for Requirements Formalisation

2.1 Motivation

It is generally accepted that while formal requirements specification techniques are potentially beneficial there remains a substantial problem in scaling-up these techniques for industrial use. The problem is a multi-faceted one involving the specification languages themselves, the absence of development tools, the lack of a suitable support environment, the difficulties of validation, the tough nut of automated verification and the largely untackled area of technology transfer (Cunningham et al. 1985).

The most immediate difficulty is the absence of any method support for the complex task of requirements formalisation. It appears that software developers are just as inclined to "hack" (in the sense of undisciplined and undocumentable development) in formal specification languages as they might be in C.

2.2 Progress

The FOREST (Formal Requirements Specification Techniques) project is a collaborative project established under the UK Alvey Initiative and further supported by the Department of Trade & Industry Information Engineering Directorate. Its objectives are to provide support for the requirements specification of large real-time embedded systems. In the context of this project some progress has been made in developing and establishing a requirements formalisation method called, with tongue firmly in cheek, Structured Common Sense. Additionally, some lessons have been learned about methodology, that is, in general, how such methods can be constructed and used.

The formal representation scheme developed by the FOREST project is called Modal Action Logic, commonly abbreviated as M[A]L (Khosla & Maibaum 1989). It is a many sorted first order logic enhanced with agents, actions, deontic operators (permission and obligation), temporal interval operators (overlaps, last_before etc.) and action combinators. A specification expressed in M[A]L has the following syntactic categories:

(a) Sorts of object and agent classes

(b) Functions and predicates


(c) Actions

(d) Constant and variable objects and agents

(e) Axioms which may define predicates and functions, actions and conditions of action occurrence.

Thus the process of formalising requirements in M[A]L requires the specifier to answer the following sorts of questions:

What agents comprise the system being specified?

Why those agents?

What actions do they perform?

What level of detail in action definition is desirable?

What are the definitions of these actions?

When can each action occur?

In the pre-conditions and post-conditions of the actions why are certain predicates and functions used?

Why are those object classes significant?

What are their attributes and relationships?

Structured Common Sense (abbreviated as SCS) is an attempt to systematically guide and organise the process by which those questions are answered (Finkelstein & Potts 1987). The structure of SCS is similar to that of conventional information systems analysis methods. It consists of a number of distinct steps some of which are performed in parallel and some sequentially. Progress through SCS is driven by a work plan. Each step has associated with it intermediate graphic representations and heuristics.

It is easier to understand SCS by looking at a sample step. An important step of SCS is Action Tabulation. Its primary purpose is to support the specifier in identifying the actions that each agent performs. It is based on the Tabular Collection step of CORE (Controlled Requirements Expression), an established requirements specification method (Mullery 1985). The step involves drawing a table which, for each agent, shows the actions it performs, the data that each action consumes and produces, and the sources and destinations of the data. We have been unabashed in drawing techniques from existing software development methods, which allows us to take advantage of much of the expertise associated with these methods. From CORE we have taken:

graphic representations - Tabular Collection Forms (a format for the tables);


parts of the work plan - "connect column entries", "refine table", etc;

heuristics - "have you tried to minimise data flows internal to the table", etc.

On completion of the Action Tabulation step the specifier should have:

a set of Action Tables;

accompanying textual annotation;

the action declarations of the formal specification.

The following step of SCS is data modelling which is used to identify functions and predicates and which uses the data identified in Action Tabulation as a starting point.

The Action Tabulation step has provided:

an intermediate representation which can be used for preliminary validation;

the removal of gross errors by means of consistency checks which can be performed on the intermediate representation;

documentation to assist interpretation and understanding of the formal specification;

an incremental formalisation of system requirements;

a framework for the following steps in the work plan.
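The kind of intermediate representation and consistency check meant here can be sketched as follows. The row layout follows the description of the table given above (agent, action, data consumed and produced, sources and destinations); everything else is an assumption made for illustration. The class `ActionRow`, the check `dangling_inputs` and the library example are hypothetical and are not the actual format of the CORE Tabular Collection Forms or of SCS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ActionRow:
    agent: str
    action: str
    inputs: Dict[str, str] = field(default_factory=dict)   # data item -> source
    outputs: Dict[str, str] = field(default_factory=dict)  # data item -> destination


def dangling_inputs(table: List[ActionRow]) -> List[Tuple[str, str, str]]:
    """Return (agent, data, source) for every input whose source is an agent
    in the table although no action of that source sends the data to the agent."""
    agents = {row.agent for row in table}
    sent = {(row.agent, data, dest)
            for row in table
            for data, dest in row.outputs.items()}
    return [(row.agent, data, src)
            for row in table
            for data, src in row.inputs.items()
            if src in agents and (src, data, row.agent) not in sent]


# Hypothetical fragment of a library system.
table = [
    ActionRow("borrower", "request_loan",
              outputs={"loan_request": "librarian"}),
    ActionRow("librarian", "issue_book",
              inputs={"loan_request": "borrower"},
              outputs={"loan_record": "catalogue"}),
]
```

An empty result from `dangling_inputs(table)` means that every data flow between tabulated agents is declared at both ends, which is one example of a gross error that can be caught before any axioms are written.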

Other than the standard case studies (libraries, lifts, patient monitoring systems) our experience of SCS is limited to several case studies carried out by our industrial collaborators. These include a small command and control system. Some larger case studies (railway signaling and nuclear power plant control) are now in progress. These are being carried out by groups not directly involved in the development of the techniques themselves. This is giving us useful feedback for improving the method and valuable experience in technology transfer and training.

The experience of developing and using SCS has shown that requirements formalisation is amenable to, and gains significantly from, method support. However constructing methods (methodology) is still largely ad-hoc. We have learned that it is extremely difficult to graft an entire existing informal or formatted requirements engineering method onto an established formal representation scheme as both may deploy widely different conceptual categories for their analyses. A goal-directed approach is preferred which is based on an analysis of the properties of a target formal representation scheme and development of a method which is optimised for this formal system. The questions raised in this research are: what tool support is appropriate for a method of this form; can we develop a more principled understanding of requirements engineering activities notably requirements elicitation; is there a more systematic way of constructing methods?


3 Incremental Development of Formal Specifications

3.1 Motivation

It is a common observation that an incremental approach to specification, in which development is done in small increments or steps that address different aspects of the software product, is superior to what might be termed the traditional or "big-bang" approach.

It follows that it is necessary to develop the right structure for the presentation of specifications so that they meet the requirements of incremental development. We suggest that a collection of related but partial specifications - views - form a good basis for incremental development. Views address the specification task by providing suitable abstractions of the entire problem area in each view. The entire specification is then obtained by a superposition of the related views. Each view employs its own representation scheme to specify the properties captured within it.

Views provide a systematic means for combining different representation schemes and using them to support incremental development.

3.2 Progress

Two projects, PEACOCK (funded by the EC Esprit Initiative) and PRISMA, have been developing the view concept in the context of specification. Neither project tackles requirements specification directly but both have substantial implications for its practice.

3.2.1 PEACOCK

The approach developed within PEACOCK is called Π (Goedicke, Ditt & Schippers 1989, Goedicke 1989). It has developed a concept of module, the CEM (Concurrently Executable Module), and its associated objects, whose properties are presented using views. A system is built from a hierarchy of these Π objects. The properties of a class of objects are determined by a CEM specification. The Π language allows the developer to describe a system in terms of a number of single, isolated specifications each forming a component. A "configuration" of such specifications then specifies the properties of an entire system.

Underpinning the component concept in Π is the idea of data encapsulated by operations which are the only means to access that data. The actual data is contained in objects. The corresponding CEM specifications state the operations which can access and/or modify the object's encapsulated data. The developer is interested in expressing:

the effect of an operation on data;

how the effect is actually computed;

how to control the sequence of operation executions resulting from potentially parallel invocations.

Corresponding to each of these aspects the Π approach provides a view which can be used to specify the properties of CEMs in isolation. This is explained in more detail below.

By concentrating on how a certain effect is computed by an operation and abstracting away from the possible invocation sequence of operations and representation/implementation details we get an "Imperative View".

The properties of operations stated in this view are:

thread of control;

specification of whether an object is only inspected or possibly also manipulated ("side effect").

The "Concurrency View" lets the developer consider the permitted execution orderings which can be offered by an object of a CEM to other objects. In this view the permitted sequences of operation executions, which do not damage the internal consistency of an object, are described. This specification is done by path expressions using an interpretation for modular systems.
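How a path expression restricts execution orderings can be illustrated with a minimal sketch. The fragment below encodes a path as a regular expression and checks sequences of operation executions against it. The one-slot buffer, its operations `put` and `get`, the path `(put get)*` and the helper names are assumptions made for illustration; they are not taken from the Π language, whose interpretation of path expressions for modular systems is richer than this.

```python
import re

# Hypothetical path expression for a one-slot buffer object:
# every put must be followed by a get before the next put.
BUFFER_PATH = "(put get)*"


def compile_path(path):
    """Translate a tiny path-expression syntax into a regular expression.

    Operation names become ';'-terminated tokens wrapped in non-capturing
    groups, so that '*' and '|' apply to whole operations.
    """
    tokens = re.findall(r"[A-Za-z_]\w*|[()*|]", path)
    regex = "".join(
        f"(?:{re.escape(t)};)" if (t[0].isalpha() or t[0] == "_") else t
        for t in tokens
    )
    return re.compile(regex)


def permitted(path, executions):
    """Return True if the sequence of operation executions obeys the path."""
    trace = "".join(op + ";" for op in executions)
    return compile_path(path).fullmatch(trace) is not None


ok = permitted(BUFFER_PATH, ["put", "get", "put", "get"])
bad = permitted(BUFFER_PATH, ["put", "put"])
```

Here `ok` is true and `bad` is false: the second sequence would overwrite the slot before it has been read, which is the kind of ordering a concurrency view is meant to exclude.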

Both views above consider the execution of operations. In the early stages and/or when analysing a CEM specification it is important to consider the execution-invariant properties of operations. By abstracting away from execution one gets a system description which states only the effects of an operation execution. This can be contrasted with the way such an effect is computed. This leads us to a "Type View" where the effect of an operation execution is described in terms of (side effect free) operations. The notation used for this purpose is that of algebraic specification. The properties of the sorts and operations are described in terms of equations.
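As a rough illustration of what a type view states, the sketch below writes a stack as side-effect-free functions and checks two equations of the usual algebraic specification, pop(push(s, x)) = s and top(push(s, x)) = x. The stack sort and its operation names are assumptions chosen for familiarity; they do not come from a CEM specification in the PEACOCK project.

```python
from typing import Tuple

# The sort Stack is modelled as an immutable tuple, so every operation
# returns a new value and none has a side effect.
Stack = Tuple[int, ...]


def empty() -> Stack:
    return ()


def push(s: Stack, x: int) -> Stack:
    return s + (x,)


def pop(s: Stack) -> Stack:
    return s[:-1]


def top(s: Stack) -> int:
    return s[-1]


def equations_hold(s: Stack, x: int) -> bool:
    """Check the equations pop(push(s, x)) = s and top(push(s, x)) = x."""
    return pop(push(s, x)) == s and top(push(s, x)) == x


sample = push(push(empty(), 1), 2)
result = equations_hold(sample, 3)
```

The equations say nothing about how a stack is stored or in what order operations may be invoked; those concerns belong to the imperative and concurrency views.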

In order to describe entire systems as organised collections of CEM specifications so-called system (wide) views are used. The most important is the "Connection View" which uses the concept of a CEM Configuration. This view describes how various CEM specifications are connected together and which kind of object configurations are derivable from the connected CEM specifications. These derived properties include creation, sharing and initializing of objects which cannot be conveniently expressed in an isolated CEM specification.

There are rules which govern the relations between these views and which try to ensure consistency. For example, the effect of an operation specified in the type view must not conflict with that specified in the imperative view. Another example might be that the functional dependencies implied by the type view must comply with the ordering of executions stated in the concurrency view.

Given such rules, how can incremental development be supported? In the Π-language we separate the incremental development of a system architecture from the incremental development of each CEM specification. This approach is similar to that used in the CONIC toolkit (Kramer & Magee 1985) where configurations of logical (processing) nodes are specified independently of their implementation.

The views of a CEM specification allow the development of a component description incrementally along three dimensions:

increasing coverage, e.g. by adding another operation to a CEM specification we cover more aspects of that CEM;

increasing detail, e.g. by showing how the operations on a CEM's type view are mapped to operations in the imperative view that manipulate the CEM's objects we are adding the detail on how the operations will affect the state of those objects;

increasing precision, e.g. by adding equations to the signature of an abstract data type we move from an informal idea of what the operations on the type mean, conveyed by the names of the operations, to a more precise definition captured by the equations.

3.2.2 PRISMA

The PRISMA project has also concentrated on a "pluralistic" approach to software specification (Niskier, Maibaum & Schwabe 1989). It has developed a framework and tool support within which many representation schemes and associated heuristics can be integrated. In particular it has examined the well-known families of software specification methods generally termed Structured Analysis, Behaviour Analysis and Data Analysis.

The choice of Structured Analysis, Data Analysis and Behavioural Analysis as test-cases was based on the following criteria:

each of them focuses on a different, and complementary, aspect or "view" of the application domain;

they represent well-known and widely used approaches; this adds both to the quantity and quality of the heuristics available;

all three approaches make use of graphical representations - data flow diagrams, entity-relationship models and Petri nets - to express knowledge about the system.

For each approach we have acquired, from textbooks and interviews with practitioners, heuristics for identification of concepts - "identification heuristics", structuring - "structuring heuristics" and validation - "validation heuristics" of the specification. For each combination of two approaches we have identified "complementarity" heuristics for checking consistency between them (Niskier & Maibaum 1989).

In determining which concepts should appear in a software specification, identification heuristics are useful. They act as filters over the multitude of possible choices from the application domain. Identifying concepts is a critical step as, once they are identified and used in building the description, the boundaries of the specification are set and difficult to change.

Structuring heuristics capture past experience in the application of the method to specific problems. They make use of syntactic properties to characterise unsatisfactory situations in a specification, such as those caused by minor local inconsistencies, and provide advice on how these can be overcome.

Validation heuristics define "interesting" properties of specifications in the form of natural language paraphrases. A specification may be syntactically correct, but may not have some desired property, or may have some undesired property, indicating a possible misunderstanding. The validation heuristics suggest how the specification can be "read", pinpointing such problems.

Complementarity heuristics are the most important for a "multiple view" specification. They act as guide-lines in assuring the joint consistency of different descriptions and suggest ways of verifying that some properties in one view are correctly represented in other views.

The PRISMA tool set provides knowledge-based expert assistance in the construction of a software specification. It provides the user with three sets of tools for each view: the agenda, the paraphraser and the complementarity checker.

The agenda mechanism delivers the advice-giving heuristics. At any stage, during a view construction, the user-specifier can, through a tasks window, ask questions about the remaining tasks to be performed and be provided with an advice window suggesting how to accomplish each task. The paraphraser makes use of the set of validation heuristics to generate sentences in natural language. A paraphrases window is produced whenever the user wants to check the current state of a view in order to see if its properties satisfy his intuitions. If the user-specifier is temporarily satisfied with the contents of the current view, a new view can be selected.

Once a new view is selected, the control mechanism automatically switches to the complementarity checker, which presents a set of "hints", in a complementarity checks window, warning of relations between properties of the current view and properties in the new view. This tool can be used in two ways, as an aid in constructing a new view from the previous one and as support for checking consistency between two parts of a specification.

PRISMA has been tested on many medium sized case studies. The results have been satisfactory, giving us confidence that the combination of multiple representations with effective heuristics provides a useful basis for incremental acquisition of software specifications. We believe that the PRISMA model can be easily extended.

4 Tool Support for Requirements Engineering

4.1 Motivation

Given that we can construct suitable requirements engineering methods, how, precisely, do we support these with automated tools? The straightforward answer is that we need diagram editing and consistency checking of the type currently embodied in CASE tools. This, however, only supports a part of what we understand by methods, namely the deployment of the representation schemes. Much more challenging, from a research perspective, is how we support the work plan and the heuristics.

4.2 Progress

The TARA (Tool Assisted Requirements Analysis) project is a collaborative research project supported by the Rome Air Development Centre of the United States Air Force. Its objective is to examine how tool support for requirements engineering can be extended (Kramer et al 1987). Specifically it has been working in three areas:

tool support for animation of specifications;

tool support for reuse of requirements specifications;

tool support for method guidance.

The TARA approach is that these issues are best investigated in the context of mature requirements specification techniques and a reasonably stable tool support base. Put baldly, we were reluctant to spend time creating the underlying graphic tools or providing support for a method (such as SCS) which was as yet unproven. Consequently we adopted the widely used formatted requirements method CORE and The Analyst (Stephens & Whitehead 1985), a CORE support environment combining diagram construction/editing and basic consistency checking. They represent what we regard as the established "state-of-the-art" in requirements analysis and form a good point from which to ask, and hopefully answer, the important questions about extending tool support for requirements engineering. The progress we have made has, we believe, application equally to formal and formatted requirements engineering techniques.

From the standpoint of this paper we will concentrate on method guidance; however, a brief look at animation and reuse is worthwhile.

An animator has been developed which provides facilities for the selection and execution of a transaction to reflect the specified behaviour given a particular scenario. Actions are described in terms of input-output relations. Simple rules can be specified to control the execution of actions. Facilities are provided to replay and interact with transactions. Reuse is supported by tools for identifying candidate transactions from a reuse database. The search strategies provided include browsing in an inheritance structure, different levels of pattern matching, causal chain matching (matching of the underlying control structures), and purpose matching. Support is then provided for the allocation of the selected fragment to the target environment.

Method guidance was based on a formal model of the CORE work plan. This model describes in detail the sequence of method steps that should be followed and the heuristics associated with particular steps. The model is directly interpreted by the TARA tools to provide advice and reasoning. It is used in conjunction with the rules used for consistency checking to provide remedial advice. This advice is provided directly, that is, in the context of the particular task the user is engaged in. The model can be tailored to accommodate different styles of method use and the advice can be tailored to accommodate different groups of users. Both reuse and animation tools are integrated within the common method guidance framework.

An example of the use of such method guidance might be where a new object has been introduced into a specification at a late stage. Initially inconsistencies with earlier steps would be notified to the developer. If these were ignored they would be flagged, by posting a note (analogous to the use of small yellow post-it notes on paper) on the diagram or object, and the developer would be informed of the best stage to correct these inconsistencies. Priorities associated with each further step that can be taken by the developer and consequences of taking alternative courses of action are presented in a graphic form. A variety of progress summaries and report formats are provided.

A "first cut" method model was built in Modal Action Logic (as discussed in 2.1 above). M[A]L has proved a very good formalism in which to model methods. The model was used to evaluate and refine the description of the method. On completion the model was hand translated into Prolog for direct execution within the tool set.

The TARA approach has been tested in conjunction with the Analyst workstation. A major case study, the ASE (Advanced Sensor Exploitation) test environment, has been analysed and specified using CORE, the Analyst, and the tools we have developed. The TARA work has developed a better understanding of how methods can be supported by tools and how these methods can be modelled.

5 Modelling Requirements Elicitation

5.1 Motivation

The construction and use of improved methods to support requirements engineering depends on a better understanding of the underlying activities, notably requirements elicitation and its flip-side, requirements validation. Ideally such an understanding should be presented in the form of a model which is:

formal, that is constructed within representation schemes with precise, unambiguous and analysable semantics;

explicit, that is capable of being directly examined and manipulated;

enactable, that is executable, interpretable or amenable to automated reasoning.

The benefits of doing this are that the model can, over and above the improved understanding, be used as a basis for information system development environments (Kaiser, Feiler & Popovich 1988); in provision of active method guidance, as in TARA above; and in meta-programming and conducting development analysis (Kellner & Hansen 1989).

5.2 Progress

The objective of the IC~DC project is to model requirements elicitation, more specifically requirements specification from multiple "points of view" (Finkelstein & Fuks 1989).

Requirements engineering is typically an activity in which there are many participants - clients, systems analysts, engineers, domain experts and so on. Each has differing perspectives on, and knowledge about, the object system, as well as a variety of skills, roles and so on. In some cases the perspectives may be based on underlying contradictions. To construct a requirements specification the participants must cooperate; that is, contribute to the achievement of a joint understanding - we term this specification from multiple points of view.

This contrasts with the approach taken by existing requirements engineering methods and tools which are generally based on specification from a single point of view and refined using examples that consolidate this weakness.

Our research goal within the IC~DC project is to develop a formal understanding of specification from multiple points of view so that we can both support the construction of formal requirements specifications and reason about the process of specification itself. Our aim is to encapsulate cooperative requirements elicitation strategies, "replay" these strategies (Wile 1983) and develop appropriate support and coordination tools. To do so we have taken what might be broadly termed an AI approach - we have sought to model the mechanisms which underlie the way people carry out the complex task of elicitation.

Our model of specification from multiple points of view treats the development of a requirements specification as a conversation in which the points of view, treated as agents, negotiate, establish responsibilities and cooperatively construct an overall specification. The model deploys some formal apparatus - dialogue logics - taken from work on the foundations of logic (Mackenzie 1981) and an approach - cooperation and negotiation - taken from work on distributed artificial intelligence.

The model is presented in the form of rules that:

describe the basis of a well formed conversation;

establish the relation between actions - locution acts - and attitudes - commitments;

define, syntactically, the form of reasoning permissible within the conversation and common to the points of view.

Various development strategies such as simple refinement, transformation and verification can be described. A description of the conversation in which the specification is constructed can be replayed to generate an explanation of (or rationale for) the specification.

We have developed a detailed understanding of, and automated support for, cooperation and negotiation of two points of view and laid the groundwork for N-party cooperation. The model has been validated by experience on a large number of small examples. It is important to emphasise, however, that our model is very sparse. Because our work is based on formal models of argumentation, there are practical limitations in both the underlying language and the conversational strategies we can capture.

This work is still largely in progress. Our immediate aim is to continue work revising and extending the model, including generalising dialogue strategies, with the long term objective of developing a full requirements engineering environment and supporting methods based on the ideas outlined above.

6 Concept of ViewPoint

6.1 Informal Introduction

In the sections above we have outlined research in which the concepts of view and more particularly of viewpoint have emerged as a central thread.

In the FOREST project we saw the need to find a better way of constructing methods for requirements formalisation. This led us to think of the representations tied to each method step as, in database terms, providing a "view" on the domain or on the specification information. Requirements could then be seen as elicited or presented for validation through such views.

The notion of views as partial specifications and as the principled basis for incremental construction of specifications has been fully developed in the PEACOCK and PRISMA projects. These projects have convinced us of the importance of combining representations.

The TARA research has given us a considerable respect for the method CORE. CORE is based round the notion of viewpoints, which are its primary structuring vehicle. A CORE viewpoint is "something that does things" and with which a "viewpoint authority", that is someone (or occasionally something - such as an operations manual) from whom information about the viewpoint can be elicited, is associated. A well known heuristic for identifying CORE viewpoints is "a viewpoint is something you can pretend to be"; thus, for example, it is easy to imagine yourself as a lift passenger or scheduler while it is considerably less easy to imagine yourself as a floor.

The concept of point of view on which the IC~DC work is based has been carried over from the TARA work. The significant enhancement to the concept of viewpoint brought out by the IC~DC project is the idea of a point of view as a "software development participant", that is as an active, autonomous and loosely coupled agent - in the distributed artificial intelligence style.

Other influences on the approach we have adopted are those of "selfish views" (Robinson 1989) and "contexts" in ERAE (Finkelstein & Hagelstein 1989).

Our current work is concerned with trying to develop a systematic characterisation of this concept. Before giving such a characterisation we will give a working definition which we will refine subsequently.

A ViewPoint is a loosely coupled, locally managed object which encapsulates knowledge about the process of software development and knowledge about the specification world.

6.2 Motivation

Because the concept of ViewPoints has largely grown out of existing research work it is essential for us to provide a structured motivation.

This motivation has four parts:

unifying models of software process and models of software structure;

developing an overarching structural framework for software development which incorporates requirements engineering;

supporting the use of multiple representation schemes;

providing a systematic basis for constructing and presenting methods.

6.2.1 Combining Software Process and Software Structure

As has been stated above requirements engineering involves many participants, experts in various aspects of software development and in various aspects of the application area. In the same way, each participant may have different roles, responsibilities and concerns which may change and shift as the requirements develop and evolve. Participants have knowledge which they want to bring to bear on the development of the requirements specification. This knowledge will generally complement that of the other participants but may also overlap, interlock and conflict.

This faces us with two groups of closely related problems:

With all these participants how can we guide and organise the process of software development? How do we assign and maintain responsibilities?

How can we allow each participant to see only that aspect or part of the "specification world" which is relevant to that participant's interests and responsibilities? Following from this, how can we ensure that, if each participant uses a "bespoke" representation for eliciting, presenting and determining properties of relevant parts of the specification world, potential inconsistencies and conflicts between different participants are noted and resolved?

These groups of problems are commonly treated separately - the first in so-called software process modelling languages, the second in specification language structuring schemes. We propose the use of ViewPoints as both an organising and a structuring principle in requirements engineering.

6.2.2 Developing a Structural Framework for Software Development

A well known difficulty, which arises with all approaches to structuring in software development, is that of "structural transformation". What appears an appropriate structure for carrying out requirements analysis is not suitable for design. What appears an appropriate structure for carrying out design is not suitable for construction and reuse and so on. We argue that ViewPoints provide a consistent structuring approach which accommodates all aspects of software development. In particular ViewPoints allow us to support the activities of requirements elicitation and formalisation which are generally ignored in conventional approaches to software development.

6.2.3 Using Multiple Representations

Much effort has been devoted to developing ever richer and more sophisticated formal representation schemes. On the surface this appears to be a worthwhile enterprise - if a representation scheme is made more expressive the task of elicitation should, in theory, become easier. This has however not proved to be the case:

the learning overhead in the use of these schemes is significant;

the development of such schemes is extremely difficult, in particular developing sound and adequate verification or proof schemes;

such schemes are often very different from the conventional (and reasonably well understood) schemes used in software engineering practice and consequently pose difficulties for technology transfer;

the richer the representation scheme the easier it is, in principle, to write baroque and unreadable descriptions;

a more expressive representation scheme makes validation of complex properties of a description theoretically possible, for example, generation of consequences using formal reasoning, but makes simple validation by inspection more difficult;

no single person may want, or be able to, use the full expressive power of the representation scheme.

The alternative to the use of these more complex formal representation schemes is the use of many simpler representation schemes. This approach, which we prefer, requires careful coordination of the way in which descriptions are constructed in these schemes, the way moves are made between schemes, and the way in which the schemes are related and information is managed. This coordination can be carried out by ViewPoints.

6.2.4 Building Methods Systematically

It may be observed that there are close parallels between our motivations above and what practitioners have sought to achieve in methods. Methods, in the strict sense of the term - the collection and packaging of software development knowledge - have commonly been overlooked in current computer science in favour of representation techniques or development processes and paradigms.

Methods crudely attempt to combine software process with software structure by breaking down a "work plan" into steps and stages and associating these with elements in a (generally functional) decomposition. Methods aim at providing systematic coverage of software development activities. Methods provide organised collections of simple representation schemes which are closely related and provide guidance, integrated with a work plan, for moving between these schemes.

This close relation suggests that ViewPoints can be used as a structured means of presenting a method and managing method-derived information.

6.3 Characterisation

A more detailed description of ViewPoints can be divided into two parts:

the static structure of ViewPoints - the knowledge they encapsulate and the structure of ViewPoint "configurations";

the dynamic structure of ViewPoints - how they evolve and relate to each other.

The static and dynamic structure are, to a certain extent, interdependent and it may be necessary in describing one to refer to the other. In this paper we will concentrate on the static structure.

A ViewPoint (we use the distinctive capitals to denote our interpretation) consists of the following parts:

a style, the representation scheme in which the ViewPoint expresses what it can see (examples of styles are data flow analysis, entity-relationship-attribute modelling, Petri nets, equational logic, and so on);

a domain, which defines which part of the "world" delineated in the style (given that the style defines a structured representation) can be seen by the ViewPoint (for example, a lift-control system would include domains such as user, lift and controller);

a specification, the statements expressed in the ViewPoint's style describing particular domains;

a work plan, how and in what circumstances the contents of the specification can be changed;

a work record, an account of the current state of the development.

ViewPoints are organized in configurations which are collections of related ViewPoints. A ViewPoint template consists of a ViewPoint in which only the style and the work plan have been defined. A method is a set of ViewPoint templates and their relationships, together with actions governing their construction and consistency.
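The static structure just described can be pictured as a small data structure. The following is a minimal sketch, assuming that each of the five parts can be held as a string or a list of strings; these field types, the `is_template` check and the `instantiate` helper are illustrative assumptions and not part of the characterisation itself. The names ST, DF, LDS, LDDF and US are those used in the library example of this paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ViewPoint:
    """The five parts of a ViewPoint named in the text."""
    style: str                                                # representation scheme
    work_plan: List[str]                                      # permitted development actions
    domain: str = ""                                          # part of the world that is seen
    specification: List[str] = field(default_factory=list)   # statements in the style
    work_record: List[str] = field(default_factory=list)     # account of the development

    def is_template(self) -> bool:
        """A template has only its style and work plan defined."""
        return bool(self.style and self.work_plan) and not (
            self.domain or self.specification or self.work_record
        )


def instantiate(template: ViewPoint, domain: str) -> ViewPoint:
    """Create a ViewPoint for a domain from a template (hypothetical helper)."""
    if not template.is_template():
        raise ValueError("expected a ViewPoint template")
    return ViewPoint(template.style, list(template.work_plan), domain=domain)


# Two templates and a configuration of three ViewPoints built from them.
ST = ViewPoint("state transition analysis", ["identify_transitions", "check_consistency"])
DF = ViewPoint("data flow analysis", ["identify_data_flows", "check_consistency"])
configuration = {
    "LDS": instantiate(ST, "library desk"),
    "LDDF": instantiate(DF, "library desk"),
    "US": instantiate(ST, "library user"),
}
```

In this reading a method would be the pair of templates together with the relationships between them; the sketch leaves those relationships out because they belong to the dynamic structure.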

7 An Outline of a Simple Example

To understand what this means in practice let us look at a small example in which we outline a fragmentary ViewPoint configuration. Our example is based on a small library system, details of which we will introduce as they become relevant. Our configuration consists of three ViewPoints: LDS (library desk, state transition analysis), LDDF (library desk, data flow analysis) and US (library user, state transition analysis).

In order to explain these, we first develop two ViewPoint templates, ST and DF, which allow us to conduct state transition analysis and data flow analysis respectively. These templates will then be used to build a simple method (which we call NYCE). The method will be deployed in the library example. We will leave the expansion of the example, say into one where we have a fourth ViewPoint such as UD (library user, data flow analysis), as an exercise for the reader.

7.1 State Transition Analysis

In this sub-section we describe the ViewPoint template for the representation and development of a system (or part of a system) using state transition diagrams. It is simplified by excluding such features as hierarchical decomposition. Initially we outline the style, which gives us the language in which to capture states and transitions. We follow this with an outline of the work plan, which is described in terms of actions which may be applied to state transition diagrams and axioms describing the relations between those actions. We use the notation from M[A]L in that axioms are expressions of the form P → [a] Q. This should be read as 'if condition P holds, then after action a is performed condition Q holds'.
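The reading of an axiom P → [a] Q can be made concrete with a minimal sketch. It treats conditions as atoms in a set, lets an action add Q whenever P already holds, and assumes that conditions persist once established. That persistence rule and the `advisable` helper are simplifying assumptions of this sketch and not part of M[A]L. The three axioms are the first three work plan axioms of Table 1.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Axiom(NamedTuple):
    pre: FrozenSet[str]   # P: atoms that must hold (the empty set stands for "true")
    action: str           # a
    post: str             # Q: atom that holds after the action


AXIOMS = [
    Axiom(frozenset({"empty_diagram"}), "identify_boundary_states",
          "boundary_states_identified"),
    Axiom(frozenset({"boundary_states_identified"}), "identify_internal_states",
          "all_states_found"),
    Axiom(frozenset(), "identify_transitions", "all_transitions_found"),
]


def perform(state: Set[str], action: str) -> Set[str]:
    """Return the conditions holding after the action, per P -> [a] Q."""
    new_state = set(state)  # assumption: established conditions persist
    for axiom in AXIOMS:
        if axiom.action == action and axiom.pre <= state:
            new_state.add(axiom.post)
    return new_state


def advisable(state: Set[str]) -> List[str]:
    """Actions whose precondition holds and whose result is still missing."""
    return [a.action for a in AXIOMS if a.pre <= state and a.post not in state]


start = {"empty_diagram"}
first_advice = advisable(start)
after = perform(start, "identify_boundary_states")
```

A guidance tool in this style can offer `advisable(state)` as the list of sensible next steps; performing an action whose precondition does not hold establishes nothing.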

The representation scheme for state transition analysis is presented in terms of annotated directed graphs. In the specifications which follow we will, for clarity of exposition, use the graphic representation.

ViewPoint Template Description: state transition analysis

Style given by ST = (State, Trans)

Component | Symbol | Definition
States | State | Set of nodes State of symbols (represented by circles), each denoting a state
Transitions | Trans | Set Trans of labelled edges given by E, L, T
Transition names | T | T is a set of symbols denoting transition names, with T ∩ State = ∅
Edges | E | with E ⊆ State × State
Labelling | L | with L : E --> T

Work plan: actions

basic actions | add_state, remove_state, add_transition, remove_transition
basic work plan actions | identify_boundary_states, identify_internal_states, identify_transitions, check_consistency
heuristics | more_than_one_transition_per_state_pair?, more_than_max_states?, ...

Work plan: axioms

1) empty_diagram --> [identify_boundary_states] boundary_states_identified
2) boundary_states_identified --> [identify_internal_states] all_states_found
3) true --> [identify_transitions] all_transitions_found
4) all_states_found ∧ all_transitions_found --> [check_consistency] got_nice_ST_diagram ∨ ST_inconsistencies

Table 1: Style and work plan for state transition analysis
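The style slot of Table 1 can be read as a small data structure. The following is a minimal sketch, not part of the original template: the class name `STDiagram`, its method signatures and the example edge endpoints are assumptions made for illustration. Only the constraints E ⊆ State × State, L : E --> T and T ∩ State = ∅ and the four basic action names come from the table.

```python
from dataclasses import dataclass, field


@dataclass
class STDiagram:
    """Sketch of the style ST = (State, Trans); names here are illustrative."""
    states: set = field(default_factory=set)        # State
    labelling: dict = field(default_factory=dict)   # L : E --> T, keyed by edge (source, target)

    @property
    def edges(self):
        """E, a subset of State x State."""
        return set(self.labelling)

    @property
    def transition_names(self):
        """The transition names from T that are actually used as labels."""
        return set(self.labelling.values())

    def add_state(self, state):
        if state in self.transition_names:
            raise ValueError("T and State must be disjoint")
        self.states.add(state)

    def remove_state(self, state):
        self.states.discard(state)
        # Drop edges that would otherwise leave E outside State x State.
        self.labelling = {e: t for e, t in self.labelling.items() if state not in e}

    def add_transition(self, source, target, name):
        if source not in self.states or target not in self.states:
            raise ValueError("E must be a subset of State x State")
        if name in self.states:
            raise ValueError("T and State must be disjoint")
        self.labelling[(source, target)] = name

    def remove_transition(self, source, target):
        self.labelling.pop((source, target), None)

    def is_well_formed(self):
        in_range = all(s in self.states and t in self.states for s, t in self.edges)
        return in_range and self.transition_names.isdisjoint(self.states)


# Usage: the edge endpoints below are guesses, chosen only to exercise the actions.
desk = STDiagram()
for s in ("presented", "checked", "on_loan"):
    desk.add_state(s)
desk.add_transition("presented", "checked", "check")
desk.add_transition("checked", "on_loan", "loan")
```

Because the labelling is a function on edges, this representation admits at most one transition per ordered state pair, which is one possible reading of the heuristic more_than_one_transition_per_state_pair?.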

7.2 Data Flow Analysis

In a similar way to that described above, we can describe the ViewPoint template for data flow analysis. This is based on a simple version of data flow diagrams (excluding refinement and data dictionaries). The properties of a system (or part of a system) are described as a collection of functions, stores and terminals which are connected by data flows.

The representation scheme for data flow analysis is presented in terms of annotated directed graphs. In the specifications which follow we will, for clarity of exposition, use the graphic representation.

ViewPoint Template Description: data flow analysis

Style given by DF = (Term, Func, Store, Data_flow)

Component | Symbol | Definition
Terminals | Term | Set of nodes Term of symbols (represented by squares), each denoting a terminal node
Functions | Func | Set of nodes Func of symbols (represented by circles) denoting function nodes
Stores | Store | Set of nodes Store of symbols (represented by two parallel horizontal lines) denoting data stores. Term, Func and Store are pairwise disjoint; let the set N of graph nodes be N = Term ∪ Func ∪ Store
Data flows | Data_flow | Set Data_flow of labelled edges given by E, L, F
Data flow names | F | F is a set of symbols denoting data flow names, with F ∩ N = ∅
Edges | E | with E ⊆ N × N
Labelling | L | with L : E --> F

Work plan: actions

basic actions | add_node(type), remove_node, add_data_flow, remove_data_flow
basic work plan actions | identify_terminal_nodes, identify_function_nodes, identify_data_store_nodes, identify_data_flows, check_consistency
heuristics | more_than_max_functions?, ...

Work plan: axioms

1) empty_diagram --> [identify_terminal_nodes] terminal_nodes_identified
2) empty_diagram --> [identify_function_nodes] function_nodes_identified
3) empty_diagram --> [identify_data_store_nodes] data_store_nodes_identified
4) true --> [identify_data_flows] all_data_flows_found
5) terminal_nodes_identified ∧ function_nodes_identified ∧ data_store_nodes_identified ∧ all_data_flows_found --> [check_consistency] got_nice_DF_diagram ∨ DF_inconsistencies

Table 2: Style and work plan for data flow analysis
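The work plan axioms of Table 2 can be animated with a few lines of code. The sketch below is an illustration under two loudly stated assumptions that are not in the source: conditions are treated as facts that accumulate and are never retracted, and the disjunctive postcondition of axiom 5 is collapsed into a single hypothetical fact `consistency_checked`. The helper names `perform` and `run` are likewise invented.

```python
# Each axiom P --> [a] Q is stored as (P, a, Q); P and Q are sets of condition names.
DF_AXIOMS = [
    ({"empty_diagram"}, "identify_terminal_nodes", {"terminal_nodes_identified"}),
    ({"empty_diagram"}, "identify_function_nodes", {"function_nodes_identified"}),
    ({"empty_diagram"}, "identify_data_store_nodes", {"data_store_nodes_identified"}),
    (set(), "identify_data_flows", {"all_data_flows_found"}),   # precondition "true"
    ({"terminal_nodes_identified", "function_nodes_identified",
      "data_store_nodes_identified", "all_data_flows_found"},
     "check_consistency",
     {"consistency_checked"}),   # assumption: stands for the disjunction in axiom 5
]


def perform(facts, action, axioms):
    """Return the facts known after `action`: every axiom whose P holds contributes its Q."""
    gained = set()
    for pre, act, post in axioms:
        if act == action and pre <= facts:
            gained |= post
    return facts | gained


def run(plan, facts, axioms):
    """Apply the actions of `plan` in order, starting from the given facts."""
    for action in plan:
        facts = perform(facts, action, axioms)
    return facts


plan = ["identify_terminal_nodes", "identify_function_nodes",
        "identify_data_store_nodes", "identify_data_flows", "check_consistency"]
after_plan = run(plan, {"empty_diagram"}, DF_AXIOMS)
too_early = run(["check_consistency"], {"empty_diagram"}, DF_AXIOMS)
```

Running the full plan establishes the consistency fact, whereas invoking check_consistency first guarantees nothing, because the precondition of axiom 5 does not yet hold.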

7.3 The NYCE Method

We now develop the NYCE (Not Yet Completed Example) method which consists of the ViewPoint templates ST and DF shown above. To do this we need to consider the relations between ViewPoints of the same domain but based on a different template and the relations between ViewPoints based on the same template but different domain (we need, for example, to define the relations between two state transition analyses each representing a different part of the system). Figure 1 illustrates the relations which we will need to define in our example.

[Diagram not reproduced. It relates three ViewPoints: domain Library_Desk with ViewPoint template ST, domain Library_Desk with ViewPoint template DF, and domain User with ViewPoint template ST.]

Figure 1: Library example, relations between ViewPoints

It should be noted that the art of developing a method is not to make all representations equivalent. We are not in the business of simply expressing the same properties of a system in another style but rather providing a combination of ViewPoint templates that give "real" complementarity.

We use the same structure that we have developed for ViewPoint templates to describe the overall method. This may at first seem confusing but it allows us to give a systematic presentation. We distinguish a method from a conventional ViewPoint template by its capacity to create ViewPoints and to establish relations between multiple ViewPoints.

We refer to the various components (functions, stores, and so on) within a ViewPoint template (such as that for data flow analysis) by the notation <ViewPointTemplateName>.<componentname>. By writing DF".Data_flow we refer to the symbols that denote data flow names of the ViewPoint DF". Multiple ViewPoints based on the same ViewPoint template are distinguished by primes/dashes (DF, DF").

Tables 3 and 4 show the style and work plan slots respectively for the method NYCE.

Style: method NYCE

Relation | Definition | Informal Description
Trigger | Trigger ⊆ ST.Trans × DF.Func | A transition in a state transition analysis may correspond to a function in a data flow analysis. In this case the occurrence of such a transition is seen as a trigger for the corresponding function.
Data_Condition | Data_Condition ⊆ ST.State × (DF.Store ∪ DF.Data_flow.F) | A state in a state transition analysis may correspond to a data flow or a store in a data flow analysis.
Same_State | Same_State ⊆ ST.State × ST'.State | Two state transition analyses may correspond to each other in terms of their respective states. This defines the states which are in common between two different state transition analyses.

Table 3: Style slot for method NYCE

The first two of these relations are shown in Figures 2 & 3 below. States are represented as circles and transitions as arrows, while in data flow diagrams functions are represented as circles and data flows as arrows respectively.

[Diagram not reproduced. It links a transition in a state transition analysis to a function in a data flow analysis.]

Figure 2: A possible relation between transitions and functions

[Diagram not reproduced. It links a state in a state transition analysis to a data flow in a data flow analysis.]

Figure 3: A possible relation between states and data flows

As we emphasized before, it is not necessarily the case that for every data flow there is a corresponding state or vice versa. If so, the two representation schemes would degenerate to a common representation.

Work plan: actions

basic actions | create_DF_ViewPoint(DOMAIN, NAME), create_ST_ViewPoint(DOMAIN, NAME), list_relevant_domains --> Set_of_Domain, check_consistency(NAME), relate_ViewPoints_of_same_domain(DOMAIN, Set_of_NAME), relate_ST_ViewPoints(Set_of_NAME), relate_DF_ViewPoints(Set_of_NAME)
heuristics | any_ViewPoint_too_complex?, possible_conflicts_made_explicit?, resolve_conflict, ...

Work plan informal

1. list_all_relevant_domains
2. for all domains perform in parallel create_ST_ViewPoint and create_DF_ViewPoint
3. check_consistency locally for each ViewPoint created in step 2
4. relate ViewPoints pairwise by either relate_ViewPoints_of_same_domain, relate_ST_ViewPoints or relate_DF_ViewPoints respectively
5. if any conflict found then resolve_conflict and start with step 3 again
6. if any_ViewPoint_too_complex then try to split the ViewPoint and start with step 3 again

Table 4: Work plan for method NYCE

7.4 The Library ViewPoints

In a library there are many people that play a part: users, librarians, inventory clerks, purchasers, and so on. In our example we will look at two parts of the library: the library desk (effectively the librarian's perspective) and the library user (Figure 4). Users take a book from the shelves, present it at the desk and, depending on the status of the book, it will be either lent to the user or sent to the part of the desk where reserved books are kept. A user then reads the book and returns it to the desk where it will, depending on the state of the book, be either given to someone else to process as a returned book or kept in the special place for reserved books.

Figure 4: Simple library

7.4.1 Library desk, state transition analysis ViewPoint: LDS

The style and work plan of LDS are given by the ViewPoint template ST. We are interested in filling in the components of the ViewPoint not already covered by the template description. These are domain, specification and work record.

The domain of LDS is the library desk of the simple library. The ViewPoint cannot see states such as on_order or finished which are relevant only to the purchase department and library user respectively. The domain defines the boundaries of the knowledge encapsulated by the ViewPoint.

domain: Library_desk

sees states: presented, on_loan, checked, removed_from_desk, reserved

The specification of LDS is shown below in Figure 5.

[Diagram not reproduced; the legible state labels are presented, checked and on_loan.]

Figure 5: State transition analysis specification of library desk domain

The actions which can be performed are given in the ViewPoint template description (add_state and add_transition were performed a number of times to arrive at this specification). The various occurrences of these actions are recorded in the work record.

7.4.2 Library desk, data flow analysis ViewPoint: LDDF

As in LDS above, the style and work plan are provided by the ViewPoint template. The remainder can be completed as follows:

domain: Library_desk

sees functions: release, check, lend

sees stores: reserved, released_books

The specification of LDDF is shown below in Figure 6.

[Diagram not reproduced; the legible labels include the functions check and lend and the data flows book and checked_book.]

Figure 6: Data flow analysis specification of the library desk domain

7.4.3 Library user, state transition analysis ViewPoint: US

The style of US is the state transition analysis scheme given by ST.

The domain of US is the library user (from whom the internal workings of the library desk are hidden).

domain: Library User

sees states: presented, on_loan, finished, on_shelf

The specification of US is shown below in Figure 7.

[Diagram not reproduced; the legible labels are the states presented, on_loan, on_shelf and finished and the transitions take_to_desk and read.]

Figure 7: State transition analysis specification of library user domain.

The work plan and work record should be obvious.

7.4.4 Relations between the ViewPoints LDS, LDDF and US

In Tables 3 & 4, which define the method NYCE, we have set out the rules governing the relation between state transition analysis and data flow analysis. Thus Trigger and Data_Condition relate ViewPoints of the same domain. In the case of ViewPoints of different domains based on the same ViewPoint template we define the relation called Same_State to express the overlap of two state transition analyses. To capture the relation between the actual ViewPoints we give instances of these relations.

The relationship between the ViewPoints LDS and LDDF is defined by giving an instance for the relations of type Trigger and Data_Condition respectively, since these ViewPoints are of the same domain but different template, as for example in Table 5 below.

LDS-LDDF.Trigger (⊆ LDS.Trans × LDDF.Func)

LDS.Trans | LDDF.Func
check | check
loan | lend
release | release

Table 5: Trigger relationship for the ViewPoints LDS & LDDF

This states that the transitions check, loan and release correspond to the functions check, lend and release of the data flow analysis.

The other correspondence relation describes the data in more detail. An instance of the relation Data_Condition is given below in Table 6.

LDS-LDDF.Data_Condition (⊆ LDS.State × (LDDF.Store ∪ LDDF.Data_flow))

LDS.State | LDDF.Store ∪ LDDF.Data_flow
presented | book
checked | checked_book
reserved | reserved
removed_from_desk | released_book
on_loan | loaned_book

Table 6: Data_Condition relationship for the ViewPoints LDS & LDDF

These relations (Tables 5 & 6) define the close relationship of the two library desk ViewPoints and link the state transition perspective with a functional perspective on this domain.

Clearly we must also describe the relationship between the ViewPoints LDS and US. That is, we need to say how different parts of the system (the library desk and the library user) overlap. We must define their respective roles within the library world. We can do this by giving an instance of the relation Same_State as in Table 7.

LDS-US.Same_State (⊆ LDS.State × US.State)

LDS.State | US.State
presented | presented
on_loan | on_loan

Table 7: Same_State relationship for ViewPoints LDS & US

This relation establishes that the overlapping states are presented and on_loan (the user does not see the internal workings of the library and vice-versa).
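The relation instances of Tables 5 to 7 can be checked mechanically against the definitions in Table 3. The sketch below is illustrative only. The sets of states, functions and stores are copied from the domain declarations above, while `LDS_TRANS` and `LDDF_DATA_FLOW` are assumptions inferred from the entries of Tables 5 and 6, since the diagrams themselves list these names. The helper `is_relation_between` is a hypothetical name.

```python
LDS_STATE = {"presented", "on_loan", "checked", "removed_from_desk", "reserved"}
LDS_TRANS = {"check", "loan", "release"}            # assumption: names taken from Table 5
LDDF_FUNC = {"release", "check", "lend"}
LDDF_STORE = {"reserved", "released_books"}
LDDF_DATA_FLOW = {"book", "checked_book", "released_book", "loaned_book"}  # assumption: from Table 6
US_STATE = {"presented", "on_loan", "finished", "on_shelf"}

TRIGGER = {("check", "check"), ("loan", "lend"), ("release", "release")}
DATA_CONDITION = {
    ("presented", "book"),
    ("checked", "checked_book"),
    ("reserved", "reserved"),
    ("removed_from_desk", "released_book"),
    ("on_loan", "loaned_book"),
}
SAME_STATE = {("presented", "presented"), ("on_loan", "on_loan")}


def is_relation_between(pairs, left, right):
    """True if `pairs` is a subset of the Cartesian product left x right."""
    return all(a in left and b in right for a, b in pairs)


trigger_ok = is_relation_between(TRIGGER, LDS_TRANS, LDDF_FUNC)
data_condition_ok = is_relation_between(
    DATA_CONDITION, LDS_STATE, LDDF_STORE | LDDF_DATA_FLOW)
same_state_ok = is_relation_between(SAME_STATE, LDS_STATE, US_STATE)

# The states named by Same_State are exactly those visible to both ViewPoints.
shared_states = {a for a, _ in SAME_STATE}
```

A check of this kind is one conceivable building block for the relate_ViewPoints actions of Table 4, although the source does not prescribe how those actions are realised.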


8 Conclusions

In this paper we have shown how work on methods to support requirements formalisation, incremental development of formal specifications, tool support for requirements expression and modelling requirements elicitation has led us to an understanding of the central role which viewpoints play in requirements engineering. We have suggested that ViewPoints might provide a basis for unifying models of software process and models of software structure, for developing an overarching structural framework for software development which incorporates requirements engineering, for supporting the use of multiple representation schemes, and for providing a systematic basis for constructing and presenting methods. We have attempted a systematic characterisation of ViewPoints and illustrated this with a small example.

The work on ViewPoints which this paper reports is in its early stages and requires considerable further work. Our final aim is to complement our intuitive use of ViewPoints with a comprehensive formal description. We are investigating the use of M[A]L as a suitable base for such a description. We also hope to develop requirements engineering support tools based on the ViewPoint approach.

Our short term goals include developing descriptions, in the ViewPoint style, of a repertoire of standard software development methods such as SSADM and JSD. Other short term goals include the development of a ViewPoint based method for developing reconfigurable and extensible distributed systems.

Acknowledgements

The authors would like to thank their colleagues and students, in particular Tom Maibaum, Jeff Magee and Hugo Fuks, for the lively critical discussion which has contributed significantly to the work this paper reports. Celso Niskier is supported by the Brazilian National Research Council CNPq, grant 20.2518/86-CC. Michael Goedicke is a Visiting Research Fellow on leave from the University of Dortmund, FRG.

References

Cunningham J., Finkelstein A., Goldsack S., Maibaum T. & Potts C. (1985); Formal Requirements Specification - The Forest Project; Proc. 3rd International Workshop on Software Specification & Design; pp 186-191, IEEE CS Press.

Finkelstein A. & Fuks H. (1989); Multi-Party Specification; Proc 5th International Workshop on Software Specification & Design; pp 185-196, IEEE CS Press [Also ACM Software Engineering Notes May 1989].

Finkelstein, A. & Hagelstein, J. (1989); Formal Frameworks for Understanding Information System Requirements Engineering: a research agenda; [To appear] Nijssen, S. & Twine, S.(Eds) IFIP CRIS Review Workshop; North-Holland.


Finkelstein, A. & Potts, C. (1987); Building Formal Specifications Using "Structured Common Sense"; Proc. 4th International Workshop on Software Specification & Design; IEEE CS Press.

Goedicke M., Ditt W., Schippers H. (1989); The Π-Language Reference Manual, Research Report No 295 1989, Department of Computer Science, University of Dortmund.

Goedicke, M. (1989); Paradigms of Modular Software Development (to appear) Mitchell R.J. (Ed); Managing Complexity in Software Engineering; Peter Peregrinus, Stevenage, England.

Kaiser G., Feiler P., Popovich S. (1988); Intelligent Assistance for Software Development and Maintenance; IEEE Software, May 1988, pp 40-49.

Kellner M., Hansen G. (1989); Software Process Modelling: a case study; Proc. Hawaii International Conference on System Sciences - 22, V2, pp 175-188; IEEE CS Press.

Khosla S., Maibaum T. (1989); Time, Behaviour and Function; (In) Barringer H., Proc. Colloquium on Temporal Logic & Specifications (Handbook of Temporal Logic); LNCS Springer Verlag.

Kramer J. & Magee J. (1985); Dynamic Configuration for Distributed Systems; IEEE Trans. Software Engineering; SE-11,4, pp 424-436.

Kramer J., Finkelstein A., Ng K., Potts C. & Whitehead K. (1987);"Tool Assisted Requirements Analysis: TARA final report"; Imperial College, Dept. of Computing, Technical Report 87/18.

Mackenzie, J. (1981); The Dialectics of Logic; Logique et Analyse, V24, pp 159-177.

Mullery, G. (1985); Acquisition - Environment; (In) Paul, M. & Siegert, H. "Distributed Systems: Methods and Tools for Specification"; Springer Verlag LNCS 190.

Niskier C. & Maibaum T. (1989); Acquisition, Classification and Formalisation of Software Specification Heuristics; Proc. 3rd European Knowledge Acquisition Workshop.

Niskier C., Maibaum T. & Schwabe D. (1989); A Look Through PRISMA: towards knowledge-based environments for software specification; Proc 5th International Workshop on Software Specification & Design; pp 128-136, IEEE CS Press [Also ACM Software Engineering Notes May 1989].

Robinson W. (1989); Integrating Multiple Specifications Using Domain Goals; Proc 5th International Workshop on Software Specification & Design; pp 219-226, IEEE CS Press.

Stephens, M. & Whitehead, K. (1985); "The Analyst - A Workstation for Analysis and Design"; Proc 8th ICSE; IEEE CS Press.

Wile D. (1983); Program Developments: Formal Explanations of Implementations; CACM 26(11), pp 902-911.

Using Transformations to Verify Parallel Programs

Ernst-Rüdiger Olderog
Department of Computer Science, University of Oldenburg
2900 Oldenburg, Federal Republic of Germany

Krzysztof R. Apt
Centre for Mathematics and Computer Science
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
and
Department of Computer Sciences, University of Texas at Austin
Austin, TX 78712-1188, U.S.A.

Abstract

We argue that the verification of parallel programs can be considerably simplified by using program transformations. We illustrate this approach by proving correctness of two parallel programs under the assumption of fairness: asynchronous fixed point computation and parallel zero search.

1 Introduction

The aim of this paper is to show how program transformations can simplify the task of proving parallel programs with shared variables correct. To this end, we present four transformations all of which preserve partial and total correctness and fairness, and which consequently can be used in proofs of these correctness properties.

The first transformation links parallel programs to nondeterministic sequential ones. This is as in the work of Ashcroft and Manna [1971], Flon and Suzuki [1981] and, more recently, Back [1989] and Chandy and Misra [1988]. However, to avoid the introduction of auxiliary variables that would destroy the program structure, we present this transformation only for a restricted class of parallel programs.

To enhance the usefulness of this transformation, we combine it with two transformations on parallel programs which introduce more points of interference. These transformations are inspired by Lipton [1975]. Whereas Lipton considered only ordinary termination proofs, we deal here also with fairness.


Fair termination is proved on the level of nondeterministic programs by reducing it to ordinary termination with the help of a fourth transformation due to Apt and Olderog [1983] which makes use of random assignments.

Considered in isolation these transformations look very simple, but when combined they can substantially reduce the task of verification. This reduction is achieved by delaying the assertional correctness proof as much as possible, viz. until after a stepwise transformation of the original parallel program into a well-structured nondeterministic program. The proposed transformations can also be used to construct parallel programs from nondeterministic ones.

We illustrate our approach by proving total correctness of two parallel programs under the assumption of fairness: asynchronous fixed point computation and parallel zero search.

There are two alternatives to these correctness proofs. The first one is to use the transformational approach to fairness in parallel programs presented in Olderog and Apt [1988]. It calls for proving ordinary total correctness of a transformed parallel program simulating the fair computations of the original program. Another possibility is to first translate the original program directly into a nondeterministic program as in Flon and Suzuki [1981] and then use one of the available methods for proving correctness of a nondeterministic program under the assumption of fairness (see Francez [1986] for an overview).

In both cases the verification becomes extremely tedious and complicated because the transformations of Olderog and Apt [1988] and Flon and Suzuki [1981] introduce auxiliary variables that destroy the structure of the original program.

Besides the two parallel programs we also prove correctness of the program transformations themselves (except for the one taken from Apt and Olderog [1983]). These proofs appear in the appendix to our paper and are based on a simple operational program semantics due to Hennessy and Plotkin [1979].

2 Preliminaries

Throughout this paper we mean by a parallel program a program of the form

$S_0;\ [S_1 \| \ldots \| S_n]$

where each $S_i$ is a while-program. We call $S_0$ an initialization statement and each $S_i$ for $i > 0$ a component program. Within the component programs we additionally allow atomic regions. Syntactically, these are loop free while-programs enclosed in angle brackets $\langle$ and $\rangle$. Sometimes we write $[\|_{i=1}^{n} S_i]$ instead of $[S_1 \| \ldots \| S_n]$. Note that $S_1, \ldots, S_n$ may share variables.

Intuitively, an execution of $[S_1 \| \ldots \| S_n]$ is obtained by interleaving the atomic, i.e. non-interruptible, steps in the executions of the components $S_1, \ldots, S_n$. By definition, Boolean expressions, assignments, the skip statement and atomic regions are all evaluated or executed as atomic steps. As atomic regions are required to be loop free, their execution is guaranteed to terminate. An interleaved execution of $[S_1 \| \ldots \| S_n]$ terminates if and only if the individual execution of each component terminates.

For convenience, we identify

$\langle A \rangle \equiv A$

if $A$ is an assignment or skip.

A state is either a proper state, i.e. a mapping from variables to values, or a special symbol $\bot$ denoting divergence.

We consider here three semantics of parallel programs, all referring to an interleaving model of execution. Given a parallel program $S$ we distinguish:

• partial correctness semantics $\mathcal{M}[S]$,

• total correctness semantics $\mathcal{M}_{tot}[S]$,

• fair parallelism semantics $\mathcal{M}_{fair}[S]$.

In the partial correctness semantics, given an initial proper state, only the final proper states are recorded. In the total correctness semantics additionally a possibility of divergence is recorded as $\bot$. Finally, the fair parallelism semantics is like the total correctness semantics but only the fair computations are taken into account. A computation of a parallel program is called fair if each component that has not yet terminated is eventually activated again. In particular, every finite computation is fair.

For details concerning the semantics we refer to the appendix. Each of these three semantics induces a corresponding notion of program correctness. We thus distinguish between

• partial correctness $\models$,

• total correctness $\models_{tot}$,

• fair total correctness $\models_{fair}$.

Each of these correctness notions refers to a correctness formula, i.e. a construct of the form {p} S {q} where p and q are assertions and S a program. We assume from the reader some knowledge of the basic concepts on program verification.

3 Transformations

We now present four program transformations. The first of them transforms a nondeterministic program in the sense of Dijkstra [1975] into a parallel program. We study here only one level nondeterministic programs, i.e. programs of the form

$S \equiv S_0;\ \mathbf{do}\ \square_{i=1}^{n}\ B_i \rightarrow S_i\ \mathbf{od}$

where the subprograms $S_i$ are loop free while-programs. For these programs we refer to the same three semantics and program correctness notions as those introduced above. The notion of a fair computation is obtained here by considering enabled branches of a do-loop instead of nonterminated components of a parallel program.

Theorem 1 (Parallelization) Consider a one level nondeterministic program

$S \equiv S_0;\ \mathbf{do}\ \square_{i=1}^{n}\ B \rightarrow S_i\ \mathbf{od}$,

the parallel program

$T \equiv S_0;\ [\|_{i=1}^{n}\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od}]$

and two assertions $p$ and $q$. Suppose that for every $i \in \{1, \ldots, n\}$

$\models_{tot} \{q \wedge \neg B\}\ S_i\ \{q \wedge \neg B\}$.

Then

$\models \{p\}\ S\ \{q\}$ iff $\models \{p\}\ T\ \{q\}$

and analogously for $\models_{tot}$ and $\models_{fair}$.

Proof. See the appendix. □

The Parallelization Theorem transforms do-loops with identical guards into parallel programs of a very restricted format. In particular, components that are while-loops consisting only of a single atomic region are rare in practice. To enhance the usefulness of the Parallelization Theorem we shall combine its application with two additional transformations of parallel programs which introduce more points of interference. These transformations are inspired by Lipton [1975].

We say that two programs are disjoint if none of the variables which can be changed by one of them appears in the other. We say that a Boolean expression B is disjoint from a program S if none of the variables which can be changed by S appears in B.

The next transformation reduces the size of atomic regions.

Theorem 2 (Atomicity) Consider a parallel program $S \equiv S_0;\ [S_1 \| \ldots \| S_n]$. Let $T$ result from $S$ by replacing in one of its components, say $S_i$ with $i > 0$, either

• an atomic region $\langle R_1; R_2 \rangle$, where one of the $R_l$'s ($l \in \{1, 2\}$) is disjoint from all components $S_j$ with $j \neq i$, by

$\langle R_1 \rangle;\ \langle R_2 \rangle$

or

• an atomic region $\langle \mathbf{if}\ B\ \mathbf{then}\ R_1\ \mathbf{else}\ R_2\ \mathbf{fi} \rangle$, where $B$ is disjoint from all components $S_j$ with $j \neq i$, by

$\mathbf{if}\ B\ \mathbf{then}\ \langle R_1 \rangle\ \mathbf{else}\ \langle R_2 \rangle\ \mathbf{fi}$.

Then the programs $S$ and $T$ have the same semantics, i.e.,

$\mathcal{M}[S] = \mathcal{M}[T]$,

and analogously for $\mathcal{M}_{tot}$ and $\mathcal{M}_{fair}$.

Proof. See the appendix. □

Corollary 3 (Atomicity) Under the assumptions of the Atomicity Theorem, for all assertions $p$ and $q$

$\models \{p\}\ S\ \{q\}$ iff $\models \{p\}\ T\ \{q\}$

and analogously for $\models_{tot}$ and $\models_{fair}$. □

The Atomicity Theorem describes a simple but very useful transformation on parallel programs. The given program S has a coarser grain of atomicity than T: it has fewer points for possible interference among its components and thus admits fewer computations. Therefore S is easier to prove correct than T, either directly by using a proof system for proving correctness of parallel programs or, if possible, by using the Parallelization Theorem. On the other hand, the resulting program T has a finer grain of atomicity and is thus more realistic than S.
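The effect of the disjointness condition can be observed on tiny loop free examples by enumerating all interleavings. The following sketch is an illustration, not the proof technique of the paper: atomic steps are modelled as Python functions on a state dictionary, and the example statements and all helper names (`interleavings`, `final_states`, `seq`) are assumptions introduced here. It compares the sets of final states before and after splitting an atomic region, once where the split-off part is disjoint from the other component and once where it is not.

```python
def interleavings(components):
    """Yield every interleaving of the given lists of atomic steps."""
    if not any(components):
        yield []
        return
    for i, comp in enumerate(components):
        if comp:
            rest = components[:i] + [comp[1:]] + components[i + 1:]
            for tail in interleavings(rest):
                yield [comp[0]] + tail


def final_states(components, initial):
    """Collect the final states reached over all interleavings."""
    results = set()
    for schedule in interleavings(components):
        state = dict(initial)
        for step in schedule:
            step(state)
        results.add(tuple(sorted(state.items())))
    return results


def seq(*steps):
    """Fuse several steps into one atomic region <R1; R2; ...>."""
    def region(state):
        for step in steps:
            step(state)
    return region


def inc_a(s): s["a"] += 1             # a := a + 1   (a is not used by the other component)
def add_a(s): s["x"] += s["a"]        # x := x + a
def read_x(s): s["t"] = s["x"]        # t := x       (reads x, which the other component changes)
def write_x(s): s["x"] = s["t"] + 1   # x := t + 1
def double(s): s["x"] *= 2            # the other component: x := x * 2


init = {"a": 0, "t": 0, "x": 1}

# Case 1: R1 = (a := a + 1) is disjoint from the other component.
coarse_disjoint = final_states([[seq(inc_a, add_a)], [double]], init)
fine_disjoint = final_states([[inc_a, add_a], [double]], init)

# Case 2: neither part of <t := x; x := t + 1> is disjoint from the other component.
coarse_shared = final_states([[seq(read_x, write_x)], [double]], init)
fine_shared = final_states([[read_x, write_x], [double]], init)
```

In the first case the two sets of final states coincide, as the theorem predicts. In the second case the finer program reaches an additional final state, so the disjointness assumption cannot simply be dropped.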

The third transformation moves initializations inside the parallel composition.

Theorem 4 (Initialization) Consider a parallel program of the form

$S \equiv S_0;\ R_0;\ [S_1 \| \ldots \| S_n]$.

Suppose that for some index $i \in \{1, \ldots, n\}$ the initialization part $R_0$ is disjoint from all component programs $S_j$ with $j \neq i$. Then the program

$T \equiv S_0;\ [S_1 \| \ldots \| R_0; S_i \| \ldots \| S_n]$

has the same semantics as $S$, i.e.

$\mathcal{M}[S] = \mathcal{M}[T]$,

and analogously for $\mathcal{M}_{tot}$ and $\mathcal{M}_{fair}$.

Proof. See the appendix. □

Corollary 5 (Initialization) Under the assumptions of the Initialization Theorem, for all assertions $p$ and $q$

$\models \{p\}\ S\ \{q\}$ iff $\models \{p\}\ T\ \{q\}$

and analogously for $\models_{tot}$ and $\models_{fair}$. □

Again, the given program S admits fewer computations and is easier to prove correct whereas the transformed program T has more points for possible interference.

To reason about fair total correctness of nondeterministic programs, we use a program transformation, originally proposed in Apt and Olderog [1983], which reduces this notion of correctness to ordinary total correctness. This transformation embeds into a given nondeterministic program an abstract scheduler that implements the fairness policy. This scheduler initializes, reads and updates private variables by using random assignments of the form

$z := ?$

which assign an arbitrary non-negative integer to an integer variable $z$.

Theorem 6 (Fairness) Consider a one level nondeterministic program

$S \equiv S_0;\ \mathbf{do}\ \square_{i=1}^{n}\ B_i \rightarrow S_i\ \mathbf{od}$.

Let $T$ be obtained from $S$ as follows:

$T \equiv INIT;\ S_0;\ \mathbf{do}\ \square_{i=1}^{n}\ B_i \wedge SCH_i \rightarrow UPDATE_i;\ S_i\ \mathbf{od}$

where for variables $z_1, \ldots, z_n$ not occurring in $S$

$INIT \equiv z_1 := ?;\ \ldots;\ z_n := ?$,

$SCH_i \equiv z_i = \min \{ z_k \mid k \in \{1, \ldots, n\} \text{ and } B_k \}$,

$UPDATE_i \equiv z_i := ?;\ \mathbf{for\ all}\ j \in \{1, \ldots, n\} - \{i\}\ \mathbf{do}\ \mathbf{if}\ B_j\ \mathbf{then}\ z_j := z_j - 1\ \mathbf{fi}\ \mathbf{od}$.

Then

$\mathcal{M}_{fair}[S] = \mathcal{M}_{tot}[T] \bmod \{z_1, \ldots, z_n\}$,

where the mod-notation means that the final states agree modulo $\{z_1, \ldots, z_n\}$, i.e. on all variables except $z_1, \ldots, z_n$.

Proof. See Apt and Olderog [1983]. □

Corollary 7 (Fairness) Under the assumptions of the Fairness Theorem, for all assertions $p$ and $q$ which do not contain the variables $z_1, \ldots, z_n$

$\models_{fair} \{p\}\ S\ \{q\}$ iff $\models_{tot} \{p\}\ T\ \{q\}$. □
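The scheduler built into T can be simulated directly. The sketch below is illustrative and makes several assumptions that are not in the source: the function name `run_with_scheduler`, the representation of guards and bodies as Python functions, and above all the use of a bounded `randint(0, bound)` in place of the unbounded random assignment z := ?. The example loop, do b → skip □ b → b := false od, is a hypothetical one chosen because it can loop forever without fairness.

```python
import random


def run_with_scheduler(guards, bodies, state, rng, bound=5, max_steps=10_000):
    """Simulate do [] B_i and SCH_i -> UPDATE_i; S_i od; return the number of iterations."""
    n = len(guards)
    z = [rng.randint(0, bound) for _ in range(n)]        # INIT: z_i := ? (bounded here)
    for step in range(max_steps):
        enabled = [k for k in range(n) if guards[k](state)]
        if not enabled:
            return step                                   # all guards false: the loop exits
        smallest = min(z[k] for k in enabled)
        # SCH_i: z_i is minimal among the enabled branches; ties are resolved arbitrarily.
        i = rng.choice([k for k in enabled if z[k] == smallest])
        # UPDATE_i: z_i := ?, and z_j := z_j - 1 for every other enabled branch j.
        z[i] = rng.randint(0, bound)
        for j in enabled:
            if j != i:
                z[j] -= 1
        bodies[i](state)                                  # S_i
    raise RuntimeError("step limit reached")


def skip(state):
    pass


def clear_b(state):
    state["b"] = False


example = {"b": True}
guards = [lambda s: s["b"], lambda s: s["b"]]
iterations = run_with_scheduler(guards, [skip, clear_b], example, random.Random(0))
```

Each time the first branch is chosen, the priority variable of the second branch decreases, so the second branch becomes the unique minimum after a bounded number of iterations. With `bound=5` the loop in this example exits after at most `bound + 2` iterations, whatever the random choices are.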

4 Asynchronous fixed point computation

As a first application of the Parallelization Theorem let us consider the problem of asynchronous fixed point computation studied in Apt and Olderog [1983]. We considered there a monotonic operator $F : L^n \rightarrow L^n$ on the n-fold product of a complete lattice $L$ with the finite chain property (no infinite strictly growing sequence exists). We proved that under the assumption of fairness the nondeterministic program

$S \equiv \mathbf{do}\ \square_{i=1}^{n}\ \bar{x} \neq F(\bar{x}) \rightarrow x_i := F_i(\bar{x})\ \mathbf{od}$

computes the least fixed point $\mu F$ of $F$:

$\models_{fair} \{\bar{x} = \bot\}\ S\ \{\bar{x} = \mu F\}$.

Here $F_i$ stands for the i-th component function $F_i : L^n \rightarrow L$ of $F$ defined by

$F_i(x_1, \ldots, x_n) = y_i$ if $F(x_1, \ldots, x_n) = (y_1, \ldots, y_n)$,

$\bar{x}$ abbreviates $(x_1, \ldots, x_n)$ and $\bot$ denotes the least element in $L^n$.

Now we wish to parallelize S. To this end, we check the condition of the Parallelization Theorem, i.e. whether

$\models_{tot} \{\bar{x} = \mu F \wedge \bar{x} = F(\bar{x})\}\ x_i := F_i(\bar{x})\ \{\bar{x} = \mu F \wedge \bar{x} = F(\bar{x})\}$ (1)

for all $i \in \{1, \ldots, n\}$. By the definition of $F_i$, the precondition $\bar{x} = F(\bar{x})$ implies that for all $i \in \{1, \ldots, n\}$

$x_i = F_i(\bar{x})$.

Hence for all $i \in \{1, \ldots, n\}$ the value of $x_i$ remains unchanged under the assignment $x_i := F_i(\bar{x})$. Thus (1) holds and the Parallelization Theorem yields that under the assumption of fairness the parallel program

$T \equiv [\|_{i=1}^{n}\ \mathbf{while}\ \bar{x} \neq F(\bar{x})\ \mathbf{do}\ x_i := F_i(\bar{x})\ \mathbf{od}]$

also computes the least fixed point of $F$:

$\models_{fair} \{\bar{x} = \bot\}\ T\ \{\bar{x} = \mu F\}$.
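A concrete instance may help to see the asynchronous computation at work. The sketch below simulates the nondeterministic program S, not the parallel program T, and it rests on assumptions introduced here: the lattice is L = {False, True}, the operator `F` encodes reachability from node 0 in a small hypothetical graph `EDGES`, and fairness is obtained by visiting the components in a freshly shuffled order in every round.

```python
import random

EDGES = [(0, 1), (1, 2), (3, 4)]      # hypothetical graph on the nodes 0..4


def F(x):
    """Monotone operator on L^5: node i is marked if it is node 0 or has a marked predecessor."""
    return tuple(
        i == 0 or any(x[j] for j, k in EDGES if k == i)
        for i in range(len(x))
    )


def async_fixed_point(op, bottom, rng):
    """Repeat x_i := F_i(x) for fairly chosen i while x != F(x)."""
    x = tuple(bottom)
    while True:
        order = list(range(len(x)))
        rng.shuffle(order)             # every component is activated once per round
        for i in order:
            fx = op(x)
            if x == fx:                # guard x != F(x) is false: terminate
                return x
            x = x[:i] + (fx[i],) + x[i + 1:]   # x_i := F_i(x)


least = async_fixed_point(F, (False,) * 5, random.Random(1))
```

For this graph the result marks exactly the nodes 0, 1 and 2, which are the nodes reachable from node 0, independently of the order in which the components are updated.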

5 Parallel zero search

The next example illustrates how all four transformations can be combined to verify a parallel program. We prove that under the assumption of fairness the parallel program

$S \equiv found := \mathbf{false};\ [S_1 \| S_2]$

with

$S_1 \equiv x := 0;\ \mathbf{while}\ \neg found\ \mathbf{do}\ x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}\ \mathbf{od}$

and

$S_2 \equiv y := 1;\ \mathbf{while}\ \neg found\ \mathbf{do}\ y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}\ \mathbf{od}$

finds a zero of the function $f$ provided such a zero exists:

$\models_{fair} \{\exists u : f(u) = 0\}\ S\ \{f(x) = 0 \vee f(y) = 0\}$. (2)

We proceed in 5 steps.
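The role of the fairness assumption in (2) can be seen by simulating S under different schedulers. In the sketch below each component is a Python generator in which the code between two `yield` statements stands for one atomic step. Everything beyond the program text of S is an assumption made for illustration: the names `searcher`, `run`, `round_robin` and `always_last`, the step budget, and the hypothetical function `f` with its only zero at 3.

```python
def searcher(state, var, start, delta, f):
    """var := start; while not found do var := var + delta; if f(var) = 0 then found := true fi od"""
    state[var] = start                       # var := start
    yield
    while True:
        if state["found"]:                   # evaluate the guard "not found"
            return
        yield
        state[var] += delta                  # var := var + delta
        yield
        zero_here = f(state[var]) == 0       # evaluate the test f(var) = 0
        yield
        if zero_here:
            state["found"] = True            # found := true
            yield


def run(state, components, choose, max_steps=10_000):
    """Interleave atomic steps; `choose(step, live)` selects the component that moves next."""
    live = list(components)
    for step in range(max_steps):
        if not live:
            return True                      # every component has terminated
        comp = choose(step, live)
        try:
            next(comp)
        except StopIteration:
            live.remove(comp)
    return False                             # budget exhausted: no termination observed


def f(v):
    return v - 3                             # hypothetical f with a single zero, at 3


def round_robin(step, live):
    return live[step % len(live)]            # fair: every live component keeps getting turns


def always_last(step, live):
    return live[-1]                          # unfair: the first component is never activated


fair_state = {"found": False}                # found := false
fair_ok = run(
    fair_state,
    [searcher(fair_state, "x", 0, 1, f), searcher(fair_state, "y", 1, -1, f)],
    round_robin,
)

unfair_state = {"found": False}
unfair_ok = run(
    unfair_state,
    [searcher(unfair_state, "x", 0, 1, f), searcher(unfair_state, "y", 1, -1, f)],
    always_last,
    max_steps=1000,
)
```

Under the round-robin scheduler both components terminate and the postcondition of (2) holds. Under the unfair scheduler only the downward search runs, which never meets the zero at 3, so the simulation exhausts its step budget.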

Step 1. Simplifying the program

We first use the Atomicity Corollary and Initialization Corollary and reduce the original problem (2) to the following claim

$\models_{fair} \{\exists u : f(u) = 0\}\ T\ \{f(x) = 0 \vee f(y) = 0\}$ (3)

where

$T \equiv found := \mathbf{false};\ x := 0;\ y := 1;\ [T_1 \| T_2]$

with

$T_1 \equiv \mathbf{while}\ \neg found\ \mathbf{do}\ \langle x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi} \rangle\ \mathbf{od}$

and

$T_2 \equiv \mathbf{while}\ \neg found\ \mathbf{do}\ \langle y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi} \rangle\ \mathbf{od}$.

Both corollaries are applicable here by virtue of the fact that $x$ does not appear in $S_2$ and $y$ does not appear in $S_1$. Recall that by assumption assignments and the skip statement are considered to be atomic regions.

Step 2. Decomposing fair total correctness

To prove (3) we use the fact that fair total correctness can be decomposed into fair termination and partial correctness. More precisely we use the following observation.

Lemma 8 For all nondeterministic or parallel programs $R$ and all assertions $p$ and $q$

$\models_{fair} \{p\}\ R\ \{q\}$ iff $\models_{fair} \{p\}\ R\ \{\mathbf{true}\}$ and $\models \{p\}\ R\ \{q\}$.

Proof. By the definition of fair total correctness and partial correctness. □

Thus to prove (3) it suffices to prove

$\models_{fair} \{\exists u : f(u) = 0\}\ T\ \{\mathbf{true}\}$ (4)

and

$\models \{\exists u : f(u) = 0\}\ T\ \{f(x) = 0 \vee f(y) = 0\}$. (5)

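Step 3. Reduction to nondeterminism

To prove (4) we use the Parallelization Theorem. Consider the following nondeterministic program

$T' \equiv found := \mathbf{false};\ x := 0;\ y := 1;$
$\quad \mathbf{do}\ \neg found \rightarrow x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}$
$\quad \square\ \neg found \rightarrow y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}$
$\quad \mathbf{od}$.

Clearly

$\models_{tot} \{\mathbf{true} \wedge found\}\ x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}\ \{\mathbf{true} \wedge found\}$

and

$\models_{tot} \{\mathbf{true} \wedge found\}\ y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ found := \mathbf{true}\ \mathbf{fi}\ \{\mathbf{true} \wedge found\}$.

Thus by the Parallelization Theorem, to prove (4) it suffices to prove

$\models_{fair} \{\exists u : f(u) = 0\}\ T'\ \{\mathbf{true}\}$. (6)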

Step 4. Proving fair termination

To prove (6) we use a proof rule for fair total correctness of one level nondeterministic programs, introduced in Apt and Olderog [1983]. This rule is obtained from the Fairness Corollary 7 by absorbing, as it were, the scheduler parts $INIT$, $SCH_i$ and $UPDATE_i$ referring to the scheduling variables $z_1, \ldots, z_n$ of the transformed program into the pre- and postconditions.

For the case of the identical loop guards this proof rule reads as follows:

FAIR LOOP RULE

$$\frac{\begin{array}{ll}
\text{(i)} & \{p \land B\}\ S_i\ \{p\},\quad i \in \{1, \ldots, n\}, \\
\text{(ii)} & \{p \land B \land \bar{z} \geq 0 \land \exists z_i \geq 0 : t[z_j + 1/z_j]_{j \neq i} = \alpha\}\ S_i\ \{t < \alpha\},\quad i \in \{1, \ldots, n\}, \\
\text{(iii)} & p \land \bar{z} \geq 0 \to t \in W
\end{array}}{\{p\}\ \mathbf{do}\ \square_{i=1}^{n}\ B \to S_i\ \mathbf{od}\ \{p \land \neg B\}}$$

where

• $t$ is an expression which takes values in a partial order $(P, <)$ that is well-founded on the subset $W \subseteq P$,

• $z_1, \ldots, z_n$ are integer variables that may occur freely in $t$, but not in $p$, $B_i$ or $S_i$, for $i \in \{1, \ldots, n\}$,

• $t[z_j + 1/z_j]_{j \neq i}$ denotes the expression that results from $t$ by substituting for every occurrence of $z_j$ in $t$ the expression $z_j + 1$; here $j$ ranges over the set $\{1, \ldots, n\} - \{i\}$,

• $\bar{z} \geq 0$ abbreviates $z_1 \geq 0 \land \ldots \land z_n \geq 0$,

• $\alpha$ is a simple variable ranging over $P$ and not occurring in $p$, $t$, $B_i$ or $S_i$, for $i \in \{1, \ldots, n\}$; its purpose is to freeze the value of $t[z_j + 1/z_j]_{j \neq i}$ before the execution of $S_i$.

Note that with the precondition of premise (ii) simplified to

$$p \land B \land t = \alpha$$

and premise (iii) simplified to

$$p \to t \in W,$$

we obtain the usual rule for total correctness of nondeterministic do-loops. The above usage of the variables $z_1, \ldots, z_n$ in the premises allows us to establish fair total correctness.

We call $p$ the invariant of the loop and $t$ the bound function of the loop. In the proof outlines we denote them by inv : $p$ and bd : $t$, respectively.

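We use the above rule to first prove a weaker fair termination result than (6), viz. where $f$ has a zero $u > 0$:

$$\models_{fair} \{f(u) = 0 \land u > 0\}\ T'\ \{\mathbf{true}\}. \tag{7}$$

A proof outline for (7) has the following structure:

$$\begin{array}{l}
\{f(u) = 0 \land u > 0\} \\
\mathit{found} := \mathbf{false};\ x := 0;\ y := 1; \\
\{f(u) = 0 \land u > 0 \land \neg\mathit{found} \land x = 0 \land y = 1\} \\
\{\mathbf{inv} : p\}\ \{\mathbf{bd} : t\} \\
\mathbf{do}\ \neg\mathit{found} \to \{p \land \neg\mathit{found}\} \\
\qquad x := x + 1; \\
\qquad \mathbf{if}\ f(x) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \\
\qquad \{p\} \\
\square\ \ \neg\mathit{found} \to \{p \land \neg\mathit{found}\} \\
\qquad y := y - 1; \\
\qquad \mathbf{if}\ f(y) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \\
\qquad \{p\} \\
\mathbf{od} \\
\{p \land \mathit{found}\} \\
\{\mathbf{true}\}.
\end{array}$$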

It remains to find a loop invariant p and a bound function t that will complete this outline.

Since the variable $u$ is left unchanged by the program $T'$, certainly

$$f(u) = 0 \land u > 0$$

is an invariant. But for the completion of the proof outline we need a stronger invariant relating $u$ with the program variables $x$ and $\mathit{found}$. We take as an overall invariant

$$p \equiv f(u) = 0 \land u > 0 \land x \leq u.$$

The implications

$$f(u) = 0 \land u > 0 \land \neg\mathit{found} \land x = 0 \land y = 1 \to p$$

and

$$p \land \mathit{found} \to \mathbf{true}$$

are obviously true and thus confirm the proof outline as given outside the do-loop.

To check the proof outline inside the loop, we take as partial order the set

$$P = \mathbb{Z} \times \mathbb{Z},$$

ordered lexicographically by $<_{lex}$ and well-founded on the subset

$$W = \mathbb{N}_0 \times \mathbb{N}_0,$$

where $\mathbb{Z}$ denotes the set of integers and $\mathbb{N}_0$ the set of natural numbers. As a bound function we take

$$t \equiv \langle u - x, z_1 \rangle.$$

In $t$ the scheduling variable $z_1$ counts the number of executions of the second loop component before the next switch to the first one, and $u - x$, the distance between the current test value $x$ and the zero $u$, counts the remaining number of executions of the first loop component.

We show now that our choices of p and t complete the overall proof outline as given inside the do-loop. To this end, we have to prove the premises of the Fair Loop Rule.

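We do this for the second premise. For the first loop component we have the proof outline:

$$\begin{array}{l}
\{\neg\mathit{found} \land f(u) = 0 \land u > 0 \land x \leq u \land z_1 \geq 0 \land z_2 \geq 0 \land \exists z_1 \geq 0 : \langle u - x, z_1 \rangle = \alpha\} \\
\{\exists z_1 \geq 0 : \langle u - x, z_1 \rangle = \alpha\} \\
\{\langle u - x - 1, z_1 \rangle <_{lex} \alpha\} \\
x := x + 1; \\
\{\langle u - x, z_1 \rangle <_{lex} \alpha\} \\
\mathit{found} := (f(x) = 0) \\
\{\langle u - x, z_1 \rangle <_{lex} \alpha\} \\
\{t <_{lex} \alpha\}.
\end{array}$$

Thus the bound function $t$ drops below $\alpha$ because the program variable $x$ is incremented into the direction of the zero $u$.

For the second loop component we have the proof outline:

$$\begin{array}{l}
\{\neg\mathit{found} \land f(u) = 0 \land u > 0 \land x \leq u \land z_1 \geq 0 \land z_2 \geq 0 \land \langle u - x, z_1 + 1 \rangle = \alpha\} \\
\{\langle u - x, z_1 \rangle <_{lex} \alpha\} \\
y := y - 1;\ \mathit{found} := (f(y) = 0) \\
\{\langle u - x, z_1 \rangle <_{lex} \alpha\}.
\end{array}$$

Notice that only with the help of the scheduling variable $z_1$ can we prove that the bound function $t$ drops here below $\alpha$; the assignments to the program variables $y$ and $\mathit{found}$ do not affect $t$ at all.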

The remaining two premises can be easily established. This completes the proof of (7).
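The role of the bound function can be made concrete with a small runtime check. The Python sketch below is only an illustration under stated assumptions: it models the scheduling variable $z_1$ as a budget that is reset to an arbitrary non-negative value whenever the first component runs and decremented whenever the second one runs, with the second component allowed only while $z_1 > 0$. This is a simplified stand-in for the scheduler of the transformational approach, not a reproduction of it, and the names `check_bound`, `max_budget` and `seed` are hypothetical.

```python
import random

def check_bound(f, u, seed=0, max_budget=5):
    """Run T' under a simplified fair scheduler and check that
    t = (u - x, z1) decreases lexicographically in every iteration.

    Assumes f(u) == 0 and u > 0. Returns the list of values of t
    observed after each iteration.
    """
    assert f(u) == 0 and u > 0
    rng = random.Random(seed)
    found, x, y = False, 0, 1
    z1 = rng.randint(0, max_budget)      # hypothetical scheduling budget
    history = []
    while not found:
        assert x <= u                    # invariant p
        t_before = (u - x, z1)
        if z1 == 0 or rng.random() < 0.5:
            x += 1                       # first loop component
            found = f(x) == 0
            z1 = rng.randint(0, max_budget)
        else:
            y -= 1                       # second loop component, needs z1 > 0
            found = f(y) == 0
            z1 -= 1
        t_after = (u - x, z1)
        assert t_after < t_before        # Python compares tuples lexicographically
        assert t_after[0] >= 0 and t_after[1] >= 0   # t stays inside W
        history.append(t_after)
    return history

trace = check_bound(lambda v: (v - 3) * (v + 10), u=3)
```

The first component decreases $u - x$, so the new value of $z_1$ is irrelevant; the second component leaves $u - x$ unchanged and the decrease comes from $z_1$ alone, which matches the remark made about the second proof outline.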

Symmetrically we can deal with the case when $f$ has a zero $u \leq 0$:

$$\models_{fair} \{f(u) = 0 \land u \leq 0\}\ T'\ \{\mathbf{true}\}.$$

Combining this with (7) by standard rules of Hoare's logic yields (6).

Step 5. Proving partial correctness

It remains to prove (5). To this end, we use the approach of Owicki and Gries [1976] and Lamport [1977]. First we need to construct interference free proof outlines for partial correctness of the component programs $T_1$ and $T_2$ of $T$.

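For $T_1$ we use the invariant

$$\begin{array}{rll}
p_1 \equiv & x \geq 0 & (8) \\
\land & (\mathit{found} \to (x > 0 \land f(x) = 0) \lor (y \leq 0 \land f(y) = 0)) & (9) \\
\land & (\neg\mathit{found} \land x > 0 \to f(x) \neq 0) & (10)
\end{array}$$

to construct the proof outline

$$\begin{array}{ll}
\{\mathbf{inv} : p_1\} & \\
\mathbf{while}\ \neg\mathit{found}\ \mathbf{do} & \\
\quad \{x \geq 0 \land (\mathit{found} \to y \leq 0 \land f(y) = 0) \land (x > 0 \to f(x) \neq 0)\} & (11) \\
\quad \langle x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \rangle & \\
\mathbf{od} & \\
\{p_1 \land \mathit{found}\}. &
\end{array}$$

Similarly, for $T_2$ we use the invariant

$$\begin{array}{rll}
p_2 \equiv & y \leq 1 & (12) \\
\land & (\mathit{found} \to (x > 0 \land f(x) = 0) \lor (y \leq 0 \land f(y) = 0)) & (13) \\
\land & (\neg\mathit{found} \land y \leq 0 \to f(y) \neq 0) & (14)
\end{array}$$

to construct the proof outline

$$\begin{array}{l}
\{\mathbf{inv} : p_2\} \\
\mathbf{while}\ \neg\mathit{found}\ \mathbf{do} \\
\quad \{y \leq 1 \land (\mathit{found} \to x > 0 \land f(x) = 0) \land (y \leq 0 \to f(y) \neq 0)\} \\
\quad \langle y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \rangle \\
\mathbf{od} \\
\{p_2 \land \mathit{found}\}.
\end{array}$$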

The intuition behind the invariants $p_1$ and $p_2$ is as follows. Conjuncts (8) and (12) state the range of values that the variables $x$ and $y$ may assume during the execution of the loops $T_1$ and $T_2$.

Thanks to the initialization of $x$ with 0 and $y$ with 1 in $T$, the condition $x > 0$ expresses the fact that the loop $T_1$ has been traversed at least once, and similarly the condition $y \leq 0$ expresses the fact that the loop $T_2$ has been traversed at least once. Thus the conjuncts (9) and (13) in the invariants $p_1$ and $p_2$ state that if the variable $\mathit{found}$ is true, then the loop $T_1$ has been traversed at least once and a zero $x$ of $f$ has been found, or that the loop $T_2$ has been traversed at least once and a zero $y$ of $f$ has been found.

The conjunct (10) in $p_1$ states that if the variable $\mathit{found}$ is false and the loop $T_1$ has been traversed at least once, then $x$ is not a zero of $f$. Analogously for the conjunct (14) in $p_2$.

Let us discuss now the proof outlines. In the first proof outline the most complicated assertion is (11). Note that

$$p_1 \land \neg\mathit{found} \to (11)$$

as required by the definition of a proof outline. Given (11) as a precondition, the loop body in $T_1$ establishes $p_1$ as a postcondition, as required. Notice that the conjunct

$$\mathit{found} \to y \leq 0 \land f(y) = 0$$

in the precondition (11) is necessary to establish the conjunct (9) in the invariant $p_1$.

Next we deal with the interference freedom of the above proof outlines. In total 6 correctness formulas have to be proved, 3 for each component, pairwise symmetric.

The most difficult case is the interference freedom of the assertion (11) in the proof outline for $T_1$ with the loop body in $T_2$. It is proved by the following proof outline:

$$\begin{array}{l}
\{x \geq 0 \land (\mathit{found} \to y \leq 0 \land f(y) = 0) \land (x > 0 \to f(x) \neq 0) \\
\ \ \land\ y \leq 1 \land (\mathit{found} \to x > 0 \land f(x) = 0) \land (y \leq 0 \to f(y) \neq 0)\} \\
\{x \geq 0 \land y \leq 1 \land \neg\mathit{found} \land (x > 0 \to f(x) \neq 0)\} \\
\langle y := y - 1;\ \mathbf{if}\ f(y) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \rangle \\
\{x \geq 0 \land (\mathit{found} \to y \leq 0 \land f(y) = 0) \land (x > 0 \to f(x) \neq 0)\}.
\end{array}$$

Note that the first assertion in the above proof outline indeed implies $\neg\mathit{found}$:

$$(\mathit{found} \to (x > 0 \land f(x) = 0)) \land (x > 0 \to f(x) \neq 0)$$

implies

$$\mathit{found} \to (f(x) \neq 0 \land f(x) = 0)$$

implies

$$\neg\mathit{found}.$$

This information is recorded in the second assertion of the proof outline and used to establish the last assertion.

The remaining cases in the interference freedom proof are straightforward and left to the reader.

We now apply the rule of parallel composition and get

$$\{p_1 \land p_2\}\ [T_1 \| T_2]\ \{p_1 \land p_2 \land \mathit{found}\}.$$

From this correctness formula it is straightforward to prove the desired partial correctness result (5).
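The partial correctness claim (5) can also be spot-checked by brute force on a bounded state space. The following Python sketch is an illustration only and no substitute for the proof: it explores every interleaving of $T$, treating the guard test and the atomic loop body of each component as separate steps, and collects the terminal states reachable while $|x|$ and $|y|$ stay within a bound. The function name `explore`, the program-counter labels and the bound are assumptions made for this example.

```python
def explore(f, bound=6):
    """Return all terminal states (x, y, found) of T that are reachable
    through states with |x| <= bound and |y| <= bound."""
    start = ("test", "test", False, 0, 1)   # (pc1, pc2, found, x, y)
    seen, stack, terminal = {start}, [start], set()
    while stack:
        pc1, pc2, found, x, y = stack.pop()
        succs = []
        # transitions of component T1
        if pc1 == "test":
            succs.append(("done" if found else "body", pc2, found, x, y))
        elif pc1 == "body":                  # atomic region <x := x + 1; if ... fi>
            nx = x + 1
            succs.append(("test", pc2, found or f(nx) == 0, nx, y))
        # transitions of component T2
        if pc2 == "test":
            succs.append((pc1, "done" if found else "body", found, x, y))
        elif pc2 == "body":                  # atomic region <y := y - 1; if ... fi>
            ny = y - 1
            succs.append((pc1, "test", found or f(ny) == 0, x, ny))
        if pc1 == "done" and pc2 == "done":
            terminal.add((x, y, found))
        for s in succs:
            if abs(s[3]) <= bound and abs(s[4]) <= bound and s not in seen:
                seen.add(s)
                stack.append(s)
    return terminal

g = lambda v: v * v - 4                      # zeros at 2 and -2
ok = all(g(x) == 0 or g(y) == 0 for x, y, _ in explore(g))
```

For this choice of `g` every terminal state found by the search satisfies the postcondition of (5). A component that has already passed its guard may still execute its body once after the other component has set $\mathit{found}$, which is exactly the situation the conjuncts (9) and (13) are designed to cover.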

This concludes the proof of (2).

Discussion

(i) In the above proof we first simplified (2) to (3) and then decomposed (3) into (4) and (5). Clearly we could have decomposed (2) in an analogous way. But this would lead to a much more complicated proof of partial correctness because $S$ contains more interference points than $T$. In particular, to deal with the initialization $x := 0$ and $y := 1$ within the parallel composition in $S$ requires the use of auxiliary variables.

This shows that the Atomicity and Initialization Theorems simplify the task of proving parallel programs correct.

(ii) To prove (4) we used the Parallelization Theorem. It is useful to note that we cannot use it to prove (3) directly. Indeed, to apply it we would have to prove

$$\begin{array}{l}
\{(f(x) = 0 \lor f(y) = 0) \land \mathit{found}\} \\
x := x + 1;\ \mathbf{if}\ f(x) = 0\ \mathbf{then}\ \mathit{found} := \mathbf{true}\ \mathbf{fi} \\
\{(f(x) = 0 \lor f(y) = 0) \land \mathit{found}\}
\end{array}$$

and a similar claim for the second component. However, the above claim does not hold as the assignment $x := x + 1$ can invalidate the assertion $f(x) = 0$.

This shows that the Parallelization Theorem is of limited applicability and has to be used in conjunction with other methods.

(iii) To prove fair termination of $T'$ in (6) or (7) we could have applied the Fairness Corollary 7 to $T'$ and proved ordinary termination of the transformed version of $T'$. However, we preferred to use the Fair Loop Rule presented in Step 4 because it allowed us to reason directly about the original program $T'$. In this way certain parts of the transformation are handled uniformly and a generation of several intermediate assertions (for example dealing with random assignments) is avoided.

Appendix

In this appendix we prove Theorems 1 and 2. As a preparation we define rigorously the program semantics. We use here the operational approach due to Hennessy and Plotkin [1979]. Its basic concept is a configuration, which is simply a pair $< S, \sigma >$ consisting of a program $S$ and a proper state $\sigma$. The semantics is then defined in terms of transitions. Intuitively, a transition

$$< S, \sigma >\ \to\ < R, \tau >$$

means: executing $S$ one step in a proper state $\sigma$ can lead to state $\tau$ with $R$ being the remainder of $S$ still to be executed. To express termination we allow the empty program $E$ inside configurations: $R \equiv E$ in the above transition means that $S$ terminates in $\tau$. We stipulate that $E; S$ and $S; E$ abbreviate to $S$. Also, we identify

$$[E \| \ldots \| E] \equiv E.$$

This expresses the fact that a parallel program terminates iff all its components terminate.

In the following $\sigma, \tau$ stand for proper states, i.e. mappings from variables to values. We write $\sigma(t)$ to denote the value of an expression $t$ in $\sigma$ and $\sigma \models B$ to express that the Boolean expression $B$ evaluates to true in $\sigma$. Further on, $\sigma[\sigma(t)/u]$ is a proper state that agrees with $\sigma$ except for the variable $u$ where its value is $\sigma(t)$. The transition relation $\to$ is defined by induction on the structure of programs. We use the following transition axioms and rules:

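(i) $< \mathbf{skip}, \sigma >\ \to\ < E, \sigma >$,

(ii) $< u := t, \sigma >\ \to\ < E, \sigma[\sigma(t)/u] >$,

(iii) $$\frac{< S_1, \sigma >\ \to\ < S_2, \tau >}{< S_1; S, \sigma >\ \to\ < S_2; S, \tau >}$$

(iv) $< \mathbf{if}\ B\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}, \sigma >\ \to\ < S_1, \sigma >$ where $\sigma \models B$,

(v) $< \mathbf{if}\ B\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2\ \mathbf{fi}, \sigma >\ \to\ < S_2, \sigma >$ where $\sigma \models \neg B$,

(vi) $< \mathbf{do}\ \square_{i=1}^{n}\ B_i \to S_i\ \mathbf{od}, \sigma >\ \to\ < S_i;\ \mathbf{do}\ \square_{i=1}^{n}\ B_i \to S_i\ \mathbf{od}, \sigma >$ where $\sigma \models B_i$ and $i \in \{1, \ldots, n\}$,

(vii) $< \mathbf{do}\ \square_{i=1}^{n}\ B_i \to S_i\ \mathbf{od}, \sigma >\ \to\ < E, \sigma >$ where $\sigma \models \bigwedge_{i=1}^{n} \neg B_i$,

(viii) $$\frac{< S, \sigma >\ \to^*\ < E, \tau >}{< \langle S \rangle, \sigma >\ \to\ < E, \tau >}$$

(ix) $$\frac{< S_i, \sigma >\ \to\ < T_i, \tau >}{< [S_1 \| \ldots \| S_i \| \ldots \| S_n], \sigma >\ \to\ < [S_1 \| \ldots \| T_i \| \ldots \| S_n], \tau >}$$

where $i \in \{1, \ldots, n\}$.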

By definition the transitions for $\mathbf{while}\ B\ \mathbf{do}\ S\ \mathbf{od}$ are as for $\mathbf{do}\ B \to S\ \mathbf{od}$. Rule (viii) formalizes the intuitive meaning of atomic regions by reducing each terminating computation of the "body" $S$ of an atomic region $\langle S \rangle$ to a one step computation of the atomic region. Rule (ix) states that a parallel program $[S_1 \| \ldots \| S_n]$ performs a transition if one of its components performs a transition. Thus concurrency is modelled here by interleaving.

A transition $< S, \sigma >\ \to\ < R, \tau >$ is possible if and only if it can be deduced in the above transition system.

Definition 9 Let $S$ be a parallel or nondeterministic program and $\sigma$ a proper state.

(i) A transition sequence of $S$ starting in $\sigma$ is a finite or infinite sequence of configurations $< S_i, \sigma_i >$ ($i \geq 0$) such that

$$< S, \sigma >\ =\ < S_0, \sigma_0 >\ \to\ < S_1, \sigma_1 >\ \to \ldots \to\ < S_i, \sigma_i >\ \to \ldots$$

(ii) A computation of $S$ starting in $\sigma$ is a transition sequence of $S$ starting in $\sigma$ which cannot be extended.

(iii) A computation of $S$ is terminating in $\tau$ (or terminates in $\tau$) if it is finite and its last configuration is of the form $< E, \tau >$.

(iv) A computation of $S$ is diverging (or diverges) if it is infinite. $S$ can diverge from $\sigma$ if there exists an infinite computation of $S$ starting in $\sigma$.

□

Let $\to^*$ stand for the transitive, reflexive closure of $\to$. We now define three semantics of parallel or nondeterministic programs by putting for a proper state $\sigma$

$$\mathcal{M}[\![S]\!](\sigma) = \{\tau \mid\ < S, \sigma >\ \to^*\ < E, \tau >\},$$

$$\mathcal{M}_{tot}[\![S]\!](\sigma) = \mathcal{M}[\![S]\!](\sigma) \cup \{\bot \mid S \text{ can diverge from } \sigma\},$$

$$\mathcal{M}_{fair}[\![S]\!](\sigma) = \mathcal{M}[\![S]\!](\sigma) \cup \{\bot \mid S \text{ can diverge from } \sigma \text{ by a fair computation}\}.$$

The corresponding notions of partial, total and fair total correctness of programs can be defined as inclusion properties of sets of states. For partial correctness we put

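$$\models \{p\}\ S\ \{q\} \quad\text{iff}\quad \mathcal{M}[\![S]\!]([\![p]\!]) \subseteq [\![q]\!]$$

where $[\![p]\!]$ is the set of all proper states satisfying the assertion $p$ and analogously for $q$. The definitions for $\models_{tot}$ and $\models_{fair}$ refer to $\mathcal{M}_{tot}$ and $\mathcal{M}_{fair}$ instead.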
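The transition system and the semantics $\mathcal{M}$ can be prototyped directly. The Python sketch below is a minimal illustration under several assumptions: programs are encoded as nested tuples (an encoding chosen here, not taken from the paper), only the deterministic while-loop form of the do-loop is covered, expressions and Boolean conditions are Python functions on a state dictionary, and `finals` only terminates when finitely many configurations are reachable. All names (`E`, `seq`, `par`, `step`, `finals`) are hypothetical.

```python
E = None  # the empty program

def seq(s1, s2):
    """Build S1; S2, using the convention that E; S abbreviates to S."""
    return s2 if s1 is E else ("seq", s1, s2)

def par(components):
    """Build [S1 || ... || Sn], identifying [E || ... || E] with E."""
    return E if all(c is E for c in components) else ("par", tuple(components))

def step(prog, state):
    """All configurations reachable from <prog, state> in one transition."""
    kind = prog[0]
    if kind == "skip":                                  # axiom (i)
        return [(E, state)]
    if kind == "assign":                                # axiom (ii)
        _, var, expr = prog
        return [(E, {**state, var: expr(state)})]
    if kind == "seq":                                   # rule (iii)
        _, s1, s2 = prog
        return [(seq(r, s2), t) for r, t in step(s1, state)]
    if kind == "if":                                    # axioms (iv) and (v)
        _, cond, s1, s2 = prog
        return [(s1 if cond(state) else s2, state)]
    if kind == "while":                                 # axioms (vi) and (vii), n = 1
        _, cond, body = prog
        return [(seq(body, prog), state)] if cond(state) else [(E, state)]
    if kind == "atomic":                                # rule (viii)
        return [(E, t) for t in finals(prog[1], state)]
    if kind == "par":                                   # rule (ix): interleaving
        comps = prog[1]
        return [(par(comps[:i] + (r,) + comps[i + 1:]), t)
                for i, c in enumerate(comps) if c is not E
                for r, t in step(c, state)]
    raise ValueError(f"unknown program: {prog!r}")

def finals(prog, state):
    """M[S](state): final states of all terminating computations."""
    seen, stack, out = set(), [(prog, state)], []
    while stack:
        p, s = stack.pop()
        key = (p, tuple(sorted(s.items())))
        if key in seen:
            continue
        seen.add(key)
        if p is E:
            if s not in out:
                out.append(s)
        else:
            stack.extend(step(p, s))
    return out

inc = ("assign", "x", lambda s: s["x"] + 1)
dbl = ("assign", "x", lambda s: s["x"] * 2)
plain = par((seq(inc, inc), dbl))                 # [x := x+1; x := x+1 || x := 2x]
atomic = par((("atomic", seq(inc, inc)), dbl))    # [<x := x+1; x := x+1> || x := 2x]
```

Starting from a state with $x = 1$, `finals(plain, {"x": 1})` yields the final values 4, 5 and 6 for `x`, whereas `finals(atomic, {"x": 1})` yields only 4 and 6: wrapping the two increments in an atomic region removes the interference point between them.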

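Proof of Theorem 1. We proceed in 6 steps.

Step 1 We consider the case when $S$ and $T$ have no initialization part $S_0$ and introduce a subset of computations of $T$. To this end, observe that in an arbitrary finite or infinite transition sequence

$$\xi :\ < T, \sigma >\ =\ < T_1, \sigma_1 >\ \to \ldots \to\ < T_j, \sigma_j >\ \to \ldots$$

of $T$, each transition $< T_j, \sigma_j >\ \to\ < T_{j+1}, \sigma_{j+1} >$ in $\xi$ is of one of the following three types.

It can be a $B_i$-transition passing successfully the loop condition $B$ in the $i$-th component so that

$$T_j \equiv [\ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots] \text{ and } \sigma_j \models B,$$

$$T_{j+1} \equiv [\ldots \| \langle S_i \rangle;\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots] \text{ and } \sigma_{j+1} = \sigma_j;$$

it can be an $S_i$-transition executing the loop body $S_i$ as an atomic action so that

$$T_j \equiv [\ldots \| \langle S_i \rangle;\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots],$$

$$T_{j+1} \equiv [\ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots];$$

or it can be an $E_i$-transition terminating the loop of the $i$-th component so that

$$T_j \equiv [\ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots] \text{ and } \sigma_j \models \neg B,$$

$$T_{j+1} \equiv [\ldots \| E \| \ldots] \text{ and } \sigma_{j+1} = \sigma_j.$$

We say that $\xi$ is delay free if each $B_i$-transition is immediately followed by the corresponding $S_i$-transition.

Note that in a delay free computation of $T$ for each $S_i$-transition $< T_j, \sigma_j >\ \to\ < T_{j+1}, \sigma_{j+1} >$

$$T_j \equiv [\mathbf{while}\ B\ \mathbf{do}\ \langle S_1 \rangle\ \mathbf{od} \| \ldots \| \langle S_i \rangle;\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_n \rangle\ \mathbf{od}]$$

and

$$T_{j+1} \equiv [\mathbf{while}\ B\ \mathbf{do}\ \langle S_1 \rangle\ \mathbf{od} \| \ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_n \rangle\ \mathbf{od}] \equiv T.$$

Also, after an $E_i$-transition only $E_j$-transitions for $j \neq i$ can take place.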

Step 2 To compare the computations of $S$ and $T$, we use the following notion of equivalence. Two computations are called i/o equivalent if they start in the same state and either both diverge or both terminate in the same state.

Step 3 We prove the following two claims:

• every (fair) computation of $S$ is i/o equivalent to a delay free (fair) computation of $T$,

• every delay free (fair) computation of $T$ is i/o equivalent to a (fair) computation of $S$.

First consider a (fair) computation $\xi$ of $S$. We construct an i/o equivalent delay free (fair) computation of $T$ from $\xi$ by replacing

• every loop entry transition

$$< S, \sigma >\ \to\ < S_i; S, \sigma >$$

with the $B_i$-transition

$$< T, \sigma >\ \to\ < [\mathbf{while}\ B\ \mathbf{do}\ \langle S_1 \rangle\ \mathbf{od} \| \ldots \| \langle S_i \rangle;\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_n \rangle\ \mathbf{od}], \sigma >,$$

• every transition subsequence

$$< S_i; S, \sigma >\ \to \ldots \to\ < S, \tau >,$$

forming the stepwise execution of the loop body $S_i$, with the $S_i$-transition

$$< [\mathbf{while}\ B\ \mathbf{do}\ \langle S_1 \rangle\ \mathbf{od} \| \ldots \| \langle S_i \rangle;\ \mathbf{while}\ B\ \mathbf{do}\ \langle S_i \rangle\ \mathbf{od} \| \ldots \| \mathbf{while}\ B\ \mathbf{do}\ \langle S_n \rangle\ \mathbf{od}], \sigma >\ \to\ < T, \tau >,$$

• every loop exit transition

$$< S, \sigma >\ \to\ < E, \sigma >$$

with a sequence of $n$ final $E_i$-transitions, $i \in \{1, \ldots, n\}$, dealing with the state $\sigma$.

Now consider a delay free (fair) computation $\eta$ of $T$. By applying the above replacement operations in reverse direction, we construct an i/o equivalent (fair) computation of $S$ from $\eta$.

Step 4 To compare computations of $T$, we introduce the following variant of i/o equivalence. Two computations are called q-equivalent if they both start in the same state and either both diverge or both terminate in a state satisfying assertion $q$.

Step 5 By a p-computation we mean a computation starting in a state satisfying the assertion $p$. Suppose that every terminating delay free p-computation of $T$ terminates in a state satisfying the assertion $q$. We prove that under this assumption every (fair) p-computation of $T$ is q-equivalent to a delay free p-computation of $T$.

Consider a (fair) computation

$$\xi :\ < T, \sigma >\ =\ < T_1, \sigma_1 >\ \to \ldots \to\ < T_j, \sigma_j >\ \to \ldots$$

of $T$ with $\sigma \models p$.

Case 1 $\forall j \geq 1 : \sigma_j \models B$. Then $\xi$ is infinite. Consider the sequence of all $S_i$-transitions in $\xi$. Then there exists an infinite delay free (fair) p-computation $\eta$ of $T$ which starts in $\sigma$ and has the same sequence of $S_i$-transitions. We can construct $\eta$ by performing the corresponding $B_i$-transitions immediately before the $S_i$-transitions of this sequence. This is possible because in the present case the $B_i$-transitions are everywhere enabled.

Case 2 $\exists j \geq 1 : \sigma_j \models \neg B$. Let $j_0$ be the smallest such $j$. Consider the prefix

$$\xi_0 :\ < T, \sigma >\ =\ < T_1, \sigma_1 >\ \to \ldots \to\ < T_{j_0}, \sigma_{j_0} >$$

of $\xi$. By the choice of $j_0$, the last transition in $\xi_0$ is an $S_i$-transition. We first show that $\sigma_{j_0} \models q$. To this end, we argue in a similar way as above.

Consider the sequence of all $S_i$-transitions in $\xi_0$. Then there exists a finite delay free transition sequence $\eta_0$ of $T$ starting in $\sigma$, running through the same $S_i$-transitions as $\xi_0$, and ending in the configuration $< T, \sigma_{j_0} >$. Note that we indeed obtain here the program $T$ thanks to the observation about $S_i$-transitions in delay free transition sequences stated in Step 1. Since $\sigma_{j_0} \models \neg B$, the only transitions which are possible after $< T, \sigma_{j_0} >$ are $E_i$-transitions, $i \in \{1, \ldots, n\}$. By adding all these transitions, we obtain a delay free p-computation $\eta$ of $T$ terminating in $\sigma_{j_0}$. By the assumption of this step, $\sigma_{j_0} \models q$.

Thus $\sigma_{j_0} \models q \land \neg B$. This information is sufficient to see how the original computation $\xi$ of $T$ continues after the prefix $\xi_0$. In $\xi_0$ there may be some $B_i$-transitions without a corresponding $S_i$-transition. Since by assumption

$$\models \{q \land \neg B\}\ S_i\ \{q \land \neg B\},$$

these remaining $S_i$-transitions all yield states satisfying $q \land \neg B$. Thus these $S_i$-transitions and $n$ final $E_i$-transitions are the only possible transitions in the remainder of $\xi$. Thus also $\xi$ terminates in a state satisfying $q$. Consequently, $\xi$ and the delay free computation $\eta$ are q-equivalent.

Step 6 By combining the results from Step 3 and 5, it is easy to prove the claim of the theorem for the case when $S$ and $T$ have no initialization part $S_0$. The first claim of Step 3 implies the "if"-part. The second claim of Step 3 together with the result of Step 5 imply the "only-if"-part. Indeed, suppose

$$\models \{p\}\ S\ \{q\},$$

i.e. every terminating p-computation of $S$ terminates in a state satisfying $q$. Then by the second claim of Step 3, every terminating delay free p-computation of $T$ terminates in a state satisfying $q$. Thus by the result of Step 5, every terminating p-computation of $T$ terminates in a state satisfying $q$, i.e.

$$\models \{p\}\ T\ \{q\}.$$

Similar arguments deal with $\models_{tot}$ and $\models_{fair}$. The case when $S$ and $T$ have an initialization part $S_0$ is left to the reader. □

Proof of Theorem 2. We treat the case when $S$ has no initialization part $S_0$ and $T$ results from $S$ by splitting $\langle R_1; R_2 \rangle$ into $\langle R_1 \rangle; \langle R_2 \rangle$. Our presentation follows the 6 steps outlined in the previous proof.

Step 1 By an $R_k$-transition, $k \in \{1, 2\}$, we mean a transition occurring in a computation of $T$ which is of the form

$$< [U_1 \| \ldots \| \langle R_k \rangle; U_i \| \ldots \| U_n], \sigma >\ \to\ < [U_1 \| \ldots \| U_i \| \ldots \| U_n], \tau >.$$

We call a fragment $\xi$ of a computation of $T$ good if in $\xi$ each $R_1$-transition is immediately followed by the corresponding $R_2$-transition, and we call $\xi$ almost good if in $\xi$ each $R_1$-transition is eventually followed by the corresponding $R_2$-transition.

Observe that every fair and hence every finite computation of $T$ is almost good.

Step 2 To compare the computations of $S$ and $T$, we use the i/o equivalence introduced in Step 2 of the proof of Theorem 1.

Step 3 We prove the following two claims:

• every (fair) computation of $S$ is i/o equivalent to a good (fair) computation of $T$,

• every good (fair) computation of $T$ is i/o equivalent to a (fair) computation of $S$.

First consider a (fair) computation $\xi$ of $S$. Every program occurring in a configuration of $\xi$ is a parallel composition of $n$ components. Let for such a program $U$ the program $split(U)$ result from $U$ by replacing in the $i$-th component of $U$ every occurrence of $\langle R_1; R_2 \rangle$ by $\langle R_1 \rangle; \langle R_2 \rangle$. For example, $split(S) \equiv T$.

We construct an i/o equivalent good (fair) computation of $T$ from $\xi$ by replacing

• every transition of the form

$$< [U_1 \| \ldots \| \langle R_1; R_2 \rangle; U_i \| \ldots \| U_n], \sigma >\ \to\ < [U_1 \| \ldots \| U_i \| \ldots \| U_n], \tau >$$

with two consecutive transitions

$$\begin{array}{l}
< split([U_1 \| \ldots \| \langle R_1; R_2 \rangle; U_i \| \ldots \| U_n]), \sigma > \\
\quad \to\ < split([U_1 \| \ldots \| \langle R_2 \rangle; U_i \| \ldots \| U_n]), \sigma_1 > \\
\quad \to\ < split([U_1 \| \ldots \| U_i \| \ldots \| U_n]), \tau >
\end{array}$$

where the intermediate state $\sigma_1$ is defined by

$$< \langle R_1 \rangle, \sigma >\ \to\ < E, \sigma_1 >,$$

• every other transition

$$< U, \sigma >\ \to\ < V, \tau >$$

with

$$< split(U), \sigma >\ \to\ < split(V), \tau >.$$

Now consider a good (fair) computation $\eta$ of $T$. By applying the above replacement operations in reverse direction we construct an i/o equivalent (fair) computation of $S$ from $\eta$.

Step 4 For the comparison of computations of $T$ we use i/o equivalence, but to reason about it we also introduce a more discriminating variant of it called "permutation equivalence".

First consider an arbitrary computation $\xi$ of $T$. Every program occurring in a configuration of $\xi$ is the parallel composition of $n$ components. To distinguish between different kinds of transitions in $\xi$, we attach labels to the transition arrow $\to$. We write

$$< U, \sigma >\ \xrightarrow{R_k}\ < V, \tau >$$

if $k \in \{1, 2\}$ and $< U, \sigma >\ \to\ < V, \tau >$ is an $R_k$-transition of the $i$-th component of $U$,

$$< U, \sigma >\ \xrightarrow{i}\ < V, \tau >$$

if $< U, \sigma >\ \to\ < V, \tau >$ is any other transition caused by the activation of the $i$-th component of $U$, and

$$< U, \sigma >\ \xrightarrow{j}\ < V, \tau >$$

if $j \neq i$ and $< U, \sigma >\ \to\ < V, \tau >$ is a transition caused by the activation of the $j$-th component of $U$.

Hence with each transition arrow in a computation of $T$ there is a unique label associated. This enables us to define:

Two computations $\eta$ and $\xi$ of $T$ are permutation equivalent if

• $\eta$ and $\xi$ start in the same state,

• for all states $\sigma$, $\eta$ terminates in $\sigma$ iff $\xi$ terminates in $\sigma$,

• the possibly infinite sequences of labels attached to the transition arrows in $\eta$ and $\xi$ are permutations of each other.

Clearly, permutation equivalence of computations of $T$ implies their i/o equivalence.

Step 5 We prove the following claim: every (fair) computation of T is i/o equivalent to a good (fair) computation of T.

To this end, we establish two simpler claims.

Claim 1 Every (fair) computation of T is i/o equivalent to an almost good (fair) computation of T.

Proof of Claim 1. Consider a computation $\xi$ of $T$ which is not almost good. Then by the observation stated in Step 1, $\xi$ is not fair and hence diverging. More precisely, there exists a suffix $\xi_1$ of $\xi$ which starts in a configuration $< U, \sigma >$ with an $R_1$-transition and then continues with infinitely many transitions not involving the $i$-th component any more, say

$$\xi_1 :\ < U, \sigma >\ \xrightarrow{R_1}\ < U_0, \sigma_0 >\ \xrightarrow{j_1}\ < U_1, \sigma_1 >\ \xrightarrow{j_2} \ldots$$

where $j_k \neq i$ for $k \geq 1$. By the definition of semantics of while-programs we conclude the following: if $R_1$ is disjoint from $S_j$ with $j \neq i$, then there is also an infinite transition sequence of the form

$$\xi_2 :\ < U, \sigma >\ \xrightarrow{j_1}\ < V_1, \tau_1 >\ \xrightarrow{j_2} \ldots$$

and if $R_2$ is disjoint from $S_j$ with $j \neq i$, then there is also an infinite transition sequence of the form

$$\xi_3 :\ < U, \sigma >\ \xrightarrow{R_1}\ < U_0, \sigma_0 >\ \xrightarrow{R_2}\ < V_0, \tau_0 >\ \xrightarrow{j_1}\ < V_1, \tau_1 >\ \xrightarrow{j_2} \ldots$$

We say that $\xi_2$ is obtained from $\xi_1$ by deletion of the initial $R_1$-transition and $\xi_3$ is obtained from $\xi_1$ by insertion of an $R_2$-transition. Replacing the suffix $\xi_1$ of $\xi$ by $\xi_2$ or $\xi_3$ yields an almost good computation of $T$ which is i/o equivalent to $\xi$. □

Claim 2 Every almost good (fair) computation of T is permutation equivalent to a good (fair) computation of T.

Proof of Claim 2. By the definition of semantics of while-programs we conclude the following: if $R_k$ with $k \in \{1, 2\}$ is disjoint from $S_j$ with $j \neq i$, then the relations $\xrightarrow{R_k}$ and $\xrightarrow{j}$ commute, i.e.

$$\xrightarrow{R_k} \circ \xrightarrow{j}\ =\ \xrightarrow{j} \circ \xrightarrow{R_k}$$

where $\circ$ denotes relational composition. Repeated application of this commutativity allows us to permute the transitions of every almost good fragment $\xi_1$ of a computation of $T$ of the form

$$\xi_1 :\ < U, \sigma >\ \xrightarrow{R_1} \circ \xrightarrow{j_1} \circ \ldots \circ \xrightarrow{j_m} \circ \xrightarrow{R_2}\ < V, \tau >$$

with $j_k \neq i$ for $k \in \{1, \ldots, m\}$ into a good order, i.e. into

$$\xi_2 :\ < U, \sigma >\ \xrightarrow{j_1} \circ \ldots \circ \xrightarrow{j_m} \circ \xrightarrow{R_1} \circ \xrightarrow{R_2}\ < V, \tau >$$

or

$$\xi_3 :\ < U, \sigma >\ \xrightarrow{R_1} \circ \xrightarrow{R_2} \circ \xrightarrow{j_1} \circ \ldots \circ \xrightarrow{j_m}\ < V, \tau >$$

depending on whether $R_1$ or $R_2$ is disjoint from $S_j$ with $j \neq i$. Consider now an almost good computation $\xi$ of $T$. We construct from $\xi$ a permutation equivalent good computation $\xi^*$ of $T$ by successively replacing every almost good fragment of $\xi$ of the form $\xi_1$ by a good fragment of the form $\xi_2$ or $\xi_3$.

Note that a computation $\eta$ of $T$ is fair iff there exists a configuration $< U, \sigma >$ such that every sequential component of $U$ has either terminated or is activated infinitely often in the suffix of $\eta$ starting in $< U, \sigma >$. Since this property is preserved by the above construction of a permutation equivalent computation $\xi^*$ from $\xi$, we conclude: if $\xi$ is fair, also $\xi^*$ is fair. □

Claims 1 and 2 together imply the claim of Step 5.
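The commutation property used in the proof of Claim 2 can be illustrated on a toy example. In the Python sketch below, transitions are modelled as deterministic state transformers rather than as relations, which is a simplification, and disjointness is taken to mean that the two steps read and write different variables. The names `inc_x`, `dbl_y`, `dbl_x`, `compose` and `commute` are hypothetical.

```python
def inc_x(state):
    """Plays the role of an R_k-transition: touches only x."""
    return {**state, "x": state["x"] + 1}

def dbl_y(state):
    """Plays the role of a transition of another component: touches only y."""
    return {**state, "y": state["y"] * 2}

def dbl_x(state):
    """A step that is not disjoint from inc_x: it also touches x."""
    return {**state, "x": state["x"] * 2}

def compose(first, second):
    """Sequential composition: apply `first`, then `second`."""
    return lambda s: second(first(s))

def commute(a, b, states):
    """Check on the given states that a followed by b equals b followed by a."""
    return all(compose(a, b)(s) == compose(b, a)(s) for s in states)

states = [{"x": a, "y": b} for a in range(-2, 3) for b in range(-2, 3)]
disjoint_ok = commute(inc_x, dbl_y, states)      # steps on disjoint variables
overlapping_ok = commute(inc_x, dbl_x, states)   # steps sharing the variable x
```

The disjoint pair commutes on every sampled state, while the overlapping pair does not, since $(x + 1) \cdot 2 \neq 2x + 1$. This is why the permutation argument requires $R_1$ or $R_2$ to be disjoint from the other components.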

Step 6 By combining the results of Step 3 and 5, we get the claim of the theorem for the case when $S$ has no initialization part $S_0$ and $T$ results from $S$ by splitting $\langle R_1; R_2 \rangle$ into $\langle R_1 \rangle; \langle R_2 \rangle$. The cases when $S$ has an initialization part $S_0$ and where $T$ results from $S$ by splitting the atomic region $\langle \mathbf{if}\ B\ \mathbf{then}\ R_1\ \mathbf{else}\ R_2\ \mathbf{fi} \rangle$ are left to the reader. □

The proof of the Initialization Theorem follows the same lines as the proof of the Atomicity Theorem and is therefore omitted.

References

[1] E. Ashcroft and Z. Manna, Formalization of properties of parallel programs, Machine Intelligence 6, pp. 17-41, 1971.

[2] K.R. Apt and E.-R. Olderog, Proof rules and transformations dealing with fairness, Science of Computer Programming 3, pp. 65-100, 1983.

[3] R.J.R. Back, A method for refining atomicity in parallel algorithms, Lecture Notes in Computer Science 366, Springer-Verlag, 1989.

[4] M. Chandy and J. Misra, A Foundation of Parallel Program Design, Addison-Wesley, 1988.

[5] E.W. Dijkstra, Guarded commands, nondeterminacy and formal derivation of programs, Communications of the ACM 18, pp. 453-457, 1975.

[6] L. Flon and N. Suzuki, The total correctness of parallel programs, SIAM Journal of Computing, pp. 227-246, 1978.

[7] N. Francez, Fairness, Springer-Verlag, 1986.

[8] M.C.B. Hennessy and G.D. Plotkin, Full abstraction for a simple programming language, Lecture Notes in Computer Science 74, Springer-Verlag, 1979.

[9] L. Lamport, Proving the correctness of multiprocess programs, IEEE Transactions on Software Engineering SE-3:2, pp. 125-143, 1977.

[10] R. Lipton, Reduction: a method of proving properties of parallel programs, Communications of the ACM 18, pp. 717-721, 1975.

[11] E.-R. Olderog and K.R. Apt, Fairness in parallel programs, the transformational approach, ACM TOPLAS 10, pp. 420-455, 1988.

[12] S. Owicki and D. Gries, An axiomatic proof technique for parallel programs, Acta Informatica 6, pp. 319-340, 1976.

Experiences with Combining Formalisms in VVSL

C.A. Middelburg PTT Research, Neher Laboratories

P.O. Box 421, 2260 AK Leidschendam, The Netherlands

Abstract

This paper primarily reports on semantic aspects of how a formal specification of the PCTE interfaces has been achieved in a situation where only a combination of existing formalisms could meet the needs. The motivations for combining a VDM specification language with a language of temporal logic, for translating the resulting language, called VVSL, to an extended COLD-K and for translating it also (partially) to the language of the logic MPL$_\omega$ are briefly outlined. The main experiences from this work on combination and transformation of formalisms are presented. Some important experiences with the application of VVSL to the formal specification of the PCTE interfaces and otherwise are also mentioned.

Keywords & Phrases: formal specification languages, model-oriented specification, pre- and post-conditions, inter-conditions, temporal logic, transformational semantics, logical semantics.

1987 CR Categories: D.2.1, D.2.2, D.3.1, F.3.1, F.3.2, F.4.1


A large software system often needs a precise specification of its intended behaviour. A precise specification provides a reference point against which the correctness of the system concerned can be established, either by verifying it a posteriori or, preferably, by developing it hand in hand with a correctness proof. A precise specification also makes it easier to reason about the system. Moreover, it is possible to reason about the system before its development is undertaken. This possibility opens up a way to increase the confidence that the system will match the inherently informal user's requirements. If a change to an existing software system is contemplated, then the consequences of the change have to be taken into account. But without a precise specification, it is often difficult to grasp the consequences of a change.

In order to achieve precision, a specification must be written in a formal specification language. A formal specification language needs a mathematically precise and complete description of the semantics of the language.

In practice, the creation of a precise specification is sometimes doomed to fail by absence of a formal specification language that meets the needs. In some cases, the problem may be solved by combining several languages. However, it is not sufficient to combine the languages syntactically. In order to achieve a formal specification, they must also be combined semantically. This means that the semantic bases of the languages have to be integrated. This is generally hard, since the bases of many languages are not organized in an orthogonal and elementary way. Besides, they tend to have different mathematical origins. This paper reports on these matters from the experiences with VVSL [Mid89d].

1.1 Background

VVSL is a specification language which combines two other languages both syntactically and semantically. It is the specification language that has been used in the ESPRIT project "VDM for Interfaces of the PCTE" (abbreviated to VIP). This project was concerned with describing in a mathematically precise manner the PCTE interfaces [PCT86], using a VDM specification language as far as possible. The PCTE interfaces have been defined as a result of the ESPRIT project "A Basis for a Portable Common Tool Environment". The PCTE interfaces aim to support the coordination and integration of software engineering tools. They address topics such as an object management system, a common user interface and distribution. The objectives in producing a formal specification of the PCTE interfaces can be summarised as follows:

• to support implementors of PCTE, tool builders using PCTE primitives, etc. by giving them access to a precise description of the interfaces;

• to identify weaknesses in the PCTE interfaces and to suggest improvements;

• to provide a basis for long-term evolution of PCTE.

These objectives provided the main reasons for structuring the specification of the interfaces:

• Unstructured, the specification will be too large to have any chance of being reasonably understandable by its intended 'users'. Division into 'functional units' with well-defined interfaces enhances understandability.

• Weaknesses in the current design should be identified and improvements suggested. Composing the functional units from instantiations of a small number of orthogonal and generic 'underlying semantic units' supports such improvements.

• PCTE is currently rather language (C) and operating-system (UNIX) specific. Evolution away from these influences will improve PCTE. Isolating the language- and operating-system-oriented parts supports such evolution.

For structuring specifications, VVSL has modularization constructs and parameterization constructs. They are very similar to those of the kernel design language COLD-K [Jon89b]. The modularization mechanism permits two modules to have parts of their state in common, including hidden parts. After appraisal of the current trends in modular structuring with respect to the specification of the PCTE interfaces, the structuring features of COLD-K were in high favour with the VIP project. A short survey of the current trends in modular structuring is given in an appendix.

In VDM specification languages, operations may yield results which depend on a state and may change that state. Operations are always regarded as atomic, i.e. they do not interact with some environment during execution. Therefore, intermediate states do not contain essential details about the behaviour of an operation. Only the initial state and final state matter. In the case of the PCTE interfaces, not all operations are as isolated as this. For some operations, termination, final state and/or results partly depend on the interference of concurrently executed operations through a partially shared state. In these cases, intermediate states do contain essential details about the behaviour of the operation concerned. Although it may be considered inelegant to have such details externally visible, many aspects of this kind cannot be regarded as being internal in the case of the PCTE interfaces. Adding a rely- and a guarantee-condition (which can be used to express simple safety properties) to the usual pre- and post-condition pair of operations, as proposed in [Jon83], was found to be inadequate for specifying the PCTE operations. At least some of the additional expressive power that is usually found in languages of temporal logic was considered necessary.

In VVSL, a language for structured VDM specifications is combined with a language of temporal logic in order to support implicit specification of non-atomic operations. The language of temporal logic has been inspired by various temporal logics based on linear and discrete time [LPZ85, HM87, BK85, Fis87]. The design of VVSL aimed at obtaining a well-defined combination that can be considered a VDM specification language with additional syntactic constructs which are only needed in the presence of non-atomic operations and with an appropriate interpretation of both atomic and non-atomic operations which covers the original VDM interpretation.

VVSL without its modularization and parameterization constructs is referred to as flat VVSL. The structuring sublanguage of VVSL consists of the modularization and parameterization constructs complementing flat VVSL.


In the VIP project, VVSL has been provided with a well-defined semantics by defining a translation to COLD-K extended with constructs which are required for translation of the VVSL constructs that are only needed in the presence of non-atomic operations. The report [BM88] contains both the definition of this translation and the definition of the COLD-K extensions. In a follow-up project, VVSL has been provided with a well-defined semantics in another way. In [Mid89c], flat VVSL has been given a logical semantics by defining a translation to the language of the logic MPLω [KR89]. In [Mid90], the structuring sublanguage of VVSL has been given a semantics by defining a translation to the terms of a calculus, which is obtained by putting a variant of lambda calculus, called λπ-calculus [Fei89], on top of a specialization of a general model of specification modules, called Description Algebra [Jon89a]. MPLω, Description Algebra and λπ-calculus are also used for the formal definition of COLD-K in [FJKR87].

1.2 Structure of the Paper

Sections 2, 3 and 4 deal informally with the combination of a VDM specification language with a language of temporal logic in VVSL. Section 2 presents some features of the VDM specification language; only features that are strongly involved in the combination are treated. Section 3 outlines the motivation for combining the two languages and sketches how this is actually done. Section 4 describes the temporal language in some detail.

Sections 5 and 6 introduce the transformations to the extended COLD-K and the language of MPLω. Section 5 outlines the motivation for transforming VVSL to the extended COLD-K and gives as an example the translation to COLD-K for the operation definitions of the VDM specification language that has been incorporated in VVSL. Section 6 outlines the motivation for transforming flat VVSL to the language of MPLω and gives as an example the interpretation in MPLω for the logical expressions of flat VVSL.

Sections 7, 8 and 9 deal with formal aspects of the combination. Section 7 sketches the extensions of COLD-K that are required for transforming full VVSL to a COLD-K-like language. Sections 8 and 9 give as examples the translation to the extended COLD-K for the temporal formulae and the operation definitions of VVSL. For comparison, the interpretation in MPLω is also given.

2 VVSL: the VDM Specification Language

2.1 Connections with other VDM Specification Languages

The major VDM specification languages are presented in [BJ82] (VDM specification language with domain-theoretic semantics) and [Jon86] (VDM specification language with set-theoretic semantics). The latter VDM specification language is closely related to Z [Spi88]. The forthcoming standard VDM specification language BSI/VDM SL [VDM88] unifies the major VDM specification languages. A proposal for the formal semantics of BSI/VDM SL is presented in [Lar89]. In the first version of this proposal, the semantics of the STC VDM Reference Language defined in [Mon85] and the proposal for modularization and parameterization in BSI/VDM SL presented in [Bea88] were taken as the starting point. Inadequacies of the predecessors of this proposal for modularization and parameterization were the main reason to choose something quite different for modularization and parameterization in VVSL. The chosen modularization and parameterization constructs are very similar to those of COLD-K, which is manifest in the translation rules given in [Mid89d]. Meanwhile, modularization has been removed from the proposal for the formal semantics of BSI/VDM SL.

The flat VDM specification language that has been incorporated in VVSL is roughly a restricted version of BSI/VDM SL. It is very similar to the language used in [Jon86]. One can define types, functions working on values of these types, state variables which can take values of these types, and operations which may interrogate and modify the state variables. In the remainder of this section, a short introduction to state variables and (atomic) operations is given. For a more complete presentation, see e.g. [Jon86].


2.2 State Variables and Operations

In the VDM specification language that has been incorporated in VVSL, as in other VDM specification languages, operation is a general name for imperative programs and meaningful parts thereof (e.g. procedures). Unlike functions, operations may yield results which depend on a state and may change that state. The states concerned have a fixed number of named components, called state variables, attached to them. In all states, a value is associated with each of these state variables. Operations change states by modifying the value of state variables. Each state variable can only take values from a fixed type. State variables correspond to programming variables of imperative programs.

State Variables. A state variable is interpreted as a function from states to values that assigns to each state the value taken by the state variable in that state.

A state variable is declared by a variable definition of the following form:

v : t .

It introduces a name for the state variable and defines the type from which the state variable can take values.

A state invariant and an initial condition, of the form

inv E_inv and init E_init,

respectively, can be associated with a collection of variable definitions. The state invariant is a restriction on what values the state variables can take in any state. The initial condition is a restriction on what values the state variables can take initially, i.e. before any modification by operations.
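The reading of state variables as functions from states to values can be made concrete with a small sketch. It is only an illustration under stated assumptions: states are modelled as Python dictionaries, and the variable names count and limit, as well as the particular invariant and initial condition, are hypothetical and do not come from the VVSL definition.

```python
from typing import Callable, Dict

# Assumption: a state is modelled as a mapping from variable names to values.
State = Dict[str, int]


def state_variable(name: str) -> Callable[[State], int]:
    """Interpret a state variable as a function from states to values."""
    return lambda s: s[name]


# Hypothetical variable definitions, standing for "count: N" and "limit: N".
count = state_variable("count")
limit = state_variable("limit")


def inv(s: State) -> bool:
    """Hypothetical state invariant: restricts the values in any state."""
    return 0 <= count(s) <= limit(s)


def init(s: State) -> bool:
    """Hypothetical initial condition: restricts the values initially."""
    return count(s) == 0


s0 = {"count": 0, "limit": 3}
s1 = {"count": 2, "limit": 3}
```

Here s0 satisfies both the invariant and the initial condition, whereas s1 satisfies only the invariant, so it can occur as a later state but not as a first one.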

Operations. An operation is interpreted as an input/output relation, i.e. a relation between 'initial' states, tuples of argument values, 'final' states and tuples of result values.

An operation is implicitly specified by an operation definition of the following form:

op(x_1: t_1, ..., x_n: t_n) x_{n+1}: t_{n+1}, ..., x_m: t_m
ext rd v_1: t'_1, ..., rd v_k: t'_k, wr v_{k+1}: t'_{k+1}, ..., wr v_l: t'_l
pre E_pre
post E_post.

The header introduces a name for the specified operation and defines the types of its arguments and results. The header also introduces names for the argument values and result values to be used within the body. The external clause indicates which state variables are of concern to the behaviour of the operation and also indicates which of those state variables may be modified by the operation. The pre-condition defines the inputs, i.e. the combinations of initial state and tuple of argument values, for which the operation should terminate, and the post-condition defines the possible outputs, i.e. combinations of final state and tuple of result values, from each of these inputs. Operations are potentially non-deterministic: the post-condition may permit more than one output from the same input. The pre-condition may be absent, in which case the operation should terminate for all inputs (i.e. it is equivalent to the pre-condition true). In the post-condition, one refers to the value of a state variable v in the initial state by the hooked name $\overleftarrow{v}$ and to its value in the final state by v.
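The relational reading of an implicit operation definition can be sketched as a check on a single observed input/output pair. The operation DEC, its external clause, its pre- and post-condition and the helper acceptable are all hypothetical; the sketch also takes the usual VDM view that nothing is required for inputs outside the pre-condition, which is an assumption here rather than something stated above.

```python
from typing import Dict

State = Dict[str, int]  # assumption: states as name-to-value mappings

# Hypothetical operation, in the style of the schema above:
#   DEC() r: N
#   ext wr count
#   pre count > 0
#   post count < hooked count and r = count
WRITE = {"count"}  # variables the external clause allows DEC to modify


def pre(old: State) -> bool:
    return old["count"] > 0


def post(old: State, new: State, r: int) -> bool:
    # 'old' plays the role of the hooked variables, 'new' of the unhooked ones.
    return new["count"] < old["count"] and r == new["count"]


def acceptable(old: State, new: State, r: int) -> bool:
    """Is (old, new, r) permitted by the relational interpretation?"""
    if not pre(old):
        return True  # assumption: no requirement outside the pre-condition
    frame_ok = all(old[v] == new[v] for v in old if v not in WRITE)
    return frame_ok and post(old, new, r)
```

Because the post-condition only demands a decrease, several final states are acceptable for the same initial state, which illustrates the non-determinism mentioned in the text.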

An initial state may lead to a final state via some intermediate states. However, one cannot refer to these intermediate states in operation definitions. The underlying idea is that intermediate states do not contain essential details about the behaviour of the operation being defined, since operations are always regarded as being atomic, i.e. they do not interact with some environment during execution. Atomic operations may certainly be implemented as combinations of sub-operations, provided that the whole remains insensitive to interference.


3 VVSL: Combining VDM and Temporal Logic

3.1 Motivation

Sometimes, operations are not as isolated as this. An important case that occurs in practice is that termination and/or the possible outputs depend on both the input and the interference of concurrently executed operations through state variables. In that case, intermediate states do contain essential details about the behaviour of the operation being defined. Although it is usually considered inelegant to have such details visible, it happens in practice. The PCTE interfaces constitute a striking example. A language of temporal logic seems a useful language for specifying such non-atomic operations implicitly.

In VVSL, a formula from a language of temporal logic can be used as a dynamic constraint associated with a collection of state variable definitions or as an inter-condition associated with an operation definition. With a dynamic constraint, global restrictions can be imposed on the set of possible histories of values taken by the state variables being defined. With an inter-condition, restrictions can be imposed on the set of possible histories of values taken by the state variables during the execution of the operation being defined in an interfering environment.

The temporal language has been inspired by a temporal logic from Lichtenstein, Pnueli and Zuck that includes operators referring to the past [LPZ85], a temporal logic from Moszkowski that includes the chop operator [HM87], a temporal logic from Barringer and Kuiper that includes transition propositions [BK85] and a temporal logic from Fisher with models in which finite stuttering cannot be recognized [Fis87]. The operators referring to the past, the chop operator and the transition propositions obviate the need to introduce auxiliary state variables acting as history variables, control variables and scheduling variables, respectively. The above-mentioned temporal logics are all based on linear and discrete time. Temporal logics based on linear and discrete time are further explored and better understood with respect to their adequacy for specifying interacting parts of imperative programs than temporal logics based on branching time [EH86] and temporal logics based on real time [BKP86, Sta88]. Therefore temporal logics based on branching or real time have not influenced the temporal language of VVSL directly. For more details on the temporal language, see Section 4. In the remainder of this section, it is sketched how the VDM specification language and the language of temporal logic are combined in VVSL.

3.2 Computations

For atomic operations, it is appropriate to interpret them as input/output relations. This so-called relational interpretation is the usual one for VDM specification languages. For non-atomic operations, such an interpretation is no longer appropriate, since intermediate states contain essential details about the behaviour of the operation; e.g. the possible outputs depend on the input as well as the interference of concurrently executed operations through state variables. Non-atomic operations require an operational interpretation as sets of computations which represent possible histories of values taken by the state variables during execution of the operation concerned in possible interfering environments.

A computation of an operation is a non-empty finite or infinite sequence of states and connecting labelled transitions. The transition labels indicate which transitions are effected by the operation itself and which are effected by the environment. The transitions of the former kind are called internal steps, those of the latter kind are called external steps. In every step some state variables that are relevant for the behaviour of the operation have to change, unless the step is followed by infinitely many steps where such changes do not happen. In other words, 'finite stuttering' is excluded. In the case of an internal step, the state variables which change can only be write variables. In the case of an external step, they can be read variables and write variables. The computation can be seen as generated by the operation and the environment working interleaved but labelled from the viewpoint of the operation.
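For finite computations, the conditions in this paragraph can be sketched as a simple check. The representation (a list of states plus a list of labels "I" and "E") and the names changed and well_formed are assumptions made for the illustration; infinite computations, for which trailing stuttering is allowed, are outside its scope.

```python
from typing import Dict, List, Set, Tuple

State = Dict[str, int]
# Assumption: a finite computation is a pair (states, labels), with labels
# "I" (internal step) or "E" (external step) and one label per transition.
Computation = Tuple[List[State], List[str]]


def changed(a: State, b: State, names: Set[str]) -> Set[str]:
    """The variables among 'names' whose value differs between a and b."""
    return {v for v in names if a[v] != b[v]}


def well_formed(comp: Computation, read: Set[str], write: Set[str]) -> bool:
    states, labels = comp
    if not states or len(labels) != len(states) - 1:
        return False
    for i, label in enumerate(labels):
        if label not in ("I", "E"):
            return False
        delta = changed(states[i], states[i + 1], read | write)
        if not delta:
            return False  # a stuttering step in a finite computation
        if label == "I" and not delta <= write:
            return False  # internal steps may change write variables only
    return True
```

A single-state computation with no transitions passes the check, which matches the requirement that computations are non-empty but may have no steps at all.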

The introduction of transition labels for distinguishing between internal and external steps is significant. Such a distinction is essential to achieve an open semantics of a non-atomic operation, i.e. a semantics which models the behaviour of the operation in all possible environments. The kind of transition labelling, which is presented here, is introduced by Barringer, Kuiper and Pnueli in [BKP84].


The exclusion of finite stuttering corresponds to the view that if nothing actually happens then one can not tell that time has passed, unless nothing happens for an infinitely long time. It makes computations much like computations in 'real time' models based on the view that things happen at a finite rate, viz. the model of the temporal logic of the reals with the 'finite variability' restriction [BKP86] and the model of the temporal logic for 'conceptual state specifications' with the 'local finiteness' restriction [Sta88].

3.3 Inter-conditions

In full VVSL, an operation is implicitly specified by an operation definition of the following form:

op(x_1: t_1, ..., x_n: t_n) x_{n+1}: t_{n+1}, ..., x_m: t_m
ext rd v_1: t'_1, ..., rd v_k: t'_k, wr v_{k+1}: t'_{k+1}, ..., wr v_l: t'_l
pre E_pre
post E_post
inter φ_inter.

That is, an inter-condition is added to the usual operation definition. This inter-condition defines the possible computations of the operation.

For atomic operations, only the relational interpretation is relevant. Therefore the relational interpretation of an operation is maintained in VVSL. This interpretation is characterized by the external clause (for atomic operations), the pre-condition and the post-condition. The operation has in addition the operational interpretation, which is mainly characterized by the external clause (for non-atomic operations) and the inter-condition. The inter-condition is a temporal formula which must hold initially for the computations from the operational interpretation. This corresponds to a notion of validity for temporal formulae which is 'anchored' at the initial state of the computation (see [MP89]). The inter-condition can be used to express that the operation is atomic. However, this may also be indicated by leaving out the inter-condition. This means that atomic operations can be implicitly specified as in other VDM specification languages. The possible computations of an atomic operation have at most one transition and their transitions are always internal steps.

The computations from the operational interpretation must agree with the relational interpretation. To be more precise, its finite computations must have a first and last state between which the input/output relation according to the relational interpretation holds and its infinite computations must have a first state which belongs to the domain of this relation. The inter-condition expresses a restriction on the set of computations that agree with the relational interpretation. The requirement on the infinite computations means that the pre-condition does not always define the inputs for which the operation necessarily terminates (in any valid interpretation). For non-atomic operations, the pre-condition defines the inputs for which the operation possibly terminates. In other words, it defines the inputs for which termination may not be ruled out completely by interference.
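The agreement requirement for finite computations can be illustrated with a short sketch. The operation WAIT (with a read variable flag that only the environment can set) is hypothetical, arguments and results are left out, and infinite computations are not covered; the function name agrees is likewise an assumption.

```python
from typing import Callable, Dict, List

State = Dict[str, int]


def agrees(states: List[State],
           pre: Callable[[State], bool],
           post: Callable[[State, State], bool]) -> bool:
    """Check a finite computation against the relational interpretation:
    the input/output relation must hold between first and last state."""
    first, last = states[0], states[-1]
    return pre(first) and post(first, last)


# Hypothetical non-atomic operation:  WAIT()  ext rd flag  post flag = 1
def wait_pre(s: State) -> bool:
    return True  # absent pre-condition, i.e. equivalent to true


def wait_post(old: State, new: State) -> bool:
    return new["flag"] == 1


by_interference = [{"flag": 0}, {"flag": 1}]  # one external step sets flag
no_step = [{"flag": 0}]                       # terminates immediately
```

In this sketch WAIT can only terminate because of an external step, which is the situation in which the pre-condition defines the inputs for which termination is possible rather than guaranteed.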

For non-atomic operations the values taken by a read variable in the initial state and the final state must be allowed to be different, since a read variable may be changed by the environment. This has as a consequence that the external clause does not contribute to the characterization of the relational interpretation of non-atomic operations. It contributes only to the characterization of the operational interpretation. Read variables cannot be changed during an internal step but can be changed during external steps. Write variables can be changed during any step. Only read and write variables are relevant for the behaviour.

With the combined possibilities of the external clause and the inter-condition, non-atomic operations can be defined while maintaining as much of the VDM style of specification as possible.

The pre-condition of a non-atomic operation only defines the inputs for which the operation possibly terminates. This allows that the operation only terminates due to interference of concurrently executed operations. Moreover, the post-condition of a non-atomic operation will be rather weak in general, for inputs must often be related to many outputs which should only occur due to certain interference of concurrently executed operations. The inter-condition is mainly used to describe which interference is required for termination and/or the occurrence of such outputs.


Apart from finite stuttering, the operational interpretation of interfering operations characterized by a rely- and a guarantee-condition, as proposed in [Jon83], can also be characterized by an inter-condition of the following form:

inter $\Box((\textit{is-E} \Rightarrow \bigcirc\varphi_{rely}) \land (\textit{is-I} \Rightarrow \bigcirc\varphi_{guar}))$,

where the temporal formulae $\varphi_{rely}$ and $\varphi_{guar}$ are the original rely- and guarantee-condition with each occurrence of an expression $\overleftarrow{v}$ replaced by the temporal term $\ominus v$. Rely- and guarantee-conditions can only be used to express invariance properties of state changes in steps made by the environment of the operation concerned and invariance properties of state changes in steps made by the operation itself. This is often inadequate; e.g. for operations that should wait until something occurs, such as some PCTE primitives.
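Read operationally, an inter-condition of this form says that every external step must satisfy the rely-condition and every internal step the guarantee-condition, each being a predicate on the states before and after the step. The following sketch checks this on a finite computation; the representation of computations and the example conditions are assumptions made for the illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
StepPredicate = Callable[[State, State], bool]  # (state before, state after)


def rely_guarantee_holds(states: List[State], labels: List[str],
                         rely: StepPredicate, guar: StepPredicate) -> bool:
    """Every "E" step must satisfy rely, every "I" step must satisfy guar."""
    for i, label in enumerate(labels):
        step_ok = rely if label == "E" else guar
        if not step_ok(states[i], states[i + 1]):
            return False
    return True


# Hypothetical conditions on a single variable x.
def rely(old: State, new: State) -> bool:
    return new["x"] >= old["x"]      # the environment never decreases x


def guar(old: State, new: State) -> bool:
    return new["x"] == old["x"] + 1  # the operation increments x by one


states = [{"x": 0}, {"x": 1}, {"x": 5}]
```

Both conditions constrain single steps only, so a requirement such as "wait until x exceeds some bound" cannot be stated this way, which is the inadequacy noted above.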

3.4 Dynamic Constraints

In full VVSL, a dynamic constraint, of the form

dyn φ_dyn,

can be associated with a collection of variable definitions. A dynamic constraint is a restriction on what histories of values taken by the state variables can occur.

The role of dynamic constraints is similar to that of state invariants. State invariants impose restrictions on what values the state variables can take. Therefore they should be preserved by the relational interpretation of all operations. Dynamic constraints impose restrictions on what histories of values taken by the state variables can occur. Likewise they should be preserved by the operational interpretation of all operations. A dynamic constraint is a temporal formula which must hold always for the computations of any operation.

4 VVSL: the Language of Temporal Logic

In this section a short overview is given of the language of temporal logic that can be used in VVSL. The temporal language is treated in isolation, i.e. the connections with the remainder of VVSL (sketched in Section 3) are reduced as far as possible.

4.1 T e m p o r a l F o r m u l a e

The syntax of the temporal language is outlined by the following production rules from the complete grammar of VVSL, which is given in Chapter 3 of [BM88]:

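$\varphi ::= \ldots \mid \neg\varphi \mid \varphi_1 \lor \varphi_2 \mid \exists x \in t \cdot \varphi \mid \mathbf{let}\ x : t \triangleq \tau\ \mathbf{in}\ \varphi$,

$\tau ::= e \mid \bigcirc\tau \mid \ominus\tau \mid f(\tau_1, \ldots, \tau_n)$.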

In order to be a well-formed temporal formula, a temporal term $\tau$ (third alternative of first production rule) must have type B (which denotes the set of boolean values).

4.2 Computations

Computations are rather loosely described in Section 3. More accuracy is needed for a description of the intended meaning of temporal formulae.

A model of a complete VVSL specification is a structure $\mathcal{A}$ in which, among other things, a special pre-defined name State is associated with a non-empty set $\mathit{State}^{\mathcal{A}}$ (of states).

A (labelled) computation w.r.t. $\mathcal{A}$ is a pair $(\sigma, \lambda)$ where $\sigma$ is a non-empty finite or infinite sequence over $\mathit{State}^{\mathcal{A}}$ and $\lambda$ is a sequence over the set {I, E} (of transition labels) whose length is 1 less than the length of $\sigma$, if $\sigma$ is finite, and is infinite otherwise. The transition labels correspond directly to the two transition propositions is-I (is internal step) and is-E (is external step).

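The usual representation of a finite computation $((s_0, \ldots, s_n), (l_0, \ldots, l_{n-1}))$ is

$s_0 \xrightarrow{l_0} s_1 \xrightarrow{l_1} \cdots \xrightarrow{l_{n-2}} s_{n-1} \xrightarrow{l_{n-1}} s_n$,

and the usual representation of an infinite computation $((s_0, s_1, \ldots), (l_0, l_1, \ldots))$ is

$s_0 \xrightarrow{l_0} s_1 \xrightarrow{l_1} \cdots$

If $\gamma = s_0 \xrightarrow{l_0} s_1 \xrightarrow{l_1} \cdots \xrightarrow{l_{n-2}} s_{n-1} \xrightarrow{l_{n-1}} s_n$ then the length of $\gamma$, $|\gamma|$, is defined to be the number of states in $\gamma$, i.e. $|\gamma| = n + 1$. If $\gamma$ is infinite, we write $|\gamma| = \omega$.

Furthermore, the notations $\mathit{pref}(\gamma, i)$ and $\mathit{suff}(\gamma, i)$ are used to denote $s_0 \xrightarrow{l_0} s_1 \cdots s_{i-1} \xrightarrow{l_{i-1}} s_i$ and $s_i \xrightarrow{l_i} s_{i+1} \cdots s_{n-1} \xrightarrow{l_{n-1}} s_n$ (in the finite case) or $s_i \xrightarrow{l_i} s_{i+1} \xrightarrow{l_{i+1}} \cdots$ (in the infinite case), respectively.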
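For finite computations, the length, prefix and suffix notations amount to simple list slicing. The sketch below assumes a computation is stored as a pair of Python lists (states and labels); the function names mirror the notation but are otherwise an illustration only.

```python
from typing import List, Tuple

# Assumption: a finite computation is (states, labels) with
# len(labels) == len(states) - 1; states are left abstract (here: strings).
Computation = Tuple[List[str], List[str]]


def length(comp: Computation) -> int:
    """Number of states in the computation."""
    return len(comp[0])


def pref(comp: Computation, i: int) -> Computation:
    """The prefix from s_0 up to and including s_i."""
    states, labels = comp
    return (states[: i + 1], labels[:i])


def suff(comp: Computation, i: int) -> Computation:
    """The suffix from s_i up to the last state."""
    states, labels = comp
    return (states[i:], labels[i:])


gamma = (["s0", "s1", "s2"], ["I", "E"])
```

Note that the prefix and the suffix at the same position i share the state s_i, so no state is lost when a computation is split in two.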

4.3 Satisfaction of Temporal Formulae

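The notation $(\gamma, i) \models_g \varphi$ will be used to indicate the truth of temporal formula $\varphi$ at position $i$ in computation $\gamma$ under assignment $g$. By an assignment is meant a function which assigns to each value name (i.e. variable in the mathematical sense) a value belonging to the appropriate type.

The meaning of the temporal formulae is now outlined by the inductive rules for the temporal operators $;$ (chop), $\bigcirc$ (next), $\mathcal{U}$ (until), $\ominus$ (previous) and $\mathcal{S}$ (since) from the definition of satisfaction:

$(\gamma, i) \models_g \varphi_1 ; \varphi_2$ iff for some $j$, $i < j < |\gamma|$, $(\mathit{pref}(\gamma, j), i) \models_g \varphi_1$ and $(\mathit{suff}(\gamma, j), 0) \models_g \varphi_2$, or $|\gamma| = \omega$ and $(\gamma, i) \models_g \varphi_1$,

$(\gamma, i) \models_g \bigcirc\varphi$ iff $i + 1 < |\gamma|$ and $(\gamma, i + 1) \models_g \varphi$,

$(\gamma, i) \models_g \varphi_1\,\mathcal{U}\,\varphi_2$ iff for some $k$, $i \le k < |\gamma|$, $(\gamma, k) \models_g \varphi_2$ and for every $j$, $i \le j < k$, $(\gamma, j) \models_g \varphi_1$,

$(\gamma, i) \models_g \ominus\varphi$ iff $i > 0$ and $(\gamma, i - 1) \models_g \varphi$,

$(\gamma, i) \models_g \varphi_1\,\mathcal{S}\,\varphi_2$ iff for some $k$, $0 \le k \le i$, $(\gamma, k) \models_g \varphi_2$ and for every $j$, $k < j \le i$, $(\gamma, j) \models_g \varphi_1$.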

The rules for the logical connectives and quantifiers are as usual.
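The inductive rules can be turned into a small evaluator for finite computations. This is only a sketch under several assumptions: formulae are encoded as nested tuples, the leaf "atom" (a predicate on a single state) is a stand-in for boolean-typed temporal terms, assignments are ignored, the index bounds are taken as they are stated in the rules above, and the clause of chop for infinite computations is omitted because only finite computations are represented.

```python
from typing import List, Tuple

Computation = Tuple[List[dict], List[str]]  # (states, labels), finite only


def pref(comp, j):
    states, labels = comp
    return (states[: j + 1], labels[:j])


def suff(comp, j):
    states, labels = comp
    return (states[j:], labels[j:])


def sat(comp: Computation, i: int, f: tuple) -> bool:
    """Truth of formula f at position i of a finite computation."""
    states, _ = comp
    n = len(states)
    tag = f[0]
    if tag == "atom":   # hypothetical leaf: predicate on the state at i
        return f[1](states[i])
    if tag == "not":
        return not sat(comp, i, f[1])
    if tag == "or":
        return sat(comp, i, f[1]) or sat(comp, i, f[2])
    if tag == "chop":   # split at some j with i < j < n
        return any(sat(pref(comp, j), i, f[1]) and sat(suff(comp, j), 0, f[2])
                   for j in range(i + 1, n))
    if tag == "next":
        return i + 1 < n and sat(comp, i + 1, f[1])
    if tag == "until":
        return any(sat(comp, k, f[2])
                   and all(sat(comp, j, f[1]) for j in range(i, k))
                   for k in range(i, n))
    if tag == "prev":
        return i > 0 and sat(comp, i - 1, f[1])
    if tag == "since":
        return any(sat(comp, k, f[2])
                   and all(sat(comp, j, f[1]) for j in range(k + 1, i + 1))
                   for k in range(0, i + 1))
    raise ValueError(f"unknown formula tag: {tag}")


def x_is(value):
    """Convenience constructor for a hypothetical atomic formula x = value."""
    return ("atom", lambda s: s["x"] == value)


TRUE = ("atom", lambda s: True)
gamma = ([{"x": 0}, {"x": 1}, {"x": 2}], ["I", "I"])
```

For example, sat(gamma, 0, ("until", TRUE, x_is(2))) evaluates the "eventually x = 2" reading at the first state, and ("prev", ...) at position 0 is always false because there is no earlier state.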

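The notations $\Diamond\varphi$ (eventually), $\Box\varphi$ (henceforth) and their counterparts for the past are defined as abbreviations:

$\Diamond\varphi \triangleq \mathit{true}\,\mathcal{U}\,\varphi$,

$\blacklozenge\varphi \triangleq \mathit{true}\,\mathcal{S}\,\varphi$,

$\Box\varphi \triangleq \neg(\Diamond\neg\varphi)$.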
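As a worked check, the future abbreviations can be unfolded with the rule for until. The derivation below assumes the readings "eventually" as $\mathit{true}\,\mathcal{U}\,\varphi$ and "henceforth" as $\neg(\Diamond\neg\varphi)$, and uses only that $\mathit{true}$ holds at every position; the symbols are this sketch's own notation.

```latex
\begin{align*}
(\gamma, i) \models_g \Diamond\varphi
  &\iff (\gamma, i) \models_g \mathit{true}\,\mathcal{U}\,\varphi \\
  &\iff \text{for some } k,\ i \le k < |\gamma|,\ (\gamma, k) \models_g \varphi
        \text{ and for every } j,\ i \le j < k,\ (\gamma, j) \models_g \mathit{true} \\
  &\iff \text{for some } k,\ i \le k < |\gamma|,\ (\gamma, k) \models_g \varphi, \\[4pt]
(\gamma, i) \models_g \Box\varphi
  &\iff \text{not } (\gamma, i) \models_g \Diamond\neg\varphi \\
  &\iff \text{for every } k,\ i \le k < |\gamma|,\ (\gamma, k) \models_g \varphi.
\end{align*}
```

So "eventually" and "henceforth" include the current position, and on a finite computation they range only over the positions that actually exist.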

5 Transforming VVSL to COLD-K

5.1 Motivation

COLD-K provides modularization and parameterization mechanisms which are adequate for writing large specifications in state-based styles and have firm mathematical foundations. This modularization mechanism permits two modules to have parts of their state in common, including hidden parts. COLD-K is a formal specification language which is meant to be used as the kernel of user-oriented versions of the language (attuned to e.g. different styles of specification or different implementation languages), each being an extension with features of a purely syntactic nature. A VDM specification language that is restricted to first-order functions can be considered to be a user-oriented version of COLD-K.

This means that by giving a translation from the flat VDM specification language that has been incorporated in VVSL to COLD-K, one gets 'for free' suitable features for structuring VDM specifications. Besides, it is obvious that for the most part this translation is relatively easy. In other words, apart from the constructs for the definition of non-atomic operations, providing VVSL with a well-defined semantics by defining a translation to COLD-K is an attractive approach in case a well-defined semantics must be made available at short notice.

Because of the combination with a language of temporal logic, the situation is more complicated. The additional constructs cannot be translated to COLD-K. COLD-K has to be extended first. It is far from obvious that COLD-K can be extended straightforwardly to a suitable basis for full VVSL, but insoluble problems are not to be expected either. This complication makes the approach less attractive, but it remains a reasonable alternative under the constraints of the VIP project, which hardly allowed for the development of a semantic basis for VVSL.

5.2 Translation rules

The translation from VVSL constructs to COLD-K constructs has been defined by means of schematic production rules, called translation rules. Presenting the definition of the translation in this way emphasizes the syntactic nature of the translation.

The left-hand side of a translation rule is a VVSL construct enclosed by the special brackets ⟪,⟫, which may contain variables for subconstructs. The right-hand side is a COLD-K construct, which may contain these variables enclosed by the special brackets ⟪,⟫ for subconstructs (except for variables ranging over constructs solely consisting of an identifier, which may occur without enclosing brackets). The left-hand side and right-hand side of a translation rule are separated by the arrow ⇒.

The translations of a VVSL construct C are the terminal productions of ⟪C⟫. In general, the translation is not unique.

The special brackets ⟪,⟫ denote a translation operator which maps meaningful VVSL constructs to meaningful COLD-K constructs. The resemblance of the special brackets with the 'semantic brackets' ⟦,⟧ is intentional. It is meant to strengthen the intuition of translation operators as meaning functions. In the complete definition of the translation, an auxiliary translation operator is used, which is denoted there by the special brackets {[,]}. Thus the translation of the declarative aspects of definitions and the translation of the definitional aspects of definitions could be split.

5.3 Example

An example of a VVSL construct with straightforward translation to COLD-K is the operation definition for atomic operations. The translation is outlined by the following translation rules from the complete definition of the translation from VVSL to the extended COLD-K, which is given in Chapter 3 of [BM88]:

⟪op(x_1: t_1, ..., x_n: t_n) x_{n+1}: t_{n+1}, ..., x_m: t_m
  ext rd v_1: t'_1, ..., rd v_k: t'_k, wr v_{k+1}: t'_{k+1}, ..., wr v_l: t'_l
  pre E_1 post E_2⟫ ⇒
proc op: t_1 × ... × t_n → t_{n+1} × ... × t_m mod v_{k+1}: → t'_{k+1}, ..., v_l: → t'_l
axiom forall x_1: t_1, ..., x_n: t_n
  (⟪E_1⟫ = true ⇒ (⟨op(x_1, ..., x_n)⟩ true))
axiom forall x_1: t_1, ..., x_n: t_n
  (⟪E_1⟫ = true ⇒ ([let x_{n+1}: t_{n+1}, ..., x_m: t_m; x_{n+1}, ..., x_m := op(x_1, ..., x_n)] ⟪E_2⟫ = true)).


Especially the COLD-K modification rights construct and its assertion constructs corresponding to the box and diamond operators of dynamic logic [Har84] make this translation straightforward. To readers acquainted with dynamic logic, the resemblance with the definition of satisfaction for operation definitions in Appendix C of [Jon86] will be clear. However, the box and diamond operators of dynamic logic are not exactly those of COLD-K. This means that it is not trivial to show that this translation captures the intended meaning of operation definitions, although it may be intuitively clear.

6 Transforming VVSL to the Language of MPLω

6.1 Motivation

For various VVSL constructs, translation to COLD-K is not straightforward. Amongst the less obvious to translate are the logical expressions. Because their value can be either true, false or undefined, the classical meaning of the logical connectives and quantifiers has to be extended. This must be done as in LPF (see [Che86, Jon86]):

¬E  is true if E is false,
    is false if E is true,
    is undefined otherwise,

E ∨ E'  is true if E is true or E' is true,
        is false if E is false and E' is false,
        is undefined otherwise,

∃x ∈ t · E  is true if for some value c of type t, E is true when x is interpreted as c,
            is false if for each value c of type t, E is false when x is interpreted as c,
            is undefined otherwise.

The other logical connectives and quantifiers are expressible by ¬, ∨ and ∃ in the classical way.
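The three-valued reading of the connectives can be sketched directly. In the sketch below, "undefined" is modelled by Python's None and types are taken to be finite collections of values; both are assumptions made for the illustration, as are the function names.

```python
from typing import Callable, Iterable, Optional

TV = Optional[bool]  # assumption: None stands for "undefined"


def lpf_not(e: TV) -> TV:
    return None if e is None else (not e)


def lpf_or(a: TV, b: TV) -> TV:
    if a is True or b is True:
        return True       # one true disjunct suffices, even if the other
    if a is False and b is False:
        return False      # is undefined
    return None


def lpf_exists(values: Iterable, body: Callable[[object], TV]) -> TV:
    """Existential quantification over a finite collection of values."""
    results = [body(c) for c in values]
    if any(r is True for r in results):
        return True
    if all(r is False for r in results):
        return False
    return None


def lpf_and(a: TV, b: TV) -> TV:
    """Conjunction, expressed through negation and disjunction."""
    return lpf_not(lpf_or(lpf_not(a), lpf_not(b)))
```

The derived conjunction behaves as expected for three-valued logic: lpf_and(False, None) is False, while lpf_and(True, None) remains undefined.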

The approach to the translation of logical expressions is connected with the treatment of three-valued predicates in classical two-valued logic which is described in [Bli88].

The translation is outlined by the following translation rules from the complete definition of the translation from VVSL to the extended COLD-K, which is given in Chapter 3 of [BM88]:

⟪¬E⟫ ⇒
some y_1: B (forall y_2: B ((⟪E⟫ = true ⇔ y_2 = false) and (⟪E⟫ = false ⇔ y_2 = true) ⇔ y_1 = y_2)),

⟪E_1 ∨ E_2⟫ ⇒
some y_1: B (forall y_2: B ((⟪E_1⟫ = true or ⟪E_2⟫ = true ⇔ y_2 = true) and (⟪E_1⟫ = false and ⟪E_2⟫ = false ⇔ y_2 = false) ⇔ y_1 = y_2)),

⟪∃x ∈ t · E⟫ ⇒
some y_1: B (forall y_2: B
  ((exists x: t (⟪E⟫ = true) ⇔ y_2 = true) and (forall x: t (⟪E⟫ = false) ⇔ y_2 = false) ⇔ y_1 = y_2)).

In order to express "the unique y of sort T such that assertion A holds" in COLD-K one has to write some y1 : T (forall y : T (A ⇔ y1 = y)).

With this in mind it is intuitively clear that this translation captures the intended meaning for all cases that should not yield an undefined result. The other cases are not intuitively clear. In order to show (even informally) that the translation captures the intended meaning completely, the translation from VVSL to COLD-K has to be composed with the translation from COLD-K to the language of MPLω, or a complete proof system for a 'COLD-K logic' (with COLD-K assertions as formulae) has to be devised. The first alternative results in a direct interpretation in MPLω. This means that it provides an interpretation accessible to a larger public. After all, MPLω is well related to classical first-order logic.
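The "unique y such that A" idiom can be explored with a finite, two-valued toy model. The sketch below is not COLD-K semantics. It assumes that a `some` expression without a witness has no value (modelled by `None`), and it encodes the rule for negation as read from the translation rules above. The names `the_unique` and `translate_not` are hypothetical.

```python
from typing import Callable, Iterable, Optional

# The two defined boolean values of sort B.
B = (True, False)


def the_unique(domain: Iterable[object], holds: Callable[[object], bool]) -> Optional[object]:
    """Mimic 'some y1 : T (forall y : T (A <=> y1 = y))' over a finite domain.

    Returns the single y for which holds(y) is true, or None when there is
    no such y or more than one (assumption: no witness means no value).
    """
    candidates = list(domain)
    witnesses = [
        y1 for y1 in candidates
        if all(holds(y) == (y1 == y) for y in candidates)
    ]
    return witnesses[0] if witnesses else None


def translate_not(e: Optional[bool]) -> Optional[bool]:
    """Value selected by the negation rule, given the value e of {E}."""
    def condition(y2: bool) -> bool:
        return ((e is True) == (y2 is False)) and ((e is False) == (y2 is True))
    return the_unique(B, condition)
```

In this toy model an undefined operand leaves no boolean satisfying the condition, so no value is selected. Whether the same holds in COLD-K proper is exactly the point that the text says is not intuitively clear.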


For various other VVSL constructs, it is also difficult to show that the translation to COLD-K captures the intended meaning. Further translation to MPLω seems needed in all cases.

6.2 Example

The interpretation of logical expressions in MPLω is context dependent. The notation $[\![E]\!]^{C}_{\vec{s},y}$ is used to denote the MPLω formula expressing the fact that the evaluation of the logical expression E in a context where we have visible names as given by C and state(s) $\vec{s}$ yields value y. B is used as a special sort symbol representing the domain of boolean values, and tt and ff are used as special constant symbols representing the boolean values.

The interpretation of logical expressions in MPLω is outlined by the following defining equations from the complete definition of the interpretation of flat VVSL in MPLω, which is given in [Mid89c]:

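$$[\![\exists x \in t \cdot E]\!]^{C}_{\vec{s},y} := \forall y' : \mathbf{B}\,\Bigl(\bigl((\exists x' : t^{C}\,([\![E]\!]^{C \cup \{d\}}_{\vec{s},\mathit{tt}}) \leftrightarrow y' = \mathit{tt}) \wedge (\forall x' : t^{C}\,([\![E]\!]^{C \cup \{d\}}_{\vec{s},\mathit{ff}}) \leftrightarrow y' = \mathit{ff})\bigr) \leftrightarrow y' = y\Bigr).$$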

In each of these equations, y' is a fresh variable symbol of MPLω. In the last equation, x' is a fresh variable symbol of MPLω corresponding to the value name x (this correspondence is fixed in the 'declaration' d).

Although there is a striking resemblance between the translation to COLD-K and the interpretation in MPLω, there is a big difference. It is easy to show that for all cases that should yield an undefined result, the right-hand sides of these equations are logically equivalent to ∀y' : B (y' ≠ y), which is in turn equivalent to ¬(y↓), i.e. y is undefined.
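The claimed equivalence can be spelled out for the existential quantifier. The derivation below is only a sketch of the reasoning. It assumes that in the undefined case neither the existential premise (some instance of E yields tt) nor the universal premise (every instance of E yields ff) holds, so both are replaced by falsity, and that every value ranged over by y' of sort B is either tt or ff, with tt and ff distinct.

```latex
\begin{align*}
 & \forall y' : \mathbf{B}\,\bigl(((\bot \leftrightarrow y' = \mathit{tt}) \wedge (\bot \leftrightarrow y' = \mathit{ff})) \leftrightarrow y' = y\bigr) \\
 \equiv\; & \forall y' : \mathbf{B}\,\bigl((y' \neq \mathit{tt} \wedge y' \neq \mathit{ff}) \leftrightarrow y' = y\bigr) \\
 \equiv\; & \forall y' : \mathbf{B}\,\bigl(\bot \leftrightarrow y' = y\bigr)
   && \text{(each } y' \text{ of sort } \mathbf{B} \text{ is } \mathit{tt} \text{ or } \mathit{ff}\text{)} \\
 \equiv\; & \forall y' : \mathbf{B}\,(y' \neq y)
\end{align*}
```

If y were defined, taking y' to be y would contradict the last line, so the formula can only hold when y is undefined.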

7 COLD-K Extensions

In Section 3, it is sketched how a VDM specification language is combined with a language of temporal logic in VVSL. In order to formalize this, an extended COLD-K as well as the translation of the additional constructs to the extended language have been defined. In this section, the extensions of COLD-K are sketched. In Sections 8 and 9, the translation to the extended COLD-K for the temporal formulae and the operation definitions of VVSL is outlined.

The required COLD-K extensions relate to the mathematical foundations, the language constructs, and their meaning. They are formally defined in Chapter 4 of [BM88]. In this section, only aspects with a close connection to the temporal language of VVSL are briefly outlined. Some familiarity with the mathematical foundations of COLD-K is assumed. They are given in [KR89, Jon89a, Fei89].

7.1 The Mathematical Foundations

Roughly, modules in COLD-K (called classes) correspond to presentations of MPLω theories, called class descriptions. These theory presentations are of a special kind, since there are always special standard symbols with associated axioms. There are the special sort symbol State representing the state space, the special function symbol s0 representing the initial state, and special predicate symbols of several kinds representing relations on states. This allows program variables to correspond to functions with an argument of sort State and procedures to correspond to predicates with two arguments of sort State. For the extended COLD-K, we have to generalize from class descriptions. That is, additional special standard symbols are needed.
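The correspondence described above (program variables as functions of a state, procedures as predicates on two states) can be pictured with a toy model. The Python fragment below is a minimal sketch under the assumption that a state is a finite mapping from names to integers. The names `counter` and `incr` are hypothetical and do not come from COLD-K.

```python
from typing import Mapping

# A state is modelled as a finite mapping (an assumption made for illustration).
State = Mapping[str, int]

# Counterpart of the initial state symbol.
s0: State = {"counter": 0}


def counter(s: State) -> int:
    """A program variable: a function with one argument of sort State."""
    return s["counter"]


def incr(s: State, t: State) -> bool:
    """A procedure: a predicate relating the state before (s) and after (t)."""
    return counter(t) == counter(s) + 1
```

A call of `incr` is thus not executed but characterized: the predicate holds exactly for the pairs of states that an increment may connect.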

The following additional symbols are introduced:

1. Comp: a special sort symbol; representing the domain of computations.


2. st_n (for all n < ω): a special function symbol; st_n(c) represents the (n + 1)-th state of computation c.

3. int_n (for all n < ω): a special predicate symbol; int_n(c) indicates that the (n + 1)-th state transition in computation c is an internal transition.

4. ext_n (for all n < ω): a special predicate symbol; ext_n(c) indicates that the (n + 1)-th state transition in computation c is an external transition.

5. CComp: a set of variable symbols, which are called computation symbols and represent computations.

6. comp_p (for all p ∈ CProc): a special predicate symbol; comp_p(x1, ..., xn, c, y1, ..., ym) indicates that the procedure call p(x1, ..., xn) (executing interleaved with an environment) can generate computation c yielding objects y1, ..., ym.
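The symbols listed above suggest a simple data structure. The following Python sketch models a finite computation only. Finiteness, the assumption that every transition is labelled either internal or external, and the names `Computation`, `st`, `int_` and `ext` are choices made for this illustration, not definitions from [BM88].

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

State = Mapping[str, int]


@dataclass(frozen=True)
class Computation:
    """A finite computation: states plus one internal/external label per transition."""
    states: Tuple[State, ...]
    internal: Tuple[bool, ...]  # internal[n] is True iff the (n + 1)-th transition is internal

    def __post_init__(self) -> None:
        if len(self.states) != len(self.internal) + 1:
            raise ValueError("need exactly one label per transition")

    def st(self, n: int) -> Optional[State]:
        """The (n + 1)-th state, or None where st_n(c) would be undefined."""
        return self.states[n] if 0 <= n < len(self.states) else None

    def int_(self, n: int) -> bool:
        """True iff the (n + 1)-th transition exists and is internal."""
        return 0 <= n < len(self.internal) and self.internal[n]

    def ext(self, n: int) -> bool:
        """True iff the (n + 1)-th transition exists and is external."""
        return 0 <= n < len(self.internal) and not self.internal[n]
```

Returning `None` from `st` plays the role of an undefined term, so that a test such as `c.st(n) is not None` stands in for the definedness assertion on st_n(c).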

7.2 The Language Constructs and their Meaning

The additional constructs are mainly assertions and expressions concerning computations. For the most part, they have COLD-K assertions and expressions concerning states as counterparts. The production rules for temporal assertions comprise the production rules for COLD-K assertions and production rules for assertions corresponding to the temporal formulae of VVSL. Similarly, the production rules for temporal expressions comprise the production rules for COLD-K expressions and production rules for expressions corresponding to the temporal terms of VVSL.

A temporal assertion or temporal expression has a context-dependent meaning. Like that of a COLD-K assertion or expression, its meaning in a given context is an MPLω formula. The notation $[\![P]\!]^{C}_{c,k}$ is used to denote the MPLω formula that expresses the fact that the temporal assertion P holds at position k in computation c, in a context where we have visible symbols C. In [BM88] the notation form(P, C, c, k) is used instead of $[\![P]\!]^{C}_{c,k}$. The former notation is in the style of [FJKR87]. However, the latter notation conforms to the one used to denote the MPLω formulae corresponding to temporal formulae from the temporal language of VVSL. The notation close(φ, C) is used to denote the existential closure of the MPLω formula φ with respect to the variable symbols that occur free in φ but are not in C. In [BM88] the notation cform(P, C, c, k) is used for close($[\![P]\!]^{C}_{c,k}$, C). The existential closure of MPLω formulae is used to deal properly with the liberal scope rules for the names introduced by the let-expression of COLD-K.

Furthermore, the notation prefix(c, c', k) is used to denote the formula that expresses the fact that computation c' is the prefix of computation c ending at the (k + 1)-th state of c, and the notation suffix(c, c', k) to denote the formula that expresses the fact that computation c' is the suffix of computation c starting at the (k + 1)-th state of c.
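For finite computations the two notions can be checked directly. The sketch below assumes that a computation is represented by a plain Python sequence of states and that positions are counted from 0, so that the (k + 1)-th state is the element at index k. The function names are hypothetical.

```python
from typing import Sequence


def is_prefix(c: Sequence[object], c1: Sequence[object], k: int) -> bool:
    """True iff c1 is the prefix of c ending at the (k + 1)-th state of c."""
    return 0 <= k < len(c) and list(c1) == list(c[: k + 1])


def is_suffix(c: Sequence[object], c2: Sequence[object], k: int) -> bool:
    """True iff c2 is the suffix of c starting at the (k + 1)-th state of c."""
    return 0 <= k < len(c) and list(c2) == list(c[k:])
```

With this reading, the prefix and the suffix taken at the same position k share the state at index k.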

The interpretation of temporal assertions in MPLω is outlined by the following defining equations from the complete definition of the interpretation of the temporal assertions and temporal expressions in MPLω, which is given in Chapter 4 of [BM88]:

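$$[\![P\ \mathbf{chop}\ Q]\!]^{C}_{c,k} := \exists c_1 : \mathit{Comp}\ \exists c_2 : \mathit{Comp}\ \Bigl(\bigvee_{n}\bigl(\mathit{prefix}(c, c_1, n) \wedge \mathit{suffix}(c, c_2, n)\bigr) \wedge \mathit{close}([\![P]\!]^{C}_{c_1,k}, C) \wedge \mathit{close}([\![Q]\!]^{C}_{c_2,0}, C)\Bigr) \vee \Bigl(\bigwedge_{n}(st_n(c){\downarrow}) \wedge \mathit{close}([\![P]\!]^{C}_{c,k}, C)\Bigr),$$

$$[\![\mathbf{next}\ P]\!]^{C}_{c,k} := st_{k+1}(c){\downarrow} \wedge \mathit{close}([\![P]\!]^{C}_{c,k+1}, C),$$

$$[\![P\ \mathbf{until}\ Q]\!]^{C}_{c,k} := \bigvee_{n}\Bigl(st_{k+n}(c){\downarrow} \wedge \mathit{close}([\![Q]\!]^{C}_{c,k+n}, C) \wedge \bigwedge_{m=0}^{n-1} \mathit{close}([\![P]\!]^{C}_{c,k+m}, C)\Bigr),$$

$$[\![\mathbf{prev}\ P]\!]^{C}_{c,k} := \begin{cases} \mathit{close}([\![P]\!]^{C}_{c,k-1}, C) & \text{if } k > 0, \\ \bot & \text{otherwise,} \end{cases}$$

$$[\![P\ \mathbf{since}\ Q]\!]^{C}_{c,k} := \bigvee_{l=0}^{k}\Bigl(\mathit{close}([\![Q]\!]^{C}_{c,k-l}, C) \wedge \bigwedge_{m=1}^{l-1} \mathit{close}([\![P]\!]^{C}_{c,k-m}, C)\Bigr).$$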


In the first equation, c1 and c2 are fresh computation symbols.
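The flavour of these defining equations can be conveyed by an evaluator over finite traces. The Python sketch below is an illustration only: it ignores the existential closure, represents a computation as a finite sequence of states (so the disjunct of chop that covers infinite computations never applies), and treats an assertion as a function of a trace and a position. All names are hypothetical.

```python
from typing import Callable, Sequence

Trace = Sequence[object]
Assertion = Callable[[Trace, int], bool]  # holds at position k of trace c


def atom(pred: Callable[[object], bool]) -> Assertion:
    """State assertion: pred holds for the state at position k, if it exists."""
    return lambda c, k: 0 <= k < len(c) and bool(pred(c[k]))


def next_(p: Assertion) -> Assertion:
    """next P: the next state exists and P holds there."""
    return lambda c, k: k + 1 < len(c) and p(c, k + 1)


def until(p: Assertion, q: Assertion) -> Assertion:
    """P until Q: Q holds at some position k + n and P at k, ..., k + n - 1."""
    def holds(c: Trace, k: int) -> bool:
        return any(
            q(c, k + n) and all(p(c, k + m) for m in range(n))
            for n in range(max(0, len(c) - k))
        )
    return holds


def prev(p: Assertion) -> Assertion:
    """prev P: there is a previous position and P holds there."""
    return lambda c, k: k > 0 and p(c, k - 1)


def chop(p: Assertion, q: Assertion) -> Assertion:
    """P chop Q: c splits at some state into a prefix satisfying P at k
    and a suffix satisfying Q at its first position."""
    def holds(c: Trace, k: int) -> bool:
        return any(p(c[: n + 1], k) and q(c[n:], 0) for n in range(len(c)))
    return holds
```

For instance, over the trace `[0, 1, 2, 3]`, `until(atom(lambda s: s < 2), atom(lambda s: s == 2))` holds at position 0, because the second assertion holds at position 2 and the first at positions 0 and 1.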

It is clear that the interpretation of these temporal assertions in MPLω conforms to the intended meaning of the corresponding temporal formulae of VVSL described in Section 4. This means that the temporal language of VVSL and the temporal assertion language added to COLD-K are very closely connected. Because the extension of COLD-K with a temporal assertion language is only meant to obtain a semantic basis for full VVSL, it makes no sense to devise a rather different temporal assertion language.

Among the interesting temporal assertions that do not correspond to temporal formulae of VVSL are the temporal assertions of the form [X]P, where X is an expression (statement) and P is a temporal assertion. This makes the temporal assertion sublanguage of the extended COLD-K resemble process logic [HK82].

The other additional constructs are also constructs concerning computations which have original COLD-K constructs as counterparts: an extension of the constrained procedure bodies of COLD-K to non-atomic procedures and an extension of the axioms of COLD-K to computations.

8 Transforming Temporal Formulae

In this section the translation of the temporal formulae of VVSL to the temporal assertions of the extended COLD-K is outlined. For comparison, the direct interpretation in MPLω is also sketched.

8.1 Translation to the Extended COLD-K

The translation to the extended COLD-K for the temporal formulae is simple, due to the extension of COLD-K with corresponding temporal assertions.

The intended meaning of temporal formulae as described in Section 4 does not cover undefinedness. Like logical expressions, their value can be either true, false or undefined. The intention is actually that the logical connectives and quantifiers distinguish between false and undefined as described for logical expressions in Section 6, while the temporal operators identify false and undefined. Extending the meaning of the temporal operators in the same way as the classical logical operators would yield very obscure results.
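The difference between the two treatments can be made concrete with a tiny sketch. The Python fragment below assumes that the three-valued value of a formula at each position of a finite computation is given as a list entry (`True`, `False`, or `None` for undefined). The names `lpf_not` and `next_value` are hypothetical.

```python
from typing import Optional, Sequence


def lpf_not(e: Optional[bool]) -> Optional[bool]:
    """A logical connective: keeps false and undefined apart."""
    return None if e is None else (not e)


def next_value(values: Sequence[Optional[bool]], k: int) -> bool:
    """Value of a 'next' formula at position k, given the operand's value per position.

    A temporal operator identifies false and undefined: the result is true
    only if a next position exists and the operand is true there.
    """
    return k + 1 < len(values) and values[k + 1] is True
```

With operand values `[True, None, False]`, `next_value` yields `False` at position 0, whereas `lpf_not` applied to the undefined value at position 1 stays undefined.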

The translation is outlined by the following translation rules from the complete definition of the translation from VVSL to the extended COLD-K, which is given in Chapter 3 of [BM88]:

{φ1 ; φ2} ⇒
some y1 : B (forall y2 : B ((({φ1} = true chop {φ2} = true) ⇔ y2 = true) ⇔ y1 = y2)),

{○φ1} ⇒
some y1 : B (forall y2 : B (((next {φ1} = true) ⇔ y2 = true) ⇔ y1 = y2)),

{φ1 U φ2} ⇒
some y1 : B (forall y2 : B ((({φ1} = true until {φ2} = true) ⇔ y2 = true) ⇔ y1 = y2)),

{●φ1} ⇒
some y1 : B (forall y2 : B (((prev {φ1} = true) ⇔ y2 = true) ⇔ y1 = y2)),

{φ1 S φ2} ⇒
some y1 : B (forall y2 : B ((({φ1} = true since {φ2} = true) ⇔ y2 = true) ⇔ y1 = y2)).

Because the temporal language of VVSL and the temporal assertion language added to COLD-K are very closely connected, this translation trivially captures the intended meaning. However, it is not explanatory. It may seem that a direct interpretation of temporal formulae in MPLω (sketched below) is preferable. However, that interpretation does not fit together with the use of COLD-K as the starting point of a suitable semantic basis for full VVSL. In other words, the indirect interpretation of the temporal language in MPLω is needed to be able to do the same for the VDM specification language (which is motivated in Section 5).


8.2 Interpretation in MPLω

As indicated above, almost no additional effort is required to obtain a direct interpretation of temporal formulae in MPLω.

This is outlined by the following defining equations from the complete definition of the interpretation of flat VVSL in MPLω, which is given in [Mid89c]:

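$$[\![\varphi_1 ; \varphi_2]\!]^{C}_{c,k,y} := \forall y' : \mathbf{B}\,\Bigl(\Bigl(\Bigl(\exists c_1 : \mathit{Comp}\ \exists c_2 : \mathit{Comp}\ \bigl(\bigvee_{n}(\mathit{prefix}(c, c_1, n) \wedge \mathit{suffix}(c, c_2, n)) \wedge [\![\varphi_1]\!]^{C}_{c_1,k,\mathit{tt}} \wedge [\![\varphi_2]\!]^{C}_{c_2,0,\mathit{tt}}\bigr) \vee \bigl(\bigwedge_{n}(st_n(c){\downarrow}) \wedge [\![\varphi_1]\!]^{C}_{c,k,\mathit{tt}}\bigr)\Bigr) \leftrightarrow y' = \mathit{tt}\Bigr) \leftrightarrow y' = y\Bigr),$$

$$[\![\bigcirc\varphi]\!]^{C}_{c,k,y} := \forall y' : \mathbf{B}\,\bigl(((st_{k+1}(c){\downarrow} \wedge [\![\varphi]\!]^{C}_{c,k+1,\mathit{tt}}) \leftrightarrow y' = \mathit{tt}) \leftrightarrow y' = y\bigr),$$

$$[\![\varphi_1\ \mathcal{U}\ \varphi_2]\!]^{C}_{c,k,y} := \forall y' : \mathbf{B}\,\Bigl(\Bigl(\bigvee_{n}\bigl(st_{k+n}(c){\downarrow} \wedge [\![\varphi_2]\!]^{C}_{c,k+n,\mathit{tt}} \wedge \bigwedge_{m=0}^{n-1} [\![\varphi_1]\!]^{C}_{c,k+m,\mathit{tt}}\bigr) \leftrightarrow y' = \mathit{tt}\Bigr) \leftrightarrow y' = y\Bigr),$$

$$[\![\bullet\varphi]\!]^{C}_{c,k,y} := \begin{cases} \forall y' : \mathbf{B}\,\bigl(([\![\varphi]\!]^{C}_{c,k-1,\mathit{tt}} \leftrightarrow y' = \mathit{tt}) \leftrightarrow y' = y\bigr) & \text{if } k > 0, \\ \mathit{ff} = y & \text{otherwise,} \end{cases}$$

$$[\![\varphi_1\ \mathcal{S}\ \varphi_2]\!]^{C}_{c,k,y} := \forall y' : \mathbf{B}\,\Bigl(\Bigl(\bigvee_{l=0}^{k}\bigl([\![\varphi_2]\!]^{C}_{c,k-l,\mathit{tt}} \wedge \bigwedge_{m=1}^{l-1} [\![\varphi_1]\!]^{C}_{c,k-m,\mathit{tt}}\bigr) \leftrightarrow y' = \mathit{tt}\Bigr) \leftrightarrow y' = y\Bigr).$$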

In each of these equations, y' is a fresh variable symbol of MPLω. In the first equation, c1 and c2 are fresh computation symbols.

Owing to the close connection between the temporal formulae of VVSL and the temporal assertions of the extended COLD-K, there is almost no resemblance between the translation to the extended COLD-K and the interpretation in MPLω. Intuitively, the former mainly solves some rather elementary differences between VDM specification languages and COLD-K and the latter mainly assigns (logical) meaning.

9 Transforming Definitions of (Non-atomic) Operations

In this section the translation of the operation definitions for non-atomic operations to the procedure definitions and axioms of the extended COLD-K is outlined. For comparison, the direct interpretation in MPLω is also sketched.

9.1 Translation to the Extended COLD-K

The translation to the extended COLD-K for the operation definitions for non-atomic operations is simple, due to the extensions of COLD-K with corresponding constructs: for expressing modification rights, an extension of the constrained procedure bodies of COLD-K to non-atomic procedures is added, and for characterizing computations, an extension of the axioms of COLD-K to computations is added.

The translation is outlined by the following translation rules from the complete definition of the translation from VVSL to the extended COLD-K, which is given in Chapter 3 of [BM88]:

{op(x1 : t1, ..., xn : tn) x_{n+1} : t_{n+1}, ..., xm : tm
 ext rd v1 : t'1, ..., rd vk : t'k, wr v_{k+1} : t'_{k+1}, ..., wr vl : t'l
 pre E1 post E2 inter φ} ⇒

proc op : t1 × ... × tn → t_{n+1} × ... × tm
mod ext v1 : → t'1, ..., vk : → t'k int v_{k+1} : → t'_{k+1}, ..., vl : → t'l

axiom
forall x1 : t1, ..., xn : tn ({E1} = true ⇒ (⟨op(x1, ..., xn)⟩ true))

axiom
forall x1 : t1, ..., xn : tn ({E1} = true ⇒ ([let x_{n+1} : t_{n+1}, ..., xm : tm; x_{n+1}, ..., xm := op(x1, ..., xn)] {E2} = true))

caxiom
forall x1 : t1, ..., xn : tn ({E1} = true ⇒ (⟨op(x1, ..., xn)⟩ true until not(next true)))

caxiom
forall x1 : t1, ..., xn : tn ({E1} = true ⇒ ([let x_{n+1} : t_{n+1}, ..., xm : tm; x_{n+1}, ..., xm := op(x1, ..., xn)] {φ} = true)).

This shows the lack of integration inherent in the use of COLD-K as the starting point of a semantic basis for full VVSL. As far as modification rights are concerned, definitions of atomic operations and definitions of non-atomic operations must be translated to different kinds of constrained procedure bodies. A smooth generalization of the original kind was not possible. For the same reason, two kinds of axioms are distinguished.

Semantically, this means that the indirect interpretation is actually twofold, while the original one can be derived from the new one. A single direct interpretation in MPLω (sketched below) seems more appropriate.

9.2 Interpretation in MPLω

The direct interpretation of operation definitions (for atomic operations and non-atomic operations) in MPLω is much simpler than the indirect one. It consists of a formula corresponding to the external clause and a formula corresponding to each of the conditions from the definition.

The interpretation is outlined by the following defining equations from the complete definition of the interpretation of flat VVSL in MPLω, which is given in [Mid89c]:

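$$[\![\,op(x_1 : t_1, \ldots, x_n : t_n)\ x_{n+1} : t_{n+1}, \ldots, x_m : t_m\ \ \mathbf{ext}\ \mathbf{rd}\ v_1 : t'_1, \ldots, \mathbf{rd}\ v_k : t'_k,\ \mathbf{wr}\ v_{k+1} : t'_{k+1}, \ldots, \mathbf{wr}\ v_l : t'_l\ \ \mathbf{pre}\ E_1\ \mathbf{post}\ E_2\ \mathbf{inter}\ \varphi\,]\!]^{C} := \{\phi_1, \ldots, \phi_4\},$$

where:

$$\phi_1 = \forall x'_1 : t_1^{C}, \ldots, x'_n : t_n^{C},\ c : \mathit{Comp},\ x'_{n+1} : t_{n+1}^{C}, \ldots, x'_m : t_m^{C}\ \bigl(\mathit{comp}_{op}(x'_1, \ldots, x'_n, c, x'_{n+1}, \ldots, x'_m) \rightarrow \mathit{vmod}^{C}(\{v_1, \ldots, v_k\}, \{v_{k+1}, \ldots, v_l\}, c)\bigr),$$

$$\phi_2 = \forall s : \mathit{State},\ x'_1 : t_1^{C}, \ldots, x'_n : t_n^{C}\ \Bigl([\![E_1]\!]^{C \cup \{d_1, \ldots, d_n\}}_{\langle s \rangle,\mathit{tt}} \rightarrow \exists c : \mathit{Comp},\ x'_{n+1} : t_{n+1}^{C}, \ldots, x'_m : t_m^{C}\ \bigl(st_0(c) = s \wedge \neg(\bigwedge_{k}(st_k(c){\downarrow})) \wedge \mathit{comp}_{op}(x'_1, \ldots, x'_n, c, x'_{n+1}, \ldots, x'_m)\bigr)\Bigr),$$

$$\phi_3 = \forall s : \mathit{State},\ x'_1 : t_1^{C}, \ldots, x'_n : t_n^{C}\ \Bigl([\![E_1]\!]^{C \cup \{d_1, \ldots, d_n\}}_{\langle s \rangle,\mathit{tt}} \rightarrow \forall c : \mathit{Comp},\ x'_{n+1} : t_{n+1}^{C}, \ldots, x'_m : t_m^{C}\ \Bigl(\bigl(st_0(c) = s \wedge \neg(\bigwedge_{k}(st_k(c){\downarrow})) \wedge \mathit{comp}_{op}(x'_1, \ldots, x'_n, c, x'_{n+1}, \ldots, x'_m)\bigr) \rightarrow \exists t : \mathit{State}\ \bigl(\bigvee_{k}(st_k(c) = t \wedge \neg(st_{k+1}(c){\downarrow})) \wedge [\![E_2]\!]^{C \cup \{d_1, \ldots, d_m\}}_{\langle s,t \rangle,\mathit{tt}}\bigr)\Bigr)\Bigr),$$

$$\phi_4 = \forall s : \mathit{State},\ x'_1 : t_1^{C}, \ldots, x'_n : t_n^{C}\ \Bigl([\![E_1]\!]^{C \cup \{d_1, \ldots, d_n\}}_{\langle s \rangle,\mathit{tt}} \rightarrow \forall c : \mathit{Comp},\ x'_{n+1} : t_{n+1}^{C}, \ldots, x'_m : t_m^{C}\ \Bigl(\bigl(st_0(c) = s \wedge \mathit{comp}_{op}(x'_1, \ldots, x'_n, c, x'_{n+1}, \ldots, x'_m)\bigr) \rightarrow [\![\varphi]\!]^{C \cup \{d_1, \ldots, d_m\}}_{c,0,\mathit{tt}}\Bigr)\Bigr).$$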

In these equations, $x'_i$ is a fresh variable symbol of MPLω corresponding to the value name $x_i$ (this correspondence is fixed in the declaration $d_i$). s and t are fresh state symbols (i.e. variable symbols representing states), and c is a fresh computation symbol. The notation $\mathit{vmod}^{C}(R, W, c)$ is used to denote the formula that expresses the fact that during each step in computation c variables from R ∪ W are changed and during internal steps variables other than variables from W are not changed.


These formulae reflect the intended meaning clearly. Formulae $\phi_1$, $\phi_2$ and $\phi_3$ generalize the interpretation of external clause, pre-condition and post-condition from pairs of states to computations. Formulae $\phi_3$ and $\phi_4$ are similar, but the former (corresponding to the post-condition) deals only with the first and last state of computations and the latter (corresponding to the inter-condition) deals with computations as a whole.
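The contrast between the post-condition and the inter-condition can be sketched on finite computations. The Python fragment below is a toy illustration, assuming that a computation is a non-empty list of states and that states are finite mappings. The example conditions `post_plus_two` and `never_decreases` are hypothetical.

```python
from typing import Callable, Mapping, Sequence

State = Mapping[str, int]
Computation = Sequence[State]


def satisfies_post(c: Computation, post: Callable[[State, State], bool]) -> bool:
    """A post-condition only relates the first and the last state of c."""
    return post(c[0], c[-1])


def satisfies_inter(c: Computation, inter: Callable[[Computation], bool]) -> bool:
    """An inter-condition constrains the computation as a whole."""
    return inter(c)


def post_plus_two(s: State, t: State) -> bool:
    """Example post-condition: v has grown by two."""
    return t["v"] == s["v"] + 2


def never_decreases(c: Computation) -> bool:
    """Example inter-condition: v never decreases in any step."""
    return all(a["v"] <= b["v"] for a, b in zip(c, c[1:]))
```

A computation that overshoots and then comes back, such as v going 0, 3, 2, satisfies the example post-condition but not the example inter-condition, which is the kind of distinction only the latter can express.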

The direct interpretation shows the integration which is made possible by the use of MPLω as the starting point of a semantic basis for full VVSL. Definitions of both atomic and non-atomic operations are translated to formulae of the same shape. Atomic operations are not treated differently from non-atomic ones. They are considered to have the default inter-condition ○true ⇒ (is-I ∧ ○¬○true).
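One way to read the default inter-condition is that a computation of an atomic operation has at most one transition, and that this transition is internal. The Python sketch below encodes that reading for finite computations. It assumes that is-I states that the transition leaving the current position is internal and that the formula is evaluated at the first position. The name `default_inter` is hypothetical.

```python
from typing import Sequence


def default_inter(internal: Sequence[bool]) -> bool:
    """Check the default inter-condition at the first position.

    internal[n] is True iff the (n + 1)-th transition is internal, so the
    computation has len(internal) + 1 states.
    """
    has_next = len(internal) >= 1
    if not has_next:
        # 'next true' fails, so the implication holds vacuously.
        return True
    first_is_internal = internal[0]
    # 'next not next true': from the second state there is no further state.
    no_state_after_next = len(internal) == 1
    return first_is_internal and no_state_after_next
```

Under this reading, abstracting from intermediate states loses nothing for atomic operations, since there are no intermediate states.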

Semantically, this means that the direct interpretation is not twofold. Only the new one is left, but the original can be derived from it by abstracting from the intermediate states.

10 Experiences with the Application of VVSL

In the VIP project, VVSL has been used for the formal specification of the PCTE interfaces. Some experiences with this application of VVSL seem worth mentioning.

The division of the PCTE interfaces into functional units with well-defined interfaces, which could be achieved by the use of the modularization and parameterization constructs of VVSL, has made explicit the complexity that is inherent in PCTE. This complexity was kept implicit in [PCT86].

The composition of these functional units from instantiations of a small number of orthogonal and generic underlying semantic units requires the presence of suitable underlying semantic units. Devising them has revealed what the PCTE way of looking at coordination and integration of software engineering tools exactly is. This way of looking was hidden from outsiders in [PCT86].

Experienced specifiers were used to having only a local definition mechanism available as a structuring feature. For them, it was rather difficult to master a new style that takes advantage of the available modularization and parameterization mechanisms. Inexperienced specifiers could master such a style much more easily.

Although the temporal language of VVSL was designed for use in the VIP project, it is a temporal language of general utility. In working on the formal specification of the PCTE interfaces, several frequently occurring patterns of inter-conditions were recognized. Some 'notational conventions' were added to get the syntax tailored to those patterns.

In a follow-up project, VVSL has also been used to formalize many of the basic concepts of the relational data model, an abstract external interface of a relational database management system and an abstract internal interface for the same system [Mid89b, Mid89a]. These formalizations are meant to provide examples of the use of VVSL that are accessible to a larger public than the formal specification of the PCTE interfaces.

The idea of composing functional units from instantiations of a small number of orthogonal and generic underlying semantic units is elaborated in these specifications. A firm conclusion is that this approach not only supports adaptability of the specification concerned, but also supports reusability of its parts and enhances comprehensibility. There is a strong connection between these effects and the goals of modularization techniques which are identified in [FJ90].¹ The related idea of using separation of state-independent aspects and state-dependent ones as a guideline in the division into functional units is also elaborated in these specifications. The impression is that the use of this guideline strengthens the effects mentioned above. For example, the specification might be understood more globally.

Also in working on the formal specification of the internal interface [Mid89a], several frequently occurring patterns of inter-conditions were recognized. These patterns differ considerably from the frequently occurring patterns which were recognized in working on the formal specification of the PCTE interfaces.

¹ In [FJ90], these goals lead the authors to suggest criteria which should govern the choice of modular structure in a specification: comprehensibility of individual modules, suitability of modules for re-use, and intuitive clarity of the modular structure.


11 Conclusions and Final Remarks

The language obtained by combining a language for structured VDM specifications with a language of temporal logic, as sketched in this paper, has proved suitable for the formal specification of the PCTE interfaces [VIP88a, VIP88b]. VVSL, in particular the temporal language that can be used in VVSL, has been improved in the course of the work on the formal specification of the PCTE interfaces, based on the feedback by the specifiers about their actual needs. This led to various preliminary versions of VVSL. This paper is concerned with the final version. The preliminary versions of VVSL were also developed by the author. It is worth mentioning that the preliminary version of VVSL described in [Mid87] and the language described under the name EVDM in [Oli88] are the very same.

Providing VVSL with a well-defined semantics by defining a translation to an extended COLD-K turned out to be a viable approach. A well-defined semantics was available in time. However, it has two important and related disadvantages:

• for various language constructs, it is difficult to show that the formally defined semantics of VVSL corresponds to the intended meaning;

• the formally defined semantics of VVSL is inaccessible to most people for which it was primarily meant (e.g. writers of informal introductions or reference manuals for VVSL, builders of tools for VVSL and specifiers finding ambiguities and incompletenesses in their introduction or reference manual).

In order to overcome these disadvantages, VVSL has also been provided, in a follow-up project, with an equivalent semantics by defining an interpretation in roughly the 'nucleus' of COLD-K, which consists of MPLω, Description Algebra and λπ-calculus. Flat VVSL has now been provided with a well-defined logical semantics by defining an interpretation in the logic MPLω. It seems accessible to a much larger public; the only prerequisite is familiarity with classical first-order logic. The impression is that the interpretation in MPLω is better suited to all people for whom a formal definition of VVSL is meant. For the new semantics of the structuring sublanguage, the prerequisites are familiarity with classical lambda calculus and Description Algebra or a similar model of specification modules (e.g. the models presented in [Wir86] or [BHK86]). It is not clear whether this alternative approach would have been usable for the VIP project; in particular it is doubtful whether a well-defined semantics would have been available in time.

The situation faced by the VIP project is in no way unique. In such situations, a relatively general semantic framework for specification languages is clearly missing. The framework should consist of a few orthogonal elements, based on assumptions which are generally met by specification languages, and some rules for composing the semantic basis for a particular specification language from instantiations of these elements. The elements of the general framework must be rather elementary to be usable in such a framework, but preferably not more elementary than strictly necessary. COLD-K is not such a general framework. Orthogonality is present, but it is far from elementary enough. The reason why its use for VVSL was no failure should not be sought in its generality (illustrated in Section 9). The nucleus of COLD-K may be a first approximation. Orthogonality and genericity are present, but it may be partly too elementary.

Acknowledgements

The author is grateful to the METEOR project for the invitation to present this work at their final workshop. Thanks go to L.M.G. Feijs and H.B.M. Jonkers, both of Philips Research Laboratories Eindhoven, and G.R. Renardel de Lavalette of the University of Utrecht for enthusiastic help on COLD-related matters. Thanks also to J.A. Bergstra of the University of Amsterdam for the suggestion to devise a specification language which combines a language for structured VDM specifications with a language of temporal logic and to support it with a well-defined semantics by translation to COLD-K.


References

[Bea88] S. Bear. Structuring for the VDM specification language. In R. Bloomfield, L. Marshall, and R. Jones, editors, VDM '88, pages 2-25. Springer Verlag, LNCS 328, 1988.

[BG80] R.M. Burstall and J.A. Goguen. The semantics of Clear, a specification language. In D. Bjørner, editor, Abstract Software Specifications, pages 292-332. Springer Verlag, LNCS 86, 1980.

[BHK86] J.A. Bergstra, J. Heering, and P. Klint. Module algebra. Report CS-R8617, Centre for Mathematics and Computer Science, Amsterdam, 1986. Revised version to appear in Journal of the ACM.

[BJ82] D. Bjørner and C.B. Jones. Formal Specification and Software Development. Prentice-Hall, 1982.

[BK85] H. Barringer and R. Kuiper. Hierarchical development of concurrent systems in a temporal logic framework. In S.D. Brookes, A.W. Roscoe, and G. Winskel, editors, Seminar on Concurrency, pages 35-61. Springer Verlag, LNCS 197, 1985.

[BKP84] H. Barringer, R. Kuiper, and A. Pnueli. Now you may compose temporal logic specifications. In Proceedings of the 16th ACM Symposium on the Theory of Computing, pages 51-63. Association of Computing Machinery, 1984.

[BKP86] H. Barringer, R. Kuiper, and A. Pnueli. A really abstract concurrent model and its temporal logic. In Proceedings of the 13th ACM Symposium on the Principles of Programming Languages, pages 173-183. Association of Computing Machinery, 1986.

[Bli88] A. Blikle. Three-valued predicates for software specification and validation. In R. Bloomfield, L. Marshall, and R. Jones, editors, VDM '88, pages 243-266. Springer Verlag, LNCS 328, 1988.

[BM88] J. Bruijning and C.A. Middelburg. VDM extensions: Final report. Report VIP.T.E.4.3, VIP, December 1988. Distributed by PTT Research Neher Laboratories and Praxis Systems.

[Che86] J.H. Cheng. A logic for partial functions. Technical Report Series UMCS-86-7-1, University of Manchester, Department of Computer Science, 1986.

[EH86] E.A. Emerson and J.Y. Halpern. "Sometimes" and "not never" revisited: On branching versus linear time temporal logic. Journal of the ACM, 33(1):151-178, 1986.

[EM85] H. Ehrig and B. Mahr. Fundamentals of Algebraic Specification I: Equations and Initial Semantics. Springer Verlag, EATCS Monograph, 1985.

[Fei89] L.M.G. Feijs. The calculus λπ. In M. Wirsing and J.A. Bergstra, editors, Algebraic Methods: Theory, Tools and Applications, pages 307-328. Springer Verlag, LNCS 394, 1989.

[Fis87] M. Fisher. Temporal logics for abstract semantics. Technical Report Series UMCS-87-12-1, University of Manchester, Department of Computer Science, 1987.

[FJ90] J.S. Fitzgerald and C.B. Jones. Modularizing the formal description of a database system. Technical Report Series UMCS-90-1-1, University of Manchester, Department of Computer Science, 1990.

[FJKR87] L.M.G. Feijs, H.B.M. Jonkers, C.P.J. Koymans, and G.R. Renardel de Lavalette. Formal definition of the design language COLD-K. Technical Report METEOR/t7/PRLE/7, METEOR, 1987.

[GH86] J.V. Guttag and J.J. Horning. Report on the Larch shared language. Science of Computer Programming, 6:103-134, 1986.

[Har84] D. Harel. Dynamic logic. In D. Gabbay and F. Guenther, editors, Handbook of Philosophical Logic, Volume II, chapter II.10. D. Reidel Publishing Company, 1984.


[HK82]

[~M87]

[Jon83]

[Jon86]

[Jon89a]

[Jon89b]

[KR89]

[Lar89]

[LPZ85]

[Mid87]

[MidS9a]

[M~dSOb]

[MJd89e]

[Mid89d]

[MidgO]

[Mon85]

[MP89]

[o~88]

[PCTS6]

[Ren89]

[SanS4]

D. Harel and D. Kozen. Process logic: Expressiveness, decidability, completeness. Journal of Computer and System Sciences, 25:144-170, 1982.

R. Hale and B. Moskowski. Parallel programming in temporal logic. In J.W. de Bakker, A.J. Nijman, and P.C. Treleaven, editors, Proceedings PARLE, Volume II, pages 277-296. Springer Verlag, LNCS 259, 1987.

C.B. Jones. Specification and design of (parallel) programs. In R.E.A. Mason, editor, IFIP'83, pages 321-332. North-Holland, 1983.

C.B. Jones. Systematic Software Development Using VDM. Prentice-Hall, 1986.

H.B.M. Jonkers. Description algebra. In M. Wirsing and J.A. Bergstra, editors, Algebraic Methods: Theory, Tools and Applications, pages 283-395. Springer Verlag, LNCS 394, 1989.

tt.B.M. Jonkers. An introduction to COLD-K. In M. Wirsing and J.A. Bergstra, editors, Algebraic Methods: Theory, Tools and Applications, pages 139-205. Springer Verlag, LNCS 394, 1989.

C.P.J. Koymans and G.I~. Renardel de Lavalette. The logic MPLw. In M. Wirsing and J.A. Bergstra, editors, Algebraic Methods: Theory, Tools and Applications, pages 247-282. Springer Verlag, LNCS 394, 1989.

P.G. Larsen. The dynamic semantics of the BSI/VDM specification language. Technical report, Technical University of Denmark, October 1989.

O. Lichtenstein, A. Pnueli, and L. Zuck. The glory of the past. In R. Parikh, editor, Proceedings Logics of Programs 1985, pages 196-218. Springer Verlag, LNCS 193, 1985.

C.A. Middelburg. Syntax and semantics of VVSL. Working Paper VIP.T.D.KM9, VIP, October 1987.

C.A. Middelburg. Formalization of an abstract interface to a concurrent access handler using VVSL. Report 572 RNL/89, PTT Research Neher Laboratories, July 1989.

C.A. Middelburg. Formalization of RDM concepts and an abstract RDBMS interface using VVSL. Report 290 RNL/89, PTT Research Neher Laboratories, May 1989.

C.A. Middelburg. Logical semantics of flat VVSL. Report 954 RNL/89, PTT Research Neher Laboratories, December 1989.

C.A. Middelburg. VVSL: A language for structured VDM specifications. Formal Aspects of Computing, 1(1):115-135, 1989.

C.A. Middelburg. Semantics of VVSL's structuring language. Report 329 RNL/90, PTT Research Neher Laboratories, May 1990.

B.Q. Monahan. A semantic definition of the STC VDM reference language. Technical report, STC IDEC Ltd, 1985.

Z. Manna and A. Pnueli. The anchored version of the temporal framework. In J.W. de Bakker, W.-P. de Roever, and G. Rozenberg, editors, Linear Time, Branching Time and Partial Order in Logics and Models for Concurrency, pages 201-284. Springer Verlag, LNCS 354, 1989.

H.E. Oliver. Formal Specification Methods for Reusable Software Components. PhD thesis, Uni- versity College of Wales, Aberystwyth, 1988.

ESPRIT. PCTE Functional Specifications, 4th edition, June 1986.

G.R. Renardel de Lavalette. Modularisation, parameterisation, interpolation. Journal of Information Processing and Cybernetics EIK, 25:283-292, 1989.

D.T. Sannella. A set-theoretic semantics for Clear. Acta Informatica, 21:443-472, 1984.

[Spi88] J.M. Spivey. Understanding Z. Cambridge University Press, Cambridge Tracts in Theoretical Computer Science 3, 1988.

[ST85] D. Sannella and A. Tarlecki. Building specifications in an arbitrary institution. In G. Kahn, D.B. MacQueen, and G. Plotkin, editors, Proceedings Symposium on Semantics of Data Types, pages 337-356. Springer Verlag, LNCS 173, 1985.

[ST88] D. Sannella and A. Tarlecki. Towards formal development of programs from algebraic specifications: Implementations revisited. Acta Informatica, 25:233-281, 1988.

[Sta88] E.W. Stark. Proving entailment between conceptual state specifications. Theoretical Computer Science, 56:135-154, 1988.

[VDM88] BSI IST/5/50, Document N-40. VDM Specification Language Proto-Standard, July 1988. Draft.

[VIP88a] VIP Project Team. Kernel interface: Final specification. Report VIP.T.E.8.2, VIP, December 1988. Distributed by Praxis Systems.

[VIP88b] VIP Project Team. Man machine interface: Final specification. Report VIP.T.E.8.3, VIP, December 1988. Distributed by Praxis Systems.

[Win87] J.M. Wing. Writing Larch interface language specifications. ACM Transactions on Programming Languages and Systems, 9(1):1-24, 1987.

[Wir86] M. Wirsing. Structured algebraic specifications: A kernel language. Theoretical Computer Science, 42(2):123-249, 1986.

Appendix

Short Survey of Current Trends in Modular Structuring

A common assumption in most current theoretical work on modularization and parameterization, e.g. in [ST85], [BHK86], [Ren89], and [Jon89a, Fei89], is that a specification language has building blocks of structured specifications, which correspond to theory presentations in the language of an underlying logic. This assumption is trivially met by most existing languages for structured property-oriented specifications like Clear [BG80], the Larch Shared Language [GH86] and ASL [Wir86]. It is also met by existing languages for structured model-oriented (including state-based) specifications like Z [Spi88], the Larch/CLU Interface Language [Win87] and COLD-K [FJKR87, Part II].

Various potential meanings of a specification are considered. The most interesting and important ones are theory presentations, theories and model classes. Usually, language constructs for building large structured specifications from smaller ones correspond to operations on theory presentations, theories or model classes. In [Wir86], which contains a definition and an analysis of ASL, it is shown that a model based on theory presentations is very useful and that there is a strong connection between this model and the model based on model classes, which is regarded as the standard model of ASL. In [FJKR87], the modularization constructs of COLD-K are given a meaning using a model based on theory presentations, called Class Algebra. In [Spi88], the modularization constructs of Z are given a meaning using a model based on model classes. In [BG80], the modularization constructs of Clear are given a meaning using a model based on theories. The Larch Shared Language is an exceptional specification language in this respect. In [GH86], its modularization constructs correspond to purely syntactic manipulations on specification texts.

In the models of modular specifications of Clear and COLD-K, the origins of names are taken into account in the treatment of name clashes in the composition of theories and theory presentations respectively. Usually it is assumed that in case of name clashes the same name is intended to denote the same thing.

The primitive operations of models of modular specifications which are proposed in connection with particular work on modularization, generally include operations for combination, hiding and renaming. Hiding and renaming are sometimes combined and called deriving. Combination provides for an import mechanism and hiding provides for an export mechanism. Renaming provides for control of name clashes. Roughly speaking, there are no essential differences between the various models after restriction to these operations, except the minor differences which are inherent in the choice to base the model on theory presentations, theories or model classes, and the differences due to the different treatment of name clashes. Major differences, if present, are in additional operations, e.g. operations for restrictions on allowable models that are not expressible in the underlying logic. These remarks do not apply to work in which modularization is tightly coupled to term-generated or initial models by restricting to them in advance, such as in the work presented in [EM85]. If the origin of names is not taken into account in the treatment of name clashes, then the operations for combination, hiding and renaming turn out to be mathematically simple. Otherwise, some complexity seems unavoidable. [BHK86] gives equational axioms of models of modular specifications, called module algebras, which provide operations for combination, hiding and renaming.

Semantically, parameterized specifications are usually viewed as functions on theory presentations, theories or model classes. Syntactically, a version of typed lambda calculus with parameter restriction, like the λπ-calculus introduced in [Fei89], is often used for parameterization of structured specifications and application of parameterized specifications. A lambda calculus based approach to parameterization is also pursued in the theoretical work presented in [ST88] and [Ren89]. Such an approach is also used to provide for a parameterization mechanism in various existing languages for structured specifications including ASL [Wir86] and COLD-K [FJKR87]. In [BG80] a different approach is used for Clear; parameterized specifications are viewed as morphisms in a category (of 'based theories'). However, the definition of application in [San84] (giving a set-theoretic semantics for Clear) seems close to a construction that simulates a variant of β-conversion.

A meta-environment for generating programming environments

P. Klint

Department of Software Technology, Centre for Mathematics and Computer Science
P.O. Box 4079, 1009 AB Amsterdam, The Netherlands

and

Programming Research Group, University of Amsterdam
P.O. Box 41882, 1009 DB Amsterdam, The Netherlands

Over the last decade, considerable progress has been made in solving the problem of automatic generation of programming environments, given a formal definition of some programming (or specification) language. In most cases, research has focussed on the functionality and efficiency of the generated environments but only marginal attention has been paid to the development process of formal language definitions itself. Assuming that the overall quality of generated environments will be satisfactory within a few years, the development costs of formal language definitions will very soon become the crucial factor determining the ultimate success and acceptance of environment generators. In this paper we describe the design and implementation of a meta-environment (a development environment for formal language definitions) based on the formalism ASF+SDF. This meta-environment is currently being implemented as part of the Centaur system and is obtained by applying environment generation techniques to the language definition formalism itself. Some of the issues addressed are the interactive editing of modular language definitions with immediate processing of edit operations, treatment of formalisms with user-defined syntax, and new modular program generation techniques.

1989 CR Categories: D.2.1 [Software Engineering]: Requirements/Specifications--Languages; D.2.6 [Software Engineering]: Programming Environments; D.3.1 [Programming Languages]: Formal Definitions and Theory--Syntax, Semantics; D.3.4 [Programming Languages]: Processors.

1985 Mathematics Subject Classification: 68N15 [Software]: Programming Languages; 68N20 [Software]: Compilers and generators.

Key Words & Phrases: concrete and abstract syntax, user-definable syntax, programming language semantics, algebraic specification, language definition formalism, programming environment generation, meta-environment, modular implementation techniques.

Note: Partial support received from the European Communities under ESPRIT project 348 (Generation of Interactive Programming Environments---GIPE).

1. INTRODUCTION

Over the last decade, several research projects have focussed on the automatic generation of programming environments given a formal specification of a desired language (for instance, Mentor [DHKL84], PSG [BS86], Synthesizer Generator [RT89], Gandalf [HN86], GIPE [HKKL85], Genesis [GENESIS87], and Graspin [ES88]). A programming environment is here understood as a coherent set of interactive tools such as syntax-directed editors, debuggers, interpreters, code generators, and pretty printers to be used during the construction of texts in the desired language. This paradigm has been applied to generate environments for languages in different areas such as programming, formal specification, text formatting, and proof construction. All these projects are based on the assumption that major parts of the generated environment are language independent and that all language-dependent parts can be derived from a given formal specification. Various problems have been studied:

• integration of text-oriented editing and syntax-oriented editing;
• automatic generation of incremental tools from non-incremental specifications;
• integrated language definition formalisms versus non-integrated subformalisms;
• generation of interpreters;
• fixed versus user-definable user-interfaces;
• fixed versus user-definable logic in language definition formalisms;
• descriptive power of the language definition formalism (polymorphic type systems, concurrency, etc.).

One can observe that systems with fixed, built-in solutions for most of these problems are very easy to use in their envisaged application area, but that it is difficult, or even impossible, to use them in new areas. Therefore, one should strive for as much generality or flexibility as possible. It may, of course, turn out that very general systems are difficult to use in every application area.

The Centaur system [BCDIKLP88] is an outcome of the GIPE project and can be characterized as a set of generic components to be used for building environment generators. The generic components support, among others, operations for:

• manipulating abstract syntax trees (create, change, search, store, retrieve, annotate with error messages, record changes);

• creating graphical objects and user-interfaces.

The kernel thus provides a number of useful datatypes but does not make any assumptions about, for instance, the logic underlying the language definition formalism. This generality is achieved by permitting a simple interface between logical engines and the kernel. Note that these logical engines are not generated from specifications but that they are implemented separately.

The kernel has already been extended with compilers for various language definition subformalisms such as TYPOL [Des84, Kah87], SDF [HK89a], METAL [KLMM83], and interactive tools (a tool for controlling the execution of TYPOL specifications; GSE, a generic syntax-directed editor). The system thus rather resembles an extendible toolkit than a closed system.

The current Centaur system already gives some support for the interactive development of language definitions (e.g., the interactive editing and debugging of TYPOL specifications), but major efforts are still needed to obtain a true interactive development environment for language definitions.

In this paper, we describe our efforts towards an integrated language definition formalism (ASF+SDF) and we will show how an interactive development environment for ASF+SDF specifications can be constructed on top of the current Centaur system. This leads to a meta-environment in which language definitions can be edited, checked and compiled in a similar manner as programs can be manipulated in a generated environment (i.e., an environment obtained by compiling a language definition). The main topics to be discussed are:

• interactive editing of modular specifications with immediate processing of the edit operations (this requires in our case, for instance, incremental typechecking, incremental scanner and parser generation, and incremental compilation of algebraic specifications);

• treatment of formalisms with variable (i.e., not fixed) syntax.

The plan of the paper is as follows. In Section 2, we give an overview of the features of the formalism ASF+SDF that have influenced the design of the meta-environment. In Section 3, the global organization of the ASF+SDF meta-environment is presented. Section 4 addresses the issue of defining the logical syntax of modules and Section 5 gives a look inside the generic syntax-directed editor that forms the essential building block in our proposal. After these preparations, the actual construction of the ASF+SDF meta-environment is described in Section 6. Implementation techniques needed for the system are described in Section 7 and a discussion in Section 8 concludes the paper.

     1. module Booleans
     2. exports
     3.   sorts BOOL
     4.   lexical syntax
     5.     [ \t\n] -> LAYOUT
     6.   context-free syntax
     7.     true -> BOOL
     8.     false -> BOOL
     9.     BOOL "^" BOOL -> BOOL {left}
    10. equations
    11.   [1] true ^ true = true
    12.   [2] true ^ false = false
    13.   [3] false ^ true = false
    14.   [4] false ^ false = false

    15. module Naturals
    16. exports
    17.   sorts NAT
    18.   context-free syntax
    19.     0 -> NAT
    20.     succ NAT -> NAT
    21.     NAT "<" NAT -> BOOL
    22. imports Booleans
    23. variables
    24.   N -> NAT
    25.   M -> NAT
    26. equations
    27.   [1] 0 < 0 = false
    28.   [2] succ N < 0 = false
    29.   [3] 0 < succ N = true
    30.   [4] succ N < succ M = N < M

Figure 1. An ASF+SDF specification of Booleans and Naturals

2. ASF+SDF

The global design of the meta-environment for ASF+SDF to be discussed in the next section can, to a large extent, be used for a variety of specification formalisms. However, apart from certain assumptions about specifications and modules in specifications (e.g., imports, parameterization, renaming, form of conditional equations), there is one specific feature that has largely determined our design: modules can not only introduce new functions and define their semantics but they can introduce new syntactic notations for these functions as well. The implications of this feature are far reaching, since one has to accommodate the (syntax-directed) editing of specifications with a variable syntax.

Although a detailed understanding of the formalism ASF+SDF is not necessary for understanding the remainder of this paper, a brief sketch of the formalism may help the reader to see the benefits (and associated implementation problems) of user-defined syntax. ASF+SDF is the result of the marriage of the two formalisms ASF (Algebraic Specification Formalism) and SDF (Syntax Definition Formalism). ASF [BHK89] is based on the notion of a module consisting of a signature (for defining the type and the arity of functions) and conditional equations (for defining their semantics). Modules can be imported in other modules and they can be parameterized. SDF [HK89a, HHKR89] allows the simultaneous definition of concrete (i.e., lexical and context-free) and abstract syntax and implicitly defines a translation from string (via the parse tree associated with that string) to abstract syntax tree. The main idea of ASF+SDF [HHKR89, HK89b, Hen89, vdM88] is to identify the terms defined by the signature in an ASF specification and the abstract syntax trees defined by an SDF specification, thus obtaining a standard mapping between strings and terms. This creates the possibility to associate semantics with (the tree representation of) strings and to introduce user-defined notation in specifications.

     1. module Identifiers
     2. exports
     3.   sorts ID, ID-LIST
     4.   lexical syntax
     5.     [a-z] [a-z0-9]* -> ID
     6.     [ \t\n] -> LAYOUT
     7.   context-free syntax
     8.     "{" ID* "}" -> ID-LIST
     9.     ID "∈" ID-LIST -> BOOL
    10. imports Booleans -- as defined in Figure 1
    11. variables
    12.   Id [']* -> ID
    13.   Ids -> ID*
    14. equations
    15.   [1] Id ∈ {} = false
    16.   [2] Id ∈ {Id Ids} = true
    17.   [3] Id ≠ Id'
              ---------------------------
              Id ∈ {Id' Ids} = Id ∈ {Ids}

    18. module L-syntax
    19. exports
    20.   sorts L-PROGRAM
    21.   context-free syntax
    22.     def ID-LIST in ID-LIST -> L-PROGRAM
    23. imports Identifiers

    24. module L-tc
    25. exports
    26.   context-free syntax
    27.     tc "[" L-PROGRAM "]" -> BOOL
    28. imports L-syntax
    29. variables
    30.   Id -> ID
    31.   Ids -> ID*
    32.   Defs -> ID*
    33. equations
    34.   [1] tc[ def {Defs} in {} ] = true
    35.   [2] Id ∈ {Defs} = true, tc[ def {Defs} in {Ids} ] = true
              --------------------------------------------------
              tc[ def {Defs} in {Id Ids} ] = true
    36.   [3] Id ∈ {Defs} = false
              ------------------------------------
              tc[ def {Defs} in {Id Ids} ] = false

Figure 2. A simple language and its typechecker

Two (trivial) examples may help to clarify this general description. The example in Figure 1 shows a definition of two modules. The module Booleans defines the sort BOOL, the constants true and false, and the left-associative operator ^. The equations define ^ as the ordinary and operator on Boolean values. The module Naturals defines the sort NAT, the constant 0, the successor function succ, and the infix operator <. The equations define < as the ordinary less than operator on natural numbers. This example shows how new syntax rules are introduced in a module (appearing under the heading context-free syntax) and how they can be used in the equations. The result is that, for instance, the equation in line 11 can only be parsed given the syntax definition in line 9. Since arbitrary context-free grammars can be defined in this way, we cannot give a fixed grammar for each module. Instead, all syntax rules defined in a module (together with all syntax rules defined in imported modules) contribute to the grammar of that particular module (also see Section 4).
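The way the equations of module Naturals in Figure 1 act as rewrite rules can be illustrated with a small sketch. It is written in Python rather than ASF+SDF, and the term encoding (the string "0" and nested ("succ", t) tuples) and the function names are assumptions made only for this illustration; they are not part of the formalism or of its implementation.

```python
# Hypothetical encoding of terms of sort NAT: "0" or ("succ", t).
ZERO = "0"


def succ(n):
    """Build the term 'succ n'."""
    return ("succ", n)


def less(n, m):
    """Reduce the term 'n < m' with equations [1]-[4] of module Naturals."""
    while True:
        if m == ZERO:
            # [1] 0 < 0 = false   and   [2] succ N < 0 = false
            return False
        if n == ZERO:
            # [3] 0 < succ N = true
            return True
        # [4] succ N < succ M = N < M : strip one succ from both sides
        n, m = n[1], m[1]


one = succ(ZERO)
two = succ(one)
print(less(ZERO, one), less(two, one))
```

Each branch corresponds to one equation read from left to right; the loop stops as soon as a constant of sort BOOL is reached.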

Being interested in formal language definitions, we give as a second example a trivial typechecking problem. Consider the language L of programs of the form

def { a list of identifiers } in { a list of identifiers }

satisfying the constraint that each identifier appearing in the second list appears in the first list as well. A definition of L is given in Figure 2 and consists of three modules. The module Identifiers defines the sorts ID (identifiers) and ID-LIST (lists of identifiers) together with a membership function ∈. The sort L-PROGRAM introduced in module L-syntax defines all syntactically correct programs in language L. In module L-tc, we define the typechecking function tc[ ] on L-programs that checks the constraint mentioned above.
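The constraint checked by tc can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the regular expressions stand in for the SDF syntax of Figure 2, and the names parse, member and tc are chosen for this illustration. One deliberate simplification is noted in the code: where the conditional equations leave a term unreduced, the sketch returns False.

```python
import re

# Stand-ins for the syntax of modules Identifiers and L-syntax (Figure 2).
PROGRAM = re.compile(r"\s*def\s*\{([^{}]*)\}\s*in\s*\{([^{}]*)\}\s*$")
IDENT = re.compile(r"[a-z][a-z0-9]*$")


def parse(text):
    """Turn 'def {...} in {...}' into a pair of identifier lists."""
    m = PROGRAM.match(text)
    if m is None:
        raise ValueError("not an L-program")
    defs, uses = (part.split() for part in m.groups())
    for ident in defs + uses:
        if not IDENT.match(ident):
            raise ValueError(f"bad identifier: {ident!r}")
    return defs, uses


def member(ident, ids):
    """Equations [1]-[3] of module Identifiers."""
    if not ids:
        return False                  # [1] Id in {} = false
    if ident == ids[0]:
        return True                   # [2] Id in {Id Ids} = true
    return member(ident, ids[1:])     # [3] applies when the heads differ


def tc(program):
    """Equations [1]-[3] of module L-tc."""
    defs, uses = program
    if not uses:
        return True                   # [1]
    if member(uses[0], defs):
        # [2] yields true only if the rest also checks; otherwise the
        # equations leave the term unreduced, which is reported as False here.
        return tc((defs, uses[1:]))
    return False                      # [3]


print(tc(parse("def {a b} in {b a}")), tc(parse("def {a} in {b}")))
```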

The points to be emphasized in these examples are:

• a formal language definition consists of a sequence of modules;
• a module may import other modules from the language definition;
• each module may define syntax rules as well as semantic rules;
• the notation used in the semantic rules depends on the definition of the syntax rules.

3. GLOBAL ORGANIZATION OF A META-ENVIRONMENT FOR ASF+SDF

3.1. General architecture

Figure 3 shows the overall organization of the system. First of all, we make a distinction between the meta-environment (i.e., the interactive development environment for constructing language definitions and for generating and testing particular programming environments) and a generated environment (i.e., an environment for constructing programs in some programming language L, obtained by compiling a language definition for L in the meta-environment). In the meta-environment one can distinguish

• a language definition (in ASF+SDF) consisting of a sequence of modules;
• the environment generator itself (internally consisting of three subcomponents to be discussed below).

The output of the environment generator is used as parameter to a generic building block that can be used to construct environments. One language definition can thus result in more than one generated environment. The basic building block is called Generic Syntax-directed Editor (GSE) and is an abstraction of a syntax-directed editor, combined with computed operations on the edited program such as typechecking and evaluation. The main inputs to the Generic Syntax-directed Editor are:

• a program text P;
• the name of the module that defines the syntax of P.

Later on, we will see how external processors (e.g., typecheckers, evaluators) can be attached to GSE, but now we will first motivate this architecture and then discuss some details of the environment generator itself. A detailed discussion of GSE is postponed to Section 5.

Our point of departure is a specification formalism (ASF) in which the operations for module composition (import, export, renaming, parameter binding) are defined in terms of textual expansion of the specification. Each composition operation can thus be removed from the specification and, ultimately, we can associate with each module a new module that does not contain any module composition operations (its so-called normal form). As previous research has shown [Hen88], this conceptually simple method is inadequate as a basis for the implementation of specifications since the actual copying of modules is not only expensive (both in compilation time and in size of the generated code) but also difficult to extend to separate compilation of modules.

Figure 3. Global organization [diagram not reproduced; recoverable labels: Language Definition, Environment Generator, Meta-environment, Generated Environment, Program (text)]

We propose the following, alternative, implementation model. Each module in the specification contains a number of "rules" (e.g., declarations, grammar rules, conditional equations). Instead of constructing the normal form for each module in the specification, we collect all rules from all modules in a single, unstructured, global set of rules. All information concerning the modular structure of the specification is thus lost. We only require a mechanism to enable or disable rules from the global set. Instead of constructing the normal form for each module, one only has to calculate which rules in the global set have to be enabled to obtain the same effect as the desired normal form. After selecting certain rules from the global set, these can be used immediately (e.g., for parsing input sentences according to the selected set of grammar rules, or for rewriting an input term according to the selected set of conditional equations). The success of this implementation model is determined by the efficiency of the following operations:

• calculation of the set of rules corresponding to a normal form;
• enabling/disabling rules in the global set;
• selecting parts of the implementation of the rules in the global set for a given set of enabled/disabled rules;
• modifying the global set of rules (and the corresponding implementation) to reflect editing operations on the specification.

The viability of this implementation model is further discussed in Section 7.

    module M1 begin
      a, b
    end M1

    module M2 begin
      imports M1
      c
    end M2

    module M3 begin
      imports M1,
              M2 renamed by R1
      d
    end M3

    module M4 begin
      imports M1,
              M2,
              M3 renamed by R2
    end M4

Figure 4. A modular specification

    module M1 begin
      a, b
    end M1

    module M2 begin
      a, b, c
    end M2

    module M3 begin
      a, b, a^R1, b^R1, c^R1, d
    end M3

    module M4 begin
      a, b, c, a^R2, b^R2, a^R1R2, b^R1R2, c^R1R2, d^R2
    end M4

Figure 5. Normal forms of the modules in Figure 4

         a  a^R1  a^R2  a^R1R2  b  b^R1  b^R2  b^R1R2  c  c^R1  c^R1R2  d  d^R2
    M1   x   o     o     o      x   o     o     o      o   o     o      o   o
    M2   x   o     o     o      x   o     o     o      x   o     o      o   o
    M3   x   x     o     o      x   x     o     o      o   x     o      x   o
    M4   x   o     x     x      x   o     x     x      x   o     x      o   x

Figure 6. Global set of rules and selections corresponding to example in Figure 4 (x = enabled, o = disabled)

Figure 7. [diagram not reproduced; recoverable labels: Lexical rules, Context-free rules, Rewrite rules]

Consider, in Figure 4, a sequence of named modules which may contain names of other modules to be imported as well as a number of unspecified "rules" which we denote by lower case letters. An imported module may optionally be renamed before it is imported. The corresponding normal forms are shown in Figure 5 and the corresponding global set of rules in Figure 6. The global set of rules contains the original rules as they appear in the specification together with renamed versions of the rules as needed for the normalization of all the modules in the specification. As an optimization, one could remove from the global set those renamed rules that are identical to the original rule, i.e. when the original rule is not affected by the renaming.
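The relation between Figures 4, 5 and 6 can be reproduced with a short sketch. It is written in Python, and the data layout (a dictionary SPEC, rules as pairs of a name and a tuple of renamings) is an assumption made for this illustration, not the representation used in the actual system. It computes the normal form of each module, the global set of rules, and the enabled/disabled selection per module.

```python
# Figure 4 as data: for each module its own rules and its imports,
# where an import is (module name, renaming or None).
SPEC = {
    "M1": (["a", "b"], []),
    "M2": (["c"], [("M1", None)]),
    "M3": (["d"], [("M1", None), ("M2", "R1")]),
    "M4": ([], [("M1", None), ("M2", None), ("M3", "R2")]),
}


def normal_form(module):
    """Rules of the normal form, each as (rule name, applied renamings)."""
    own, imports = SPEC[module]
    rules = {(rule, ()) for rule in own}
    for imported, renaming in imports:
        for rule, renamings in normal_form(imported):
            if renaming is not None:
                renamings = renamings + (renaming,)
            rules.add((rule, renamings))
    return rules


# The global set: every rule needed by the normal form of some module.
GLOBAL_SET = sorted(set().union(*(normal_form(m) for m in SPEC)))


def selection(module):
    """One flag per rule in the global set: True = enabled for this module."""
    enabled = normal_form(module)
    return [rule in enabled for rule in GLOBAL_SET]


def show(rule):
    """Render a rule as in Figure 5, e.g. a^R1R2."""
    name, renamings = rule
    return name + ("^" + "".join(renamings) if renamings else "")


for module in SPEC:
    row = "".join("x" if flag else "o" for flag in selection(module))
    print(module, row)
```

Selecting a module then amounts to looking up one precomputed row of flags, which is the point of the implementation model: no module text is copied when a normal form is needed.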

Returning to the global architecture as shown in Figure 3, one can distinguish three components in the environment generator that maintain information at a global level:

1. The Module Manager (MM) administrates the overall modular structure of the specification. This amounts to incrementally maintaining the import relations between modules and keeping track of definition and use of individual rules.

2. The Syntax Manager (SM) administrates the (lexical and context-free) functions, as well as the declarations of priorities and variables defined in each module. The Syntax Manager also creates and updates the scanners and parsers derived from all modules.

3. The Equation Manager (EQM) administrates the equations defined in each module together with their compiled form.

The general principle is that the Module Manager manages all modular information and that the Syntax Manager and the Equation Manager have only access to those parts of this information that they need to carry out their respective tasks.

Applying this organization to the example given earlier in Figure 4, we obtain the situation as depicted in Figure 7. The Module Manager passes all information related to syntactic issues to the Syntax Manager, which, in its turn, maintains two global sets of rules: lexical rules and context-free rules. All information related to conditional equations is passed to the Equation Manager, which maintains one global set of rewrite rules.


3.2. Major components

Next we give a description of all operations provided by the Syntax Manager, the Equation Manager, and the Module Manager.

3.2.1. The Module Manager (MM)

The Module Manager provides operations for adding and deleting modules and parts of modules as well as for parsing and evaluating strings:

add, del:
    Add/delete a module to/from the specification; add/delete a sort declaration, lexical function definition, context-free function definition, priority declaration, import, variable declaration, or equation to/from a module.

select:
    Select a module as current module.

parse:
    Parse a string in the context of the current module; the result is a term.

eval:
    Evaluate a term in the context of the current module.

Many of these operations depend on the corresponding operations defined in, respectively, the Syntax Manager and the Equation Manager (see below).

3.2.2. The Syntax Manager (SM)

The Syntax Manager provides operations for adding and deleting parts of the SDF-part of a specification, for selecting a module, and for parsing strings:

add, del:
    Add/delete lexical function definition, context-free function definition, variable declarations, priority declarations, renamings, imports, or equations to/from a given module.

select:
    Select a module as current module. All SDF functions (and their renamed versions) belonging to the normal form of the selected module are used to determine the grammar defined by the current module and to select those parts of the generated scanner and parser accepting that grammar.

parse:
    Parse a string according to the currently selected module.

3.2.3. The Equation Manager (EQM)

The Equation Manager provides operations for adding and deleting equations to/from a module, for selecting a module, and for evaluating terms:

add, del:
    Add/delete signature information or an equation to/from a given module.

select:
    Select a module as current module. All equations (and their renamings) belonging to the normal form of the selected module are used to determine the term rewriting system defined by the current module and to select those parts of the compiled rewrite system corresponding to the selected rules.

eval:
    Rewrite a term according to the currently selected module.

4. THE REPRESENTATION OF LOGICAL SYNTAX

When constructing the meta-environment based on ASF+SDF, we are confronted with the question of how the syntax of the logical part of specifications (in our case the equations section) is represented. Defining the logical syntax in the form of an ordinary module is not only elegant but it is efficient in terms of implementation effort as well. The logical syntax should be explicit and localized in a single module (as opposed to, for instance, distributed in the implementation of the Module Manager). In this way, it will be easy to change the logical syntax. There are two possible approaches:

• Use a general but too liberal grammar to describe the form of equations. In its simplest form, this grammar would consist of a single rule

    <equation> ::= <term> "=" <term>

  where <term> describes all well-formed terms. Unfortunately, this rule permits equations in which the sorts of both terms are unequal. Therefore, a separate type checking phase is necessary to detect these cases.

• Reject type-incorrect equations already during parsing by using specific syntax rules for equations for all sorts S1, ..., Sn declared in the specification. This grammar has the form:

    <equation> ::= <S1> "=" <S1> | ... | <Sn> "=" <Sn>

We will now consider the second alternative in more detail.

4.1. Typechecking equations by means of a specialized equation grammar

Consider an ASF+SDF specification consisting of the modules M1, ..., Mn (see Figure 8). This specification is extended in the following way to define the logical syntax. First, the modules Equations and Equation are added. The former introduces a sort for an individual equation and for a complete equations section. The latter is a parameterized module defining the syntax of a single equation for an arbitrary sort (we only discuss a simplified version of the definition of unconditional equations; conditional equations can be defined following a similar pattern). The definitions are:

    module Equations
    exports
      sorts EQ, EQ-SECTION
      context-free syntax
        EQ* -> EQ-SECTION

    module Equation
    parameters
      Sort
        sorts SORT
    exports
      context-free syntax
        SORT "=" SORT -> EQ
    imports Equations

Next, we define for each module Mi in the specification a module EQ-Mi that instantiates Equation for each sort S1, ..., Sk declared in Mi, and imports the "equation-version" of each module N1, ..., Nm imported by Mi:

    module EQ-Mi
    imports Equation { Sort bound by [SORT -> S1] to Mi }
            ...
            Equation { Sort bound by [SORT -> Sk] to Mi }
            EQ-N1
            ...
            EQ-Nm

Parsing an equation in module Mi can now be done in the context of EQ-Mi.
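Since the EQ-Mi modules follow a fixed pattern, they can be produced mechanically from the sorts and imports of each Mi. The following Python sketch shows one way to generate the module text; the function name eq_module, its parameters and the exact layout of the output are assumptions for illustration only and say nothing about how the meta-environment itself performs this step.

```python
def eq_module(name, sorts, imports):
    """Return the text of EQ-<name> for a module declaring `sorts`
    and importing the modules listed in `imports`."""
    entries = [
        f"Equation {{ Sort bound by [SORT -> {sort}] to {name} }}"
        for sort in sorts
    ]
    entries += [f"EQ-{imported}" for imported in imports]
    lines = [f"module EQ-{name}"]
    if entries:  # a module without sorts and imports needs no imports section
        lines.append("imports")
        lines += ["  " + entry for entry in entries]
    return "\n".join(lines)


print(eq_module("Naturals", ["NAT"], ["Booleans"]))
```

For the module Naturals of Figure 1 this yields one instantiation of Equation (for NAT) followed by an import of EQ-Booleans.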

4.2. Example of a specialized equation grammar Consider the specification of SooXeans and N a t u r a l s given earlier in Figure 1 (Section 2). Using the scheme described in the previous paragraph, this specification will be extended with the following modules (apart from the modules Equation and Equations given earlier):

Figure 8. Definition of logical syntax (the fixed modules Equations and Equation defining the logical syntax; the modules in the specification, defined by the user; and the automatically generated modules defining the contributions to the logical syntax for each module)

module EQ-Booleans
imports
  Equation { Sort bound by [SORT -> BOOL] to Booleans }

module EQ-Naturals
imports
  Equation { Sort bound by [SORT -> NAT] to Naturals }
  EQ-Booleans

An equation like

  0 < succ 0 = succ 0 < succ succ 0

that could legally appear inside module Naturals can now be parsed using EQ-Naturals.

5. LOOKING INSIDE THE GENERIC SYNTAX-DIRECTED EDITOR

The Generic Syntax-directed Editor (GSE) provides the following functionality:

• Syntax-directed editing of strings (programs) in a given language L.
• Informing the world outside the editor about changes made during editing.
• Execution of operations on the L-program in the editor as defined in the language definition (e.g. typechecking, evaluation, pretty printing).
• Display of the output of these operations.
• Adjustment of the internal state of the editor after a modification to the syntax of language L.

We will now briefly discuss each of them.

5.1. Syntax-directed editing

As experience shows, the paradigm of pure syntax-directed editing does not lead to very convenient editors. In many cases, a user wants to perform editing operations that are text-oriented rather than structure-oriented in nature. To overcome this problem, GSE aims at integrating text-oriented editing and structure-oriented editing as smoothly as possible. By syntax-directed navigation (or by just pointing) the user can position a focus on a part of the program being edited. The contents of the focus can be modified by conventional text-editing operations. When the user wants to move the focus to another part of the program, its text is parsed, and if syntax errors are found these should be corrected before the focus can be moved. See [DK89b, Log88] for a description of this editor and [DK89a] for a description of its implementation.

Figure 9. Generic Syntax-directed Editor (GSE) with its parameters (Mod, Sorts, MM, Commands, Change) and the operation syntax-changed

From the perspective of the meta-environment, the editing and parsing of programs can be implemented using the parse function of the Module Manager. GSE should therefore know the name of the current module (language) to be used.

5.2. Management of changes

During editing, changes are made to the program being edited. It depends on the environment in which the editor is being used whether additional processing is required after a change. Assuming that the editor is parameterized with a function change that communicates changes to the environment, there are two possibilities for choosing the granularity of the communication:

• The function change is called after each modification to the program.
• The function change is only called after modifications of subtrees of certain sort(s). The list of these sorts should be a parameter of the editor.

In the first case, change has to infer whether additional actions are needed, while in the second case this can be done by the editor in a generic way. The precise form of this change information has not yet been determined. Two possible realizations of it are: (a) pointers to the old as well as to the modified subtree; (b) path expressions describing modifications as proposed in [CH89].
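The two granularities can be illustrated with a small Python sketch. It is not the GSE implementation: the classes Node and Editor, the method modify, and the choice to pass the sort together with the old and new text of the reported subtree to change are all assumptions made for the example (the text above leaves the form of the change information open).

```python
class Node:
    """A subtree of the program; leaves carry a token, inner nodes carry children."""

    def __init__(self, sort, token=None, children=()):
        self.sort, self.token = sort, token
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def text(self):
        if self.token is not None:
            return self.token
        return " ".join(child.text() for child in self.children)


class Editor:
    def __init__(self, change, sorts=None):
        self.change = change  # hypothetical signature: change(sort, old_text, new_text)
        self.sorts = sorts    # None means: report every modification

    def _reported_node(self, node):
        """Nearest enclosing subtree (or the node itself) whose sort is in the list."""
        if self.sorts is None:
            return node
        while node is not None and node.sort not in self.sorts:
            node = node.parent
        return node

    def modify(self, leaf, new_token):
        target = self._reported_node(leaf)
        old = target.text() if target is not None else None
        leaf.token = new_token
        if target is not None:  # modifications outside the listed sorts stay silent
            self.change(target.sort, old, target.text())


log = []
lhs, rhs = Node("NAT", token="0"), Node("NAT", token="0")
equation = Node("EQ", children=[lhs, Node("EQUALS", token="="), rhs])
Editor(lambda *info: log.append(info), sorts={"EQ"}).modify(rhs, "succ 0")
print(log)
```

With the sort list {"EQ"}, the modification of the right-hand side is reported once, at the level of the complete equation; with no sort list, the modified leaf itself would be reported.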

5.3. Attaching external processors to the editor

The formal definition of a language may contain rules specifying certain operations on programs such as, for instance, typechecking and evaluation. After compilation of the specification this leads to a number of "processors" that can operate on programs. The question now arises how these external processors can be attached to a syntax-directed editor. The following points should be considered:

(a) Activation of the external processor.
(b) Communication of information inside the editor to the external processor.
(c) Communication of the output of the external processor to the editor.

Point (a) can be solved by giving a list of (command, function)-pairs as parameter to the editor. The commands (strings) are placed in the command menu of the editor, and selection of a certain entry in that menu will result in a call to the associated function. All functions have as single argument the editor from which they are being called. Note that automatic activation of external processors (as, for instance, needed for incremental typechecking) can be implemented by means of the change function discussed in the previous section.

Figure 10. A generated environment for evaluating terms. The editor is instantiated for module M, and its Commands parameter is ["execute": function(gse) { gse.MM.select(gse.Mod), gse.MM.eval(gse.Term) }].

Point (b) is solved by providing operations on the editor that return (parts of) the internal state of the editor, such as the current program, the current focus, etc.

Point (c) can only be solved when all external operations return their output in a fixed format. An obvious choice is a list of (error-message, subtrees)-pairs, to be interpreted as a list of all erroneous subtrees with their associated error-messages.

5.4. Syntax modifications

After a modification of the syntax of the input language L of the editor, it should be verified that the current program in the editor is still a valid L-program. A naive implementation will completely (re)parse the program. This facility is needed in order to support editing in the meta-environment (see Section 6).

Figure 11. A generated environment for editing, typechecking and evaluating L programs. The Commands parameter of the editor is ["check": function(gse) { gse.MM.select(L-TC), gse.MM.eval(tc(gse.Term)) }, "execute": function(gse) { gse.MM.select(L-EV), gse.MM.eval(eval(gse.Term)) }].

5.5. Major functions of GSE

The above discussion can be summarized in the following list of operations provided by GSE (also see Figure 9):

GSE:
  Construct a new instance of GSE given:
  • a Module Manager,
  • a module name (defining the input syntax, i.e., the syntax of the texts to be edited),
  • a list of sorts (defining the subtrees for which change should be called),
  • a change function,
  • a list of (command, function) pairs defining the communication with external processors.

Focus, Tree, MM, Mod:
  Return status information such as the current focus (Focus), the current program (Tree), the Module Manager used (MM), the module defining the input language (Mod), etc.

up, down, replace, search, ...:
  Perform editing operations.

syntax-changed:
  Signal a modification of the input syntax and adjust the internal state of the editor accordingly.

Typical examples of the use of GSE are shown in figures 10 and 11. In figure 10, the language definition consists of a single module M, and we construct an environment for editing and evaluating terms in M. The execute operation uses the Module Manager associated with this instance of the editor (gse.MM) and first selects the current input language of the editor (gse.Mod) as current module and then evaluates the current term in the editor (gse.Term) in the context of that module using the eval function provided by the Module Manager.

In figure 11, the language definition consists of three modules: L-SYN (defining the syntax of language L), L-TC (defining the typechecking of L programs; L-TC imports L-SYN), and L-EV (defining the evaluation of L programs; it also imports L-SYN). In this case, we construct an environment for editing, typechecking and evaluating L programs. The commands check and eval are implemented using the functions tc and eval defined in, respectively, L-TC and L-EV.
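The construction of figure 11 can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the actual GSE or Module Manager: the stub class ModuleManager, the encoding of a term such as tc(Term) as a pair ("tc", Term), the method run, and the toy language of sums standing in for L are all invented for the illustration. Only the names MM, Mod, Term, select, eval, tc, L-SYN, L-TC and L-EV are taken from the text.

```python
class ModuleManager:
    """Stub for the Module Manager; only select and eval are modelled."""

    def __init__(self, functions):
        # functions: {module name: {function name: callable}} (hypothetical encoding)
        self.functions = functions
        self.current = None

    def select(self, module):
        self.current = module

    def eval(self, term):
        # A term f(arg) is encoded as the pair (f, arg) and is evaluated
        # in the context of the currently selected module.
        name, arg = term
        return self.functions[self.current][name](arg)


class GSE:
    """Stub editor holding the five constructor parameters of Section 5.5."""

    def __init__(self, mm, mod, sorts, change, commands):
        self.MM, self.Mod, self.Sorts = mm, mod, sorts
        self.change = change
        self.commands = dict(commands)  # the (command, function) pairs
        self.Term = None                # the current term in the editor

    def run(self, command):
        # Every function receives the calling editor as its single argument.
        return self.commands[command](self)


def check(gse):
    gse.MM.select("L-TC")
    return gse.MM.eval(("tc", gse.Term))


def execute(gse):
    gse.MM.select("L-EV")
    return gse.MM.eval(("eval", gse.Term))


# Toy stand-in for language L: sums of natural numbers written as text.
mm = ModuleManager({
    "L-TC": {"tc": lambda t: all(p.strip().isdigit() for p in t.split("+"))},
    "L-EV": {"eval": lambda t: sum(int(p) for p in t.split("+"))},
})
gse = GSE(mm, "L-SYN", sorts=[], change=lambda *args: None,
          commands=[("check", check), ("execute", execute)])
gse.Term = "1 + 2"
print(gse.run("check"), gse.run("execute"))
```

Each command first selects the module in which its function is defined and then evaluates the corresponding term, which is the pattern shown for both commands in figure 11.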

6. EDITING IN THE META-ENVIRONMENT

How can we use generated editing environments to edit ASF+SDF specifications? To answer this question we have to define the complete syntax of ASF+SDF specifications. This can be done in the following way:

• To each specification we add, implicitly, a fixed module called SDF, which defines the syntax of the SDF part of each module.

• To each specification we add the modules Equations and Equation defining the logical syntax as described in Section 4.

• To each module M we add a module EQ-M, defining the contributions of module M to the logical syntax.

Editing a module in the specification now amounts to creating two editors: one for the SDF part of the module (GSE1) and one for the equations part (GSE2). This is shown in figure 12. Some comments on this figure are appropriate:

• The granularity of the processing of changes to the SDF part is determined by a list of sorts given to GSE1. This list contains a sort name for each entity for which the Syntax Manager provides add/delete operations (e.g. lexical function definitions, priorities, etc.).

• The change function associated with GSE1 will use the Syntax Manager for actually performing the changes to the SDF part of a module. It will also call GSE2.syntax-changed after each modification to the SDF part of the module.

• The list of sorts given to GSE2 only contains the sort EQ, i.e. only changes at the level of complete equations are considered as change. This corresponds precisely to the add/delete operations provided by the Equation Manager.

• The change function associated with GSE2 will use the Equation Manager for actually performing the changes to the equations part of the module.

• We have left unspecified which operations are performed on the SDF part and on the equations part of the module, respectively. Typical examples are typechecking and compiling.
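The wiring of the two editors described in these comments can be sketched as follows. The Python code is an illustration only: the stub classes Manager and GSE, the method edited, the helper make_module_editors and the sort name CF-FUNCTION are assumptions; the text only fixes that GSE1 uses the Syntax Manager and calls syntax-changed on GSE2, and that GSE2 reports changes of sort EQ to the Equation Manager.

```python
class Manager:
    """Stub for the Syntax Manager or the Equation Manager: add/delete only."""

    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def delete(self, item):
        self.items.discard(item)


class GSE:
    """Stub editor: only the change granularity and syntax-changed are modelled."""

    def __init__(self, sorts, change):
        self.sorts = set(sorts)
        self.change = change
        self.syntax_changes = 0

    def edited(self, sort, old, new):
        # Report a modification only if it concerns a subtree of a listed sort.
        if sort in self.sorts:
            self.change(old, new)

    def syntax_changed(self):
        # A naive implementation would reparse the current text here.
        self.syntax_changes += 1


def make_module_editors(syntax_manager, equation_manager, sdf_sorts):
    """Create (GSE1, GSE2) for the SDF part and the equations part of one module."""

    def update(manager, old, new):
        if old is not None:
            manager.delete(old)
        if new is not None:
            manager.add(new)

    # GSE2: only complete equations (sort EQ) count as changes.
    gse2 = GSE({"EQ"}, lambda old, new: update(equation_manager, old, new))

    def change_sdf(old, new):
        update(syntax_manager, old, new)
        gse2.syntax_changed()  # the syntax of the equations may have changed

    gse1 = GSE(sdf_sorts, change_sdf)
    return gse1, gse2


syntax_manager, equation_manager = Manager(), Manager()
gse1, gse2 = make_module_editors(syntax_manager, equation_manager, {"CF-FUNCTION"})
gse1.edited("CF-FUNCTION", None, 'NAT "+" NAT -> NAT')  # add a syntax rule
gse2.edited("EQ", None, "0 + 0 = 0")                     # add an equation
print(syntax_manager.items, equation_manager.items, gse2.syntax_changes)
```

Passing None as the old or the new subtree is used here to express a pure addition or deletion; this convention is likewise an assumption of the sketch.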

Figure 12. Editing a specification module with the Generic Syntax-directed Editor (GSE1 edits the SDF part of module Mi using the fixed module SDF; GSE2 edits the equations part using the generated module EQ-Mi with sort list [EQ])

7. IMPLEMENTATION TECHNIQUES

In Section 3.1 we postulated the existence of an implementation model for modular specifications in which all "rules" appearing in modules are collected in one global set together with a mechanism to enable or disable individual rules from this set. Although a general framework for describing this method is still lacking, two experiments have been performed that demonstrate the feasibility of this approach.

One experiment [Kli89] concerns the case that the rules in each module are regular expressions to be compiled into a deterministic finite automaton. The key idea is to construct a single automaton for all regular expressions in all modules. The selection operation that enables or disables certain regular expressions is implemented by enabling or disabling the corresponding transitions in the automaton. The resulting Modular Scanner Generator uses techniques for lazy and incremental program generation [HKR87a, HKR87b]: parts of the finite automaton are only constructed when they are needed and all parts not affected by the addition or deletion of a regular expression will be reused. In the same spirit, the enabling or disabling of transitions is only done when needed.

In the other experiment [Rek89] modular context-free grammars are studied and the "rules" to be considered are syntax rules. The key idea is, again, to construct a single parse table for all syntax rules in all modules and to implement the enabling or disabling of a syntax rule by enabling or disabling the corresponding transitions in the parse table. The resulting Modular Parser Generator also uses lazy and incremental techniques and extends the notion of incremental parser generation described in [HKR89].

Measurements show that in both cases the selection operation is very fast and that the overall performance of the generated programs (i.e., scanner and parser) is hardly influenced by the introduction of an enabling/disabling mechanism for individual rules. We may therefore conclude that an efficient implementation exists for our implementation model in at least these two specific cases.
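The implementation model itself (one global set of rules plus an enabling mechanism) can be illustrated independently of automata and parse tables. The Python sketch below is an assumption-laden simplification: the class RuleSet, its methods, the rule strings, and in particular the convention that selecting a module enables the rules of that module and of all modules it imports transitively are chosen for the example and are not taken from the Modular Scanner Generator or the Modular Parser Generator.

```python
class RuleSet:
    """All rules of all modules in one global set; selection enables a subset."""

    def __init__(self):
        self.owner = {}       # rule -> module in which it appears
        self.imports = {}     # module -> list of directly imported modules
        self.enabled = set()  # rules currently enabled

    def add_module(self, name, rules, imports=()):
        self.imports[name] = list(imports)
        for rule in rules:
            self.owner[rule] = name

    def visible_modules(self, name):
        """The module itself plus everything it imports, transitively."""
        seen, todo = set(), [name]
        while todo:
            module = todo.pop()
            if module not in seen:
                seen.add(module)
                todo.extend(self.imports[module])
        return seen

    def select(self, name):
        # Assumed meaning of selection: enable exactly the rules visible from `name`.
        visible = self.visible_modules(name)
        self.enabled = {rule for rule, module in self.owner.items() if module in visible}


rules = RuleSet()
# Hypothetical rules; the actual contents of Booleans and Naturals are not repeated here.
rules.add_module("Booleans", ["true -> BOOL", "false -> BOOL"])
rules.add_module("Naturals", ["0 -> NAT", "succ NAT -> NAT"], imports=["Booleans"])
rules.select("Booleans")
print(sorted(rules.enabled))
rules.select("Naturals")
print(sorted(rules.enabled))
```

In the experiments described above the flags are attached to transitions of an automaton or parse table rather than to the rules themselves, and they are updated lazily; the sketch recomputes the enabled set eagerly for simplicity.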

In this context, no experience exists yet with the compilation of algebraic specifications to rewrite rules. One of the directions to investigate is to use finite automata for the matching of left-hand sides of rules and to apply similar techniques as in the case of modular scanner generation.

8. CONCLUDING REMARKS

8.1. Current state of implementation

An implementation of the meta-environment for generating programming environments as presented in this paper is currently in progress. The Modular Scanner Generator and Modular Parser Generator discussed in the previous section are operational and are currently being incorporated in a first version of the Syntax Manager. A first, very inefficient, prototype of the Equation Manager exists, and work on an efficient implementation has just started. Implementation of the Module Manager is in progress. A version of the Generic Syntax-directed Editor that supports syntax-directed editing as described in Section 5.1 is operational. However, the operations needed for supporting the meta-environment still have to be implemented.

Some major implementation problems still have to be solved such as the best determination of changes (Section 5.2), the incremental maintenance of errors detected in specifications, and techniques for the implementation of the Equation Manager.

8.2. Discussion

Although an evaluation of the proposed meta-environment has to await completion of its implementation as well as experience with its use, some remarks on the design are in order. One can foresee the following problems and open questions:

• The system as proposed here will be faced with a serious version management problem: after changing a language definition there may still be programs in use that conform to the old definition.

• It is not yet clear whether the proposed implementation model will scale up to industrial size applications.

• Not much experience exists with the use of specification formalisms with user-defined syntax. In principle, freedom of notation seems to be a desirable property, but it may very well turn out that this freedom has to be controlled in some way for the sake of readability and reusability of specifications.

We see the following merits in the proposed system:

• The generality of the syntax definition mechanism provided by ASF+SDF together with the new, but well-understood, techniques used for its implementation form an improvement over the syntax definition facilities in comparable systems [FGJM85, Voi86].

• The use of two instances of GSE for editing language definitions in the meta-environment is an interesting case of reusing existing components. As a result, both the meta-environment and generated environments will benefit from future improvements of GSE.

• The similarity between meta-environment and generated environments makes it possible that desirable features in the meta-environment can also be considered as new features in generated environments (and vice versa). This may lead to interesting generalizations.

ACKNOWLEDGEMENTS

The ideas presented here have emerged from numerous discussions with the following persons: Hans van Dijk, Jan Heering, Paul Hendriks, Wilco Koorn, Emma van der Meulen, Jan Rekers, and Pum Walters. Paul Hendriks and Emma van der Meulen commented on a draft of this paper.

LITERATURE

[BCDIKLP88] P. Borras, D. Clément, Th. Despeyroux, J. Incerpi, G. Kahn, B. Lang & V. Pascual, "Centaur: the system", Proceedings of the ACM SIGSOFT/SIGPLAN Conference on Practical Software Development Environments, 1988, pp. 14-24.

[BHK89] J.A. Bergstra, J. Heering & P. Klint (eds.), Algebraic Specification, ACM Press in co-operation with Addison-Wesley, 1989.

[BS86] R. Bahlke & G. Snelting, "The PSG system: from formal language definitions to interactive programming environments", Transactions on Programming Languages and Systems, Vol. 8, Number 4, 1986, pp. 547-576.

[CH89] D. Clément & L. Hascoët, "Centaur Paths: a structure to designate subtrees", CENTAUR User's Manual, Version 0.9, 1989.

[DHKL84] V. Donzeau-Gouge, G. Huet, G. Kahn & B. Lang, "Programming environments based on structured editors: the Mentor experience", in D.R. Barstow, H.E. Shrobe & E. Sandewall (eds.), Interactive Programming Environments, McGraw-Hill, 1984, pp. 128-140.

[DK89a] M.H.H. van Dijk & J.W.C. Koorn, "Implementation of a generic syntax-directed editor", Fourth Annual Report ESPRIT Project GIPE, 1989.

[DK89b] M.H.H. van Dijk & J.W.C. Koorn, "GSE User's manual", CENTAUR User's Manual, Version 0.9, 1989.

[Des84] T. Despeyroux, "Executable specification of semantics", in Semantics of Data Types, G. Kahn, D.B. MacQueen & G. Plotkin (eds.), Lecture Notes in Computer Science, Vol. 173, Springer-Verlag, 1984, pp. 215-233.

[ES88] R. Endres & M. Schneider, "The GRASPIN Software Engineering Environment", in ESPRIT '88: Putting the Technology to Use, North-Holland, 1988, pp. 349-364.

[FGJM85] K. Futatsugi, J.A. Goguen, J.-P. Jouannaud & J. Meseguer, "Principles of OBJ2", in Conference Record of the Twelfth Annual ACM Symposium on Principles of Programming Languages, ACM, 1985, pp. 52-66.

[GENESIS87] "An Overview of Genesis", ESPRIT Project 1222 (GENESIS), Deliverable 12Y3, 1987.

[HHKR89] J. Heering, P.R.H. Hendriks, P. Klint & J. Rekers, "The syntax definition formalism SDF - reference manual", Report CS-R8926, Centre for Mathematics and Computer Science, Amsterdam, 1989.

[HK89a] J. Heering & P. Klint, "The syntax definition formalism SDF", in [BHK89, Chapter 6]. Also in ESPRIT '86: Results and Achievements, North-Holland, 1987, pp. 619-630.

[HK89b] J. Heering & P. Klint, "PICO revisited", in [BHK89, Chapter 9]. Also in ESPRIT '88: Putting the Technology to Use, North-Holland, 1988, pp. 365-379.

[HKKL85] J. Heering, G. Kahn, P. Klint & B. Lang, "Generation of interactive programming environments", in ESPRIT '85: Status Report of Continuing Work, Part I, North-Holland, 1986, pp. 467-477.

[HKR87a] J. Heering, P. Klint & J. Rekers, "Principles of lazy and incremental program generation", Report CS-R8749, Centre for Mathematics and Computer Science, Amsterdam, 1987.

[HKR87b] J. Heering, P. Klint & J. Rekers, "Incremental generation of lexical scanners", Report CS-R8761, Centre for Mathematics and Computer Science, Amsterdam, 1987.

[HKR89] J. Heering, P. Klint & J. Rekers, "Incremental generation of parsers", in Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, 1989, pp. 179-191.

[Hen88] P.R.H. Hendriks, "ASF system user's guide", Report CS-R8823, Centre for Mathematics and Computer Science, Amsterdam, 1988.

[Hen89] P.R.H. Hendriks, "Type-checking Mini-ML", in [BHK89, Chapter 7]. Abbreviated version in Proceedings of CSN87: Computing Science in the Netherlands, SION, 1987, pp. 21-38.

[HN86] A.N. Habermann & D. Notkin, "Gandalf: software development environments", IEEE Transactions on Software Engineering, Vol. 12, 1986, pp. 1117-1127.

[KLMM83] G. Kahn, B. Lang, B. Mélèse & E. Morcos, "METAL: a formalism to specify formalisms", Science of Computer Programming, Vol. 3, 1983, pp. 151-188.

[Kah87] G. Kahn, "Natural semantics", in Fourth Annual Symposium on Theoretical Aspects of Computer Science, ed. F.J. Brandenburg, G. Vidal-Naquet & M. Wirsing, Lecture Notes in Computer Science, Vol. 247, Springer-Verlag, 1987, pp. 22-39.

[Kli89] P. Klint, "Scanner generation for modular regular grammars", in Liber Amicorum, J.W. de Bakker, 25 jaar Semantiek, Centre for Mathematics and Computer Science, Amsterdam, 1989, pp. 291-305.

[Log88] M. Logger, "An integrated text and syntax-directed editor", Report CS-R8820, Centre for Mathematics and Computer Science, Amsterdam, 1988.

[vdM88] E.A. van der Meulen, "Algebraic specification of a compiler for a language with pointers", Report CS-R8848, Centre for Mathematics and Computer Science, Amsterdam, 1988.

[RT89] T. Reps & T. Teitelbaum, The Synthesizer Generator: a System for Constructing Language-based Editors, Springer-Verlag, 1989.

[Rek89] J. Rekers, "Modular parser generation", Report, Centre for Mathematics and Computer Science, Amsterdam, to appear, 1989.

[Voi86] F. Voisin, "Cigale: a tool for interactive grammar construction and expression parsing", Science of Computer Programming, Vol. 7, 1986, pp. 61-86.

PART II
Requirements and Design

Introduction

Introducing Formal Requirements into Industry
  1 Introduction
  2 ERAE
  3 Industrial experiments and usage
  4 The SPESI-2 experiment
  5 Conclusions
  References

Where can I Get Gas Round Here? - An Application of a Design Methodology for Distributed Systems
  1 Methodology
  2 Problem description
  3 Requirement specification
  4 Design specification
  5 Conclusion
  References

Transformations of Designs
  1 Introduction and motivation
  2 Designs: definitions and properties
  3 Top-down example
  4 Modifying components
  5 Combining designs
  6 Design evolution
  7 Design creation
  8 Design partition
  9 Conclusions
  10 Acknowledgements
  References

Introduction

The paper by Hagelstein and Ponsaert discusses a question that has been of fundamental importance to METEOR: How will the industrialization of formal methods take place if at all? Hagelstein and Ponsaert describe their experiences and views based on cooperation with the Consumer Electronics division of Philips.

The paper by Weber uses a stream oriented theory of concurrent systems due to Broy and shows an integrated path from system analysis and requirements description to design. The method is an alternative to ERAE, but shares with ERAE the focus on an independent specification of a system's environment.

The paper by Feijs introduces designs and transformations of designs based on a formal notion of a component. This concept of a component has been integrated in COLD. Feijs describes black box and glass box correctness for designs and explains how to understand bottom-up and top-down design methods in these terms.

Introducing Formal Requirements into Industry

J. Hagelstein
Philips Research Laboratory

F. Ponsaert
Philips Centre for Software Technology

Avenue Albert Einstein, 4
B-1348 Louvain-la-Neuve, Belgium

Abstract

We draw some lessons from our attempt to introduce the formal requirements engineering language ERAE in an industrial context. We review the various experiments and comment on such issues as the typical deficiencies of current practices, misconceptions about the nature of requirements, our approach to technology transfer, the importance of methodological guidance, and the role of tools. One of the applications, a complete television set, is analysed in more detail.

1 Introduction

The requirements engineering process is one of the most critical during software production. Errors in the definition of requirements are often only discovered when a first version of the final product is available. This is why dedicated languages have been proposed for supporting this activity (SA, SREM, SADT, etc.), and more recently, formal requirements engineering languages have started appearing. Among these, the ERAE language, which has been designed in the ESPRIT project METEOR, combines an extension of the Entity-Relationship model with a variant of temporal logic.

More than a language, ERAE is also a method, with innovative features. It shifts the main focus during requirements engineering from the needed system towards the application domain and the problem to be solved. Only in a later step does it consider the computer system, as a solution to the identified problem. And even then, this system is considered on an equal footing with the environment, which plays a complementary role in solving the problem. The method also emphasises the importance of analysing the evolving requirements, in particular by means of logical deductions which are available in ERAE.

Because of the novelty of both the language and the method, the introduction of ERAE in an industrial environment is quite a challenge. In the subsequent sections, we report on the use of ERAE in pilot industrial projects. Section 2 gives a short introduction to the ERAE language and method. Section 3 provides a historical perspective of the introduction of ERAE in industry. One of the experiments is discussed in detail in Section 4 and we finally present some conclusions.

2 ERAE

2.1 The language

The ERAE language is briefly presented in [4] and in more detail in [7]. It is formally defined in [8]. In this section, we only present and motivate its main characteristics.

The language has been designed to facilitate the knowledge acquisition process underlying requirements elicitation, with emphasis on simplicity, expressiveness, and formality:

• Simplicity. The ERAE language provides few but powerful constructs. An ERAE specification consists of two parts: a graphical one and a textual one. The first part identifies the objects under discussion and their relationships. The use of graphics is very attractive for describing these concepts which are naturally organised in a graph. The second part of a specification describes the possible configurations of these objects and their possible changes. It uses temporal logic, extended to express real-time constraints.

• Expressiveness. The language was designed to easily express natural language sentences that people use to initially state requirements. Sentences like 'there is never a menu on screen when teletext is in operation' or 'the furnace should be started within 2 seconds' are directly expressible.

• Formality. This property addresses the problem of avoiding misunderstandings during requirements definition. First, a formal language avoids the ambiguity of informal notations, but also, it provides a sound intellectual tool for organising one's thought, and it allows for sophisticated automatic support, including consistency and completeness checks, simulation tools, and capability of reasoning about the requirements.
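To give an impression of what the two example sentences constrain, the following Python sketch checks them against a finite, timestamped trace of observations. It is not ERAE and makes no claim about ERAE's syntax or semantics: the trace encoding, the function names, the fact names, and the choice of a heat request as the trigger for starting the furnace are all assumptions introduced for the illustration.

```python
def never_together(trace, a, b):
    """Invariant: no observation in the trace contains both a and b."""
    return all(not (a in facts and b in facts) for _, facts in trace)


def responds_within(trace, trigger, response, limit):
    """Every trigger at time t is followed by a response at some time u
    with t <= u <= t + limit (times in seconds)."""
    return all(
        any(t <= u <= t + limit and response in later for u, later in trace)
        for t, facts in trace
        if trigger in facts
    )


# A trace is a list of (time in seconds, set of facts or events observed then).
trace = [
    (0.0, {"teletext"}),
    (1.0, {"menu"}),
    (1.5, {"heat_request"}),
    (3.0, {"furnace_started"}),
]

# 'there is never a menu on screen when teletext is in operation'
print(never_together(trace, "menu", "teletext"))
# 'the furnace should be started within 2 seconds' (of a hypothetical heat request)
print(responds_within(trace, "heat_request", "furnace_started", 2.0))
```

A check over one finite trace is of course much weaker than a statement in temporal logic, which constrains all possible behaviours; the sketch only shows the shape of an invariant and of a real-time response requirement.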

ERAE is also a structured language, as some objects may have internals, hidden from the others [5,7]. Objects may communicate with each other by sharing events (synchronous communication) or by shared access to lasting objects (asynchronous communication).

2.2 The method

The method is described in [7] and illustrated in several papers reporting on case studies [2,3,6,9]. It provides guidelines for organising the elaboration of requirements into phases, as well as specific techniques for checking that each phase is performed correctly.

The guidelines for the overall organisation require that the problem be understood before a solution is looked for. Therefore, the method insists that the environment of the future computer system be investigated first, and that the objectives of introducing this system be identified. These objectives are then refined into a computer behaviour and, possibly, modifications in the environment. The structuring mechanisms of the ERAE language support this organisation of the requirements.

Within each phase, the method provides heuristics for the correct use of the language constructs, checks which indicate likely errors, rules for generating consequences of the requirements gathered so far, etc. Most of this relies on the formal nature of the language.

3 Industrial experiments and usage

In the process of defining the ERAE language and method, a number of classical academic case studies were performed. Some of these were reported in project deliverables and publications. The list of applications includes a lift control [3], a conference organisation [2], a telephonic transit node [6], a vending machine, a furnace control, a library management, a stock handling system, and a tool supporting software development [9].

These activities brought ERAE to a sufficient degree of maturity to consider try-outs in actual industrial conditions. Of course, the classical problems of introducing a new technology are well-known: conservatism, fear of risks, need for education, etc. To overcome them, we adopted the following stepwise approach:

1. We have disseminated the knowledge about ERAE through internal courses on software engineering within our company. A short lecture on ERAE has been given, as part of these courses, since February 1987. In the Fall of 1988, a two-day course was organised to provide a deeper introduction.

2. Our first contacts have been with people who had attended these courses. They were faced with complex requirements for embedded systems, often including real-time aspects, and took the initiative to contact us.

3. A first contact typically results in an experiment of re-specification of some existing product. It is performed by the ERAE experts alone, and then presented to the industrial partner. The first experiment of this kind started in October 1987.

4. The second step results in a specification, still by the ERAE experts, but in parallel with a real project.

5. The next step requires an increased investment from the industrial partner. On the basis of the first try-outs, it decides to devote some man-power to a case study, in collaboration with ERAE experts. This is still in parallel with a real project.

6. Finally, experiments are done by the people in the industry and are possibly reviewed by ERAE experts.

This approach has so far resulted in 6 projects, each consuming 3 to 6 man-months on the side of the ERAE experts. These experiments will be reviewed briefly in the next sections.

3.1 Studio Booking System

The studio booking system [11] was the first industrial try-out of the language. It was performed by an

Irish software house called C.O.P.S., which participated at that time in the METEOR project. It gave

essential early feedback to the design of ERAE.

3.2 TIPTOP

The first project within Philips was concerned with a teletext processing module, called TIPTOP [10].

It had been specified using the Yourdon method and the specification did not give full satisfaction. The

re-specification of the requirements aimed at comparing the ERAE solution with the existing one.

The following points emerged from the comparison:

• the formal statements of ERAE avoid the ambiguity of the informal minispecs;

• the inheritance mechanism in ERAE solved some difficulties encountered in decomposing Yourdon

datastores;

• the explicit modelling of the environment was felt to be more satisfactory than the use of Yourdon

terminators;

• the typing mechanism of ERAE was appreciated;

• the ERAE specification was less biased towards design than the Yourdon one.

However, the existence of a tool support for Yourdon was felt to be a very strong point in its favour.

3.3 Satellite Tuner

As the try-out on TIPTOP was successful, the same department decided to use ERAE for their next

product, a satellite tuner specified in ERAE without the help of the ERAE experts. The users were again

satisfied, although a later review showed that their use of ERAE was not quite correct. Actually, the

conclusions were that the concepts of the ERAE language and the general methodological approach are

already helpful, even without an entirely correct use of the language. This correct use could be enforced

by an appropriate tool support.

3.4 SPESI-1

As a result of the TIPTOP and Satellite Tuner projects, a larger project was launched, in which formal

techniques would be combined with other advanced techniques. The objective was the specification and

simulation of part of a television system, using:

• a hypertext system, to prototype the user interface,

• ERAE, to formalise the requirements specification,

• the design language COLD, to map the specification to existing simulation units.

The project SPESI-1 (SPEcification and Simulation) [1] was started in October 1988. Within three

months, the animation, specification and simulation were ready.

3.5 SPESI-2

SPESI-2, the follow-up of SPESI-1, will be described in detail in Section 4.

3.6 BLISS

The BLISS project shows clearly the difficulties faced by new methods and languages. This very big

project aims at developing an on-line worldwide distributed system for stock management and delivery

scheduling. In spite of a rigorous approach to requirements engineering, problems had been encountered

with certain real time aspects of the problem. The applicability of ERAE was shown on an excerpt of

the hard part of the project, but eventually ERAE was not used because:

• the analysts had just learned another language,

• ERAE was new and the risk could not be afforded,

• ERAE lacked tool support.

It was agreed, however, that ERAE would be used in parallel with the real BLISS project, by ERAE

experts.

4 The SPESI-2 Experiment

The SPESI-2 project, a follow-up of SPESI-1, took place from April 1989 to March 1990. The main

differences with SPESI-1 were that the case study was larger, as it covered a complete television system,

and that technicians not familiar with formal methods participated in the writing of specifications.

This section uses SPESI-2 as a vehicle for illustrating a number of lessons learned regarding the

deficiencies of informal requirements and the main benefits of using ERAE in this case.

4.1 Analysis of typical informal requirements

We will use the expression "the document" to denote the informal requirements specification, which was

our main source of information. It consists of around 100 pages of English text, plus a few tables and

figures.

Although this document suffered deficiencies, as we will see, the quality of the resulting television was

actually never endangered because quality assurance procedures were in place to provide the necessary

feedback. One of the objectives of the experiment was to shorten this feedback loop. Moreover, the quality of the document is actually rather high, given that it is simply written in English. It is indeed

difficult, without a formal approach, to avoid such deficiencies as inconsistency, lack of structure,

over-specification, incompleteness, ambiguity, and redundancy, which are reviewed and illustrated below.

4.1.1 Inconsistencies

To our surprise, the most frequent inconsistencies cannot be considered errors. They are manifest

contradictions, but ones that anybody knows how to resolve, because a kind of implicit exception mechanism

is assumed.

Example : The initial status of the television when powered on is carefully described in

one section : program one is selected, the personal preference values of various settings are

established, etc. This is all contradicted in another section, which says that the television is

left in standby when switched on if the "parental mode" is in operation.

It is clear, although implicit, that something exceptional occurs in parental mode, and that

the section about the initial status is only meant to describe the normal case.

Even if this kind of contradiction is frequent in the normal use of natural language, the document

may at least be criticised for its lack of cross-references in this case. Indeed, if the contradiction is easy

to resolve, it is still hard to identify.

True inconsistencies occur when the resolution between conflicting statements is not obvious any

more. This kind of inconsistency was not frequent in the document.

Example : The user of the television may define preferred values for some settings (volume,

balance, etc). The list is however different in the various places where it occurs. In particular,

the settings "spatial" and "saturation" are called "expand" and "colour" elsewhere.

4.1.2 Lack of structure

The document is structured, but around the planned architecture of the television. The user-observable

behaviour must be deduced by the reader from the description of various components of this architecture.

Moreover, the various functionalities have implicit and obscure interactions, which prevent the easy

understanding of the document. It is almost impossible to study a specific functionality independently of

the others.

Example : Most user commands have different effects, depending on the current "mode" of

the television (teletext, standby, etc). Fourteen modes are mentioned here and there in the

document, but no exhaustive list is given, nor is the interdependency of modes explained.

Still, some modes turn out to be exclusive, and some are just submodes of others.

Other deficiencies can be traced back to the lack of structure : redundancies, implicit exceptions, and

some occasional spreading of related information.

Example : The continuous display of the program number on the screen is described piece

by piece, in various places. On page 22, one learns how to set it; on page 29, that it does not

work if an external source is selected; on page 9, that a power-on restores it to its previous

value; etc.

4.1.3 Over-specification

As observed above, the document describes more than the functionalities of the television. It specifies

its internal realisation architecture, the behaviour of each component, and the communication between

them.

This kind of information should not be excluded from a requirements document, but it should be

side-information limited to a section about already known design decisions. The problem is that the

document is organised around this design-oriented information, which leads to an unclear perception of

the external behaviour of the television.

4.1.4 Incompleteness

Incompleteness is a problem of medium importance in the document. When a question arises, the difficulty

is rather to find where the answer is (because of the lack of structure). There are cases, however, where

the answer is just missing.

Example : The state of the television leaving standby is only partially specified. The doc-

ument does not say, for instance, what should be the values of the various audio/video

parameters (contrast, balance, etc).

A typical case of incompleteness is the description of unusual situations. The behaviour of the

television is clear when its user issues sensible commands, but little is said when a natural sequence of

commands is interrupted by a completely unrelated command.

4.1.5 Ambiguity

This is a quite minor deficiency : unclear sentences were rather rare in the document.

Example : Some actions are required "after completion of a program selection". Our first

interpretation of this condition was "when the tuner has tuned to the right frequency". Actually,

the intended interpretation was "when the user of the television has completed the

sequence of button hits requesting a program selection". Indeed, a program selection may

require a single button hit (e.g. 'program up') or a sequence of hits (e.g. switching to double

digit entry, selecting first digit, selecting second digit).

More often, the problem is one of terminology, with different expressions referring to the same thing.

Example: A condition which is called "no video displayed" in one place is called "no antenna

signal" elsewhere.

4.1.6 Redundancy

The repetition of information is not really a problem, except in the case of inconsistency between the

various occurrences. However, the fact that some information is repeated should at least be made explicit.

Even better, the repetition should be replaced by a pointer towards the unique location of the information.

Example : The effect of pressing a digit key in the standby mode is explained three times :

once when the program selection function is described, then as a remark when the initial state

at power-on is specified, and finally when the hardware of the power supply is described.

4.2 The benefits of using ERAE

In this section, we review how the various deficiencies above have been addressed, and what additional

benefits resulted from the use of ERAE in this case.

4.2.1 Eliminating the deficiencies

Only one of the deficiencies - ambiguity - is eliminated as a direct consequence of the use of a formal

language. Indeed, a formal semantics, like the one of ERAE, provides a unique mathematical inter-

pretation for any text in the language. As for the ambiguities due to a lack of terminology, they have

been eliminated very soon, as the ERAE method suggests formalising the vocabulary of the application

(graphically) before detailing the desired behaviour (textually). If a support environment including a type

checker had been available, the benefit would have been maximal. But even so, the unified terminology

has resulted in easier communication between the participants. Unfortunately, ambiguity is here a quite

minor deficiency. Addressing the other deficiencies is not only a matter of using a formal language, but

also of using the right one and using it properly.

One main problem of the document is the lack of structure. During the project, we first formalised

the document as it was, before understanding that it had to be completely reorganised. To remain close

to the technicians' natural view, the new structure was based on the modes of the television, already

mentioned in the document. The specification finally consisted of the following parts :

• a hierarchy of modes,

• the transitions between modes induced by user commands,

• the functionalities associated to user commands in the various modes,

• a separate description of each functionality, with clear inputs and outputs, and well-identified

interactions.

This new organisation allows the reader to understand the requirements in a piecewise manner. The part

of the ERAE language that proved critical for reflecting this structure is the concept of 'context' [5,7],

which provides a visibility control mechanism.
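
To make the mode-based organisation more concrete, the following Python sketch shows one possible way to represent a hierarchy of modes and the transitions induced by user commands. It is only an illustration under assumptions: the mode names, the command names, and the transition table are hypothetical and are not taken from the SPESI-2 specification, and ERAE contexts are not modelled.

```python
# Hypothetical mode hierarchy: each mode maps to its enclosing mode
# (None for a top-level mode).
PARENT = {
    "standby": None,
    "on": None,
    "tv": "on",
    "teletext": "on",
}

# Hypothetical transitions: (current mode, user command) -> next mode.
TRANSITIONS = {
    ("standby", "digit"): "tv",
    ("tv", "teletext_key"): "teletext",
    ("teletext", "teletext_key"): "tv",
    ("on", "standby_key"): "standby",  # shared by every submode of "on"
}


def ancestors(mode):
    """Yield the mode itself, then its enclosing modes up to the top level."""
    while mode is not None:
        yield mode
        mode = PARENT[mode]


def next_mode(mode, command):
    """Return the mode reached by a command, looking first at the mode
    itself and then at its enclosing modes."""
    for m in ancestors(mode):
        if (m, command) in TRANSITIONS:
            return TRANSITIONS[(m, command)]
    return mode  # the command has no effect in this mode
```

With such a table, a transition stated once for an enclosing mode applies to all of its submodes, which is one way to avoid repeating the same statement in several places.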

Besides this, various techniques for analysing ERAE requirements address the problems of redun-

dancy, incompleteness, and inconsistency :

A first technique is simply the use of the search facility of a word processor, to find the various

places where a given term is used, and hence to eliminate the redundancy or to establish the missing

cross-references.

More elaborate analysis techniques are suggested by the ERAE method, but not yet supported by

tools. One of them is a systematic way to review all statements associated to a declaration. For

each declared object or association, one should provide (1) an initial state, (2) the causes of its

changes, and (3) the effects induced by its changes.

Another example is an analysis technique which generates some information, on the basis of the

already available one. If some property is known to be 'passive', i.e. never to change spontaneously,

there are techniques for deriving the causes of its changes from the effects of all other changes.

In this experiment, the technique proved to be very helpful in analysing the specification for

inconsistency (incompatible changes do not have incompatible causes) or incompleteness (no cause

is found for a change which should be possible).
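
The systematic review of declarations can be pictured as a simple checklist. The Python sketch below is a hypothetical illustration: the representation of statements as (declaration, kind) pairs, the three kind labels, and the example data are assumptions, and no ERAE tool of this form is described in the text.

```python
# The three kinds of statement expected for each declaration.
REQUIRED_KINDS = ("initial", "cause", "effect")


def missing_statements(declarations, statements):
    """Return, for each declaration, the kinds of statement still missing.

    declarations: iterable of declared object or association names
    statements:   iterable of (declaration, kind) pairs
    """
    provided = {}
    for decl, kind in statements:
        provided.setdefault(decl, set()).add(kind)

    report = {}
    for decl in declarations:
        gaps = [k for k in REQUIRED_KINDS if k not in provided.get(decl, set())]
        if gaps:
            report[decl] = gaps
    return report


# Hypothetical excerpt of a specification under review.
declarations = ["volume", "program_number"]
statements = [
    ("volume", "initial"),
    ("volume", "cause"),
    ("volume", "effect"),
    ("program_number", "cause"),
]
gaps = missing_statements(declarations, statements)
```

In this invented excerpt, `gaps` reports that `program_number` still lacks an initial state and a description of the effects of its changes.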

A remaining problem, over-specification, is addressed, as explained in the next section, by expressing

the requirements in terms which are meaningful to a user of the television.

4.2.2 More user-oriented requirements

ERAE is not only a language, but also a method, reflecting a certain philosophy of requirements engineer-

ing. This has induced a reflection about this activity, as currently practiced by our industrial partner. It

turned out that various ideas underlying ERAE were found to receive insufficient emphasis, in particular:

• the initial focus should be on the problem, and not its solution;

• the problem should be stated in user terms, and use the terminology of the application domain;

• the application domain should be modelled explicitly.

The technicians participating in the formalisation activity quickly found out that the informal require-

ments were more design-oriented than user-oriented. This resulted in a redefinition of our target, with the

new objective of providing a specification of the behaviour of the television, as perceived by the user.

The explicit modelling of the environment was covered by the definition of a reference model for

audio/video applications, which is discussed in the next section.

4.2.3 A reference model

One of the goals of the project was the definition of a reference model for audio/video applications. This

should be seen as a reusable model of this application domain, including :

• a structural description of the usual inputs : user events, information in broadcasted or recorded

programs, structure of teletext signals, etc.

• a description of the usual outputs of the devices, i.e. the user-visible phenomena: structure of

screens, of sound, LEDs, etc.

These descriptions depart totally from the available technical specifications (infrared signals emitted

by the remote handset, modulation of the broadcasted signals, protocols for transmitting the teletext

information, etc). We developed much more abstract information models, only providing the structure of

the information, and reflecting the user perception of the application domain. For instance, a broadcasted

signal is characterised by a frequency, a standard (pal, secam, etc), image and sound signals, teletext

page receptions, etc.

This application model is meant to be used in the description of various specific products. These are

specified as active objects establishing a relation between user-visible inputs and outputs.

4.2.4 Questioning some requirements

The formalisation of the requirements has revealed some questionable decisions, which can often be

traced back to unexpected interferences between functionalities.

Example : The balance control does not work when the loudspeakers are muted. This looks

like a reasonable decision, but it turns out that earphones still work when the loudspeakers

are muted, to permit 'private' listening, without disturbing other persons around. There is

then no way to change the balance when earphones are used for private listening.
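
The interference in this example can be made explicit by enumerating the relevant states. The Python sketch below is a hypothetical model: the two boolean state variables and the function names are assumptions introduced for illustration. It lists the states in which sound can be heard although the balance cannot be changed.

```python
from itertools import product


def sound_audible(muted, earphones_plugged):
    # Earphones keep working when the loudspeakers are muted.
    return earphones_plugged or not muted


def balance_adjustable(muted, earphones_plugged):
    # Requirement as stated: the balance control does not work when muted.
    return not muted


# States in which the user hears sound but cannot adjust the balance.
problem_states = [
    (muted, earphones)
    for muted, earphones in product([False, True], repeat=2)
    if sound_audible(muted, earphones) and not balance_adjustable(muted, earphones)
]
```

The only state found is the one with muted loudspeakers and plugged earphones, i.e. private listening.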

4.2.5 Reusability

Reusability has been a growing concern in the course of the project. Of course, an isolated experiment

may hardly lead to firm conclusions on this matter, but we tried to favour reusability through the following

measures :

• the reference model, described above, is unlikely to change in the near future. Broadcasting

standards, for instance, are not expected to be deeply modified.

• a structured specification, which identifies functional parts of a television, gives room for reusability.

Some of the functional parts may well remain unchanged in the next television (e.g. the whole

teletext control part). The clear identification of the external interface of these parts is critical to

their reuse.

5 Conclusions

Several strategies are possible for introducing formal requirements into industry, for instance using

managerial authority to enforce it, or developing a small team of experts applying the formal approach

in difficult cases only. The strategy we followed is different: it is based on an active attitude of the

industrial partners, who should first express a need for some improvement. This is then followed by a

progressive set of experiments, with increased involvement on their side.

While applying this strategy, we learned a lot. First, it was a surprise that the main problem with

current practices is not the ambiguity of informal languages, which is pretty well mastered. It is, instead,

the difficulty of achieving a structured and complete view of the requirements. The lack of appropriate

structure induces the following consequences (in decreasing order of importance) :

• impossibility to study the requirements in a piecewise manner, for lack of clearly identified

interactions between functionalities,

• frequent contradictions, easy to resolve but hard to discover,

• spreading of information,

• duplication of information.

The incompleteness and the lack of a well-defined terminology are problems of medium importance, and

ambiguity is a minor problem.

The remarkable point is that formality in itself only addresses ambiguity, which is the least problem.

If formal languages are to be accepted in industry, they must first bring an answer to the problem of

unstructuredness. To this end, the proposed language must include appropriate structuring mechanisms,

which clarify the interfaces between loosely coupled parts. But this necessary precondition is far from

sufficient. As important is the availability of a method guiding the proper use of these mechanisms. The

problem, indeed, is not that users lack a way of expressing a perceived structure, but rather that they

cannot arrive at it. The experiment above suggests that some application domains have a typical structure

which, if worked out once, can then be easily adapted to similar products.

The second main deficiency that we encountered in the industrial practice of requirements, is a strong

bias towards design, with the consequence that the external interface of the specified system is obscure.

An important benefit of using ERAE is the cultural change resulting from its emphasis on application

domain modelling and problem identification.

As our industrial partners have been involved in the writing of specifications, we can also draw some

conclusions about the teaching of formal languages and methods. The easiest way to use such a new

technique seems to be by imitation: our partners started specifying parts that were very similar to others,

already specified by ourselves. A strong interaction with experts is very important to quickly correct bad

practices. In addition, a course, with fully worked out examples and detailed hints for the use of the

various language constructs, is absolutely needed.

As for the tool support, it is worth noticing that the experiments reported here were performed without

a specialised environment, but only with general-purpose textual and graphical editors. Better support

only becomes necessary for the routine use of a method. This is a fortunate fact, as it allows heavy

investments in tools to be postponed until the method has been shown to be applicable.

Acknowledgement: This work is partly funded by the CEC under the ESPRIT project METEOR. The

work of the Centre for Software Technology within METEOR is done on behalf of APT. The design of ERAE

has involved E. Dubois, J. Hagelstein, E. Lahou, F. Ponsaert, A. Rifaut, E. Stephens, and F. Williams.

We would also like to thank W. Christis, H. Obbink, F. Stommels who made the industrial experiments

possible, as well as all participants to these experiments, especially L. Loomans, E. Baljeu, J. Polstra and

R. Schurmans.

References

[1] L. Claeys, L. Loomans, F. Ponsaert, "ERAE Specification of DI6 and D2B," Philips CST report re89015, January 1989.

[2] E. Dubois, J. Hagelstein, E. Lahou, F. Ponsaert, A. Rifaut, F. Williams, "The ERAE model: A Case Study," in: T.W. Olle, H.G. Sol, A.A. Verrijn-Stuart (eds.), Information System Design Methodologies: Improving the Practice, North-Holland, 1986, pp. 87-105.

[3] E. Dubois and J. Hagelstein, "Reasoning on Formal Requirements: a Lift Control System," Proceedings of 4th International Workshop on Software Specification and Design, Monterey, California, 1987.

[4] E. Dubois, J. Hagelstein, A. Rifaut, "Formal Requirements Engineering with ERAE," Philips Journal of Research vol. 43 3/4, 1988.

[5] A. Finkelstein and J. Hagelstein, "Formal Frameworks for Understanding Information System Requirements Engineering," Proceedings of the IFIP WG 8.1 CRIS Review Workshop, Sesimbra (Portugal), 1989.

[6] J. Hagelstein and E. Lahou, "A transit Node in ERAE," METEOR report MET-199, September 1987.

[7] J. Hagelstein, A. Rifaut, J. Vangeersdael and M. Vauclair, "The ERAE Language and Method," Manuscript M 336, Philips Research Laboratory Belgium, 1990.

[8] J. Hagelstein and A. Rifaut, "The ERAE Language Definition," Manuscript M 337, Philips Research Laboratory Belgium, 1990.

[9] E. Lahou and F. Ponsaert, "Case Study: Requirements for a Software Development and Maintenance Tool," METEOR report MET-110, September 1986.

[10] F. Ponsaert "TIP-TOP: An Experiment with ERAE," METEOR report, June 1988.

[11] E. Stephens and F. Williams, "A Case Study in Requirements Engineering. Studio Booking," METEOR report MET-108, January 1986.

Where can I get gas round here? - An Application of a Design Methodology

for Distributed Systems

Rainer Weber

Fakultät für Mathematik und Informatik, Universität Passau

Postfach 2540, D-8390 Passau

Abstract

The first two phases, viz. requirement specification and design specification, of a design

methodology for distributed systems are applied to the specification of a gas station. The

methodology, which is based on streams of actions, is explained for this example, and problems arising

are discussed. Special attention is paid to the structuring of specifications.

1. Methodology

Many researchers are convinced that for the design of distributed systems a design methodology based on

formal methods is necessary. Arguments for this point of view may be found in [Broy 88a], [Broy 88b]

and [Lamport 89] for example.

We present a particular design methodology and apply it to the specification of a gas station. The

methodology is according to Manfred Broy ([Broy, Streicher 87], [Broy 88b]) and is based on streams of

actions. On the way from a given problem to a program four phases are distinguished:

1. Requirement Specification

2. Design Specification

3. Abstract Program

4. Concrete Program.

The steps from one phase to the next can be supported by formal means or at least be formally verified.

(For more information again see [Broy 88a] or [Broy, Streicher 87].)

In this text we concentrate on the early stages of the development, thus only phases 1 and 2 are treated

here. The description refers to the current state of this methodology; much work is still being done on its further

development.

In contrast to previous examples (e.g. [Broy, Streicher 87], [Broy 88a]), the central issues here are

questions of methodology; verification is skipped. So special attention is paid to questions like:

1. How easy is it to express requirements?

2. How could specifications be structured?

3. How could the transition from one phase to the next be made easier?

The answers may indicate future research topics.

The paper is organized as follows: We first present the informal requirements, followed by a discussion

about a reasonable serving strategy (section 2). These requirements are then formalized and the resulting

specification is structured (section 3). Two alternative design specifications are given (section 4). Finally

the experience gained is presented (section 5).

2. Problem Description

2.1 Informal Requirements

We consider a gas station with two pumps A and B and three queues L1, L2 and L3 which are

assumed to be of finite length N (Fig. 1):

[Figure 1: the three queues L1, L2 and L3 in front of the two pumps A and B]

(This problem is due to F. Kröger [Kröger 87].)

A controller of the gas station is to be built such that the following given informal requirements are met:

1) All cars in the queues are served (either at pump A or at pump B).

2) Cars in queue L1 can only be served at pump A, cars in queue L3 only at pump B,

but cars in queue L2 can be served at pump A or at pump B.

3) The cars in each of the queues are served according to the FIFO strategy.

4) At each pump only one car at a time can be served.

5) If a queue is full and a car arrives there, then it is rejected, otherwise it is accepted.

Condition 1) is a liveness property whereas conditions 2) to 4) are safety properties. The classification of

condition 5) is not quite clear. On the whole it is a safety property, but it has also some liveness content in

it.

2.2 Serving Strategies

In the previous section the serving strategy has intentionally been left open. We want our strategy not to

prefer any of the lines against the others. So a reasonable claim would be

All lines are reduced regularly, none taking precedence over the others.

But assume the service times for the lines are different, e.g. in line L1 there are only trucks, which do not

only consume more gas, but also more time. Thus waiting situations at the other lines would have to occur

to get a regular reducing of all three lines.

A more suitable requirement is:

6) Whenever two service starts are possible at some pump, then the car that has arrived first is preferred.

For example, when pump A gets empty and lines L1 and L2 are not empty, both the first car in line L1

and the first car in line L2 could be served. This conflict is resolved according to 6) by preferring that car

that has arrived first.

This is a more operational approach to getting an appropriate serving strategy. However, if we think of a

design or implementation, the order of arrival must be stored somehow.
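
Requirement 6) and the need to store the order of arrival can be sketched operationally. The following Python code is a minimal illustration under stated assumptions: each queue holds (arrival time, car) pairs in FIFO order, arrival times are distinct, and the names `ALLOWED_LINES` and `next_car` are hypothetical. It is not the design developed in this paper.

```python
from collections import deque

# Lines that each pump may serve, following informal requirement 2).
ALLOWED_LINES = {"A": ("L1", "L2"), "B": ("L2", "L3")}


def next_car(pump, queues):
    """Choose the car to serve at a free pump.

    Among the first cars of the non-empty lines this pump may serve,
    the one that arrived first is preferred (requirement 6).
    Returns (line, car), or None if no car is waiting for this pump.
    """
    candidates = [
        (queues[line][0][0], line)  # (arrival time of first car, line)
        for line in ALLOWED_LINES[pump]
        if queues[line]
    ]
    if not candidates:
        return None
    _, line = min(candidates)
    _, car = queues[line].popleft()  # FIFO within the line
    return line, car


# Hypothetical situation: arrival times are stored with the cars.
queues = {
    "L1": deque([(3, "c3")]),
    "L2": deque([(1, "c1"), (4, "c4")]),
    "L3": deque([(2, "c2")]),
}
```

In this situation pump A would first serve the car from L2, because it arrived before the first car of L1.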

Yet there are also other service strategies, having different features, for example:

1) Prefer L2

- After an L1-service (i.e. after the first car from line L1 has been served) an L2-service follows at pump A.

- After an L2-service at pump A an L1-service (at pump A) follows.

- The analogous applies to line L3 and line L2.

Obviously by this strategy line L2 gets a preferential treatment. However, this strategy implies a simple

control mechanism.

When one of the lines gets empty, it is not taken into consideration, i.e. cars from the other possible line

are taken instead. For example, when line L1 gets empty and L2 is not empty, only cars from line L2

are served at pump A.

Analogous conditions also hold for the next service strategy given below.

2) Kröger's strategy

- After an L1-service-start an L2-service-start follows before the next L1-service-start.

- After an L2-service-start an L1-service-start follows before the next L2-service-start.

- After an L2-service-start an L3-service-start follows before the next L2-service-start.

- After an L3-service-start an L2-service-start follows before the next L3-service-start.

However, unnecessary waiting situations may occur, as the example illustrated in Fig. 2 shows:

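[Figure 2: timing diagram of the services at pumps A and B. An L1-service at pump A is marked (*), an L3-service at pump B is marked (**). Annotations in the diagram:

Serve line 1 next? No, because in that case an L2-service-start should have taken place between (*) and now.

Serve line 2 next? No, because in that case an L3-service-start should have taken place between (**) and now.

Therefore a waiting situation occurs in the marked interval.]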

At the two horizontal lines we are told what happens at pump A and B respectively. The small vertical lines

mark events, namely the starts and stops of a service. Between two consecutive vertical lines it is reported

which line is served at the corresponding interval of time.

Therefore we consider the requirement 6) to be more appropriate.

Two things should be stated:

1. Serving strategies are a central issue in computer science (e.g. processor allocation in the field of

operating systems), but also in other areas (scheduling problems in production engineering).

2. We observe that the strategies need a careful examination, to understand the behaviour implied by these

strategies. With more intricate problems, a discussion at this informal level may not be feasible. Reasoning

with the formal requirements (cf. next section) seems to be sensible.

3. Requirement Specification

3.1 Actions, Streams of Actions, and Operations on Streams

The underlying paradigm of our way to specify the requirements is that an observer notices actions

happening. Thus the intended behavior of the system can be described by the set of possible sequences of

actions (we also call them streams of actions), taking an interleaving viewpoint.

An alternative approach, being in some way dual, would be to describe the behavior of the system by

sequences of states. This is the case when using temporal logic as a means for description. Indeed, the

temporal logic approach is very much related to our approach, because both belong to the same level of

abstraction. The main advantage over temporal logic is that we have more powerful means to

talk about sequences, e.g. counting operators that are missing in the established temporal logic formalisms.

We consider the system to be closed and therefore also include the actions by the environment, although at

this point the need to distinguish between input and output actions does not yet exist. At this earliest stage

of development no effort is made to provide a functional view of the system.

Actions are specified in our formalism by an abstract data type (for abstract data types cf. [Wirsing et al.

83]):

type ACTION =

    BASED ON NAT

    sort pump,
    sort line,
    sort car,
    sort act,

    fct pump A, B,
    fct line L1, L2, L3,
    fct nat N,    { maximum queue length }

    fct(car, line)act arrive, accept, reject,
    fct(car, pump p, line L: ¬(p = A ∧ L = L3) ∧ ¬(p = B ∧ L = L1))act start,
    fct(car, pump)act stop

end of type

No axioms are given here. If not only the initial model of this type is considered, the axioms A ≠ B, L1 ≠ L2,

L2 ≠ L3 and L1 ≠ L3 are necessary.
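
For readers more familiar with programming languages than with algebraic specifications, the signature of ACTION can be approximated by ordinary data types. The Python sketch below is an illustration only and not part of the methodology: cars are represented as strings, the value chosen for N is arbitrary, and the domain restriction of start is checked when an action is constructed.

```python
from dataclasses import dataclass
from enum import Enum


class Pump(Enum):
    A = "A"
    B = "B"


class Line(Enum):
    L1 = "L1"
    L2 = "L2"
    L3 = "L3"


N = 5  # maximum queue length; this value is an arbitrary assumption


@dataclass(frozen=True)
class Arrive:
    car: str
    line: Line


@dataclass(frozen=True)
class Accept:
    car: str
    line: Line


@dataclass(frozen=True)
class Reject:
    car: str
    line: Line


@dataclass(frozen=True)
class Start:
    car: str
    pump: Pump
    line: Line

    def __post_init__(self):
        # Domain restriction of start: not (A with L3) and not (B with L1).
        if (self.pump, self.line) in {(Pump.A, Line.L3), (Pump.B, Line.L1)}:
            raise ValueError("line cannot be served at this pump")


@dataclass(frozen=True)
class Stop:
    car: str
    pump: Pump
```

The distinctness axioms A ≠ B, L1 ≠ L2, etc. hold automatically here, because the members of an enumeration are distinct values.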

Streams of actions are defined as follows:

act^ω =def act* ∪ act^∞

act* denotes finite sequences of actions whereas act^∞ denotes infinite ones. For an algebraic specification

of this type cf. [Broy 88b]. Thus our framework is completely algebraic. The operator symbol ∘ is used

for both concatenation of two streams and prefixing a stream with an action, thus implicitly regarding an

action as a stream of length one. In this paper analogously the notation s^ω is used for streams of sort

s, with s being not necessarily the sort act.

Correct behaviours of the system will be described by sets of action streams. These sets are given by

predicates on streams.

In order to be able to talk about streams of actions, several special relation (rel) and function symbols (fct)

are introduced. Some of them are defined in [Broy 88b], again by means of algebraic specifications, and

here only an informal explanation is given. The symbols in and © are used in infix notation.

a) rel (act, act^ω) .in.

   a in s: action a appears in the stream s

b) fct (act, act^ω) act^ω .©.

   a © s: the stream which results from removing all actions not equal to action a from the stream s

c) fct (act^ω) nat^∞ #

   #(s): length of the stream s

d) rel (act^ω, act^ω) ⊑

   s ⊑ t: stream s is a prefix of stream t

e) fct (act) car get_car

   This defines a kind of projection function.

   Examples: get_car(arrive(c,L1)) = c, get_car(start(c,A,L3)) = c

Note that (act^ω, ⊑) forms an (algebraic) complete partial order with the empty stream as least element.

Remarks:

1) a) and b) can easily be expanded: take sort set act (set of actions) as first argument instead of sort

act. E.g., m in s then holds whenever an element of m appears in the stream s.

2) e) can be expanded by replacing act by act^ω with componentwise application of the original

function (homomorphic extension). E.g.:

    get_car(arrive(c1,L1) ∘ start(c2,A,L1) ∘ stop(c2,A)) = c1 ∘ c2 ∘ c2

Also these expansions are used in the sequel.
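
For finite streams the operations introduced above can be illustrated by a small Python sketch. The encoding of actions as tuples and the function names (occurs_in, filter_stream, length, is_prefix) are assumptions made only for this illustration; infinite streams and the algebraic specification of [Broy 88b] are not covered.

```python
# Sketch for finite streams only. Actions are encoded as tuples, e.g.
# ("arrive", car, line), ("start", car, pump, line), ("stop", car, pump).
# All names are illustrative assumptions, not part of the specification.

def occurs_in(actions, s):
    """m in s: some action of the set m appears in the stream s."""
    return any(a in actions for a in s)

def filter_stream(actions, s):
    """m (c) s: remove from s every action that is not in the set m."""
    return [a for a in s if a in actions]

def length(s):
    """#(s): length of the finite stream s."""
    return len(s)

def is_prefix(s, t):
    """s is a prefix of t."""
    return len(s) <= len(t) and t[:len(s)] == s

def get_car(x):
    """Projection to the car, extended componentwise to streams."""
    if isinstance(x, tuple):
        return x[1]
    return [a[1] for a in x]

s = [("arrive", "c1", "L1"), ("start", "c2", "A", "L1"), ("stop", "c2", "A")]
print(get_car(s))  # ['c1', 'c2', 'c2']
```

The last line mirrors the example of remark 2: the projection is applied to every element of the stream.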

3.2 The Formal Requirements

A predicate "gas_station" is given that describes the correct behaviours of the system gas station. This

predicate is a translation and precise formulation of the requirements given in section 2.1.

It uses three auxiliary predicates req1, req2 and req3, which are introduced for structuring reasons.

    gas_station(s) =def req1(s) ∧ req2(s) ∧ req3(s)

Requirements 1 and 3:

"All cars in the queues are served (either at pump A or at pump B)."

"The cars in each of the queues are served according to the FIFO strategy."

We combine these requirements and translate them into just one formula (req1(s)):

    req1(s) =def ∃ fct (act^ω, act^ω, act^ω) act^ω merge: MERGE(merge)
        ∧ ∀ act* t: t ⊑ s ⇒
            get_car({start(c,p,l) | car c, pump p, line l} © t) ⊑
            merge(accepted_cars(t,L1), accepted_cars(t,L2), accepted_cars(t,L3))
        ∧ get_car({start(c,p,l) | car c, pump p, line l} © s) =
            merge(accepted_cars(s,L1), accepted_cars(s,L2), accepted_cars(s,L3))

This means that after some finite amount of time not more cars could have been served than have arrived up

to this moment (first part of the formula) and in the end all cars have been served according to the FIFO

strategy (second part of the formula).

The predicate

    pred (fct (act^ω, act^ω, act^ω) act^ω) MERGE

is defined as follows:

    MERGE(f) :⇔ ∀ act^ω x, y, z: ∃ {1, 2, 3}^ω o:
        f(x,y,z) = sched(x,y,z,o) ∧ #(1 © o) = ∞ ∧ #(2 © o) = ∞ ∧ #(3 © o) = ∞

    where sched(x,y,z, i ∘ o) =
        if i = 1 ∧ x ≠ ε then first(x) ∘ sched(rest(x),y,z,o)
        else if i = 2 ∧ y ≠ ε then first(y) ∘ sched(x,rest(y),z,o)
        else if i = 3 ∧ z ≠ ε then first(z) ∘ sched(x,y,rest(z),o)
        else sched(x,y,z,o) fi

By this definition, every function f that satisfies MERGE is considered one instance of a merge

"function". Thus nondeterminism is not modelled by a set valued function, but by a set of functions (cf.

[Broy 88b]). Note that sched is not monotonic. Monotonicity is not required in this context, because we do not

want to give a semantics to a network of stream processing functions, but just want to express properties of

streams.
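
The role of the oracle in sched can be tried out on finite streams. The following Python sketch is an assumption-laden illustration: it uses a finite oracle (the specification requires an infinite oracle in which 1, 2 and 3 each occur infinitely often), and it simply stops when the oracle is exhausted.

```python
def sched(x, y, z, oracle):
    """Merge the finite streams x, y, z as directed by a finite oracle over {1, 2, 3}.

    An oracle entry that points at an exhausted stream is skipped,
    which corresponds to the last branch of sched.
    """
    streams = {1: list(x), 2: list(y), 3: list(z)}
    out = []
    for i in oracle:
        if streams[i]:
            out.append(streams[i].pop(0))
    return out

# Each oracle yields one instance of a merge "function".
print(sched(["a1", "a2"], ["b1"], ["c1"], [1, 2, 2, 3, 1]))  # ['a1', 'b1', 'c1', 'a2']
```

Different oracles give different interleavings of the same inputs, while the order inside each input stream is preserved. This is the sense in which the set of all functions satisfying MERGE models nondeterminism.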

The auxiliary function

    fct (act^ω, line) car^ω accepted_cars

is defined by:

    accepted_cars(t, l) = get_car({accept(c,l) | car c} © t)

Requirement 2:

"Cars in queue L1 can only be served at pump A, cars in queue L3 only at pump B, but cars in queue

L2 can be served at pump A or at pump B."

This could be done by one of the following equivalent versions:

    ∀ car c: ¬ start(c, A, L3) in s ∧ ¬ start(c, B, L1) in s

or by:

    ∀ car c, pump p, line l:
        start(c, p, l) in s ⇒ (p = A ∧ l = L1) ∨ (p = B ∧ l = L3) ∨ ((p = A ∨ p = B) ∧ l = L2)

But this is not even necessary. In the algebraic specification of the type ACTION we have already excluded

these actions, so they are not possible anyway. We see that some (safety) requirements can even be

expressed by choosing an appropriate set of actions.

Requirement 4:

"At each pump only one car at a time can be served."

Up to now informally we always talked of a "service", which sounds like an "action". In contrast to the

actions corresponding to the sort act, a service has a certain duration. Therefore it has been modelled by

the two actions "start(c,p,1)" and "stop(c,p)", marking the beginning and the end of a service.

The connection between "starts" and "stops" is made by yet another requirement: A service consists of a

start (of this service) and exactly one stop (of this service). So the overall requirement is req2:

    req2(s) =def ∀ act* t, pump p: t ⊑ s ⇒
            0 ≤ #({start(c,p,l) | car c, line l} © t) − #({stop(c,p) | car c} © t) ≤ 1
        ∧ ∀ act* t, act^ω t', line l: s = t ∘ start(c,p,l) ∘ t' ⇒ stop(c,p) in t'

The second conjunct has two aspects: after some service start of a car there is eventually the

corresponding service stop. But together with the first conjunct it also implies that after some

service start of a car the corresponding service stop is required.
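
On finite traces both conjuncts of req2 can be checked mechanically. The following Python sketch assumes the tuple encoding ("start", car, pump, line) and ("stop", car, pump); the function names are hypothetical, and the second check is only meaningful for a trace that is regarded as complete.

```python
def req2_safety(s, pumps=("A", "B")):
    """First conjunct on a finite trace: for every prefix and every pump,
    0 <= #starts - #stops <= 1."""
    open_services = {p: 0 for p in pumps}
    for a in s:
        if a[0] == "start":      # ("start", car, pump, line)
            open_services[a[2]] += 1
        elif a[0] == "stop":     # ("stop", car, pump)
            open_services[a[2]] -= 1
        if any(n < 0 or n > 1 for n in open_services.values()):
            return False
    return True

def req2_liveness(s):
    """Second conjunct on a complete finite trace: every start(c, p, l)
    is followed by a later stop(c, p)."""
    for i, a in enumerate(s):
        if a[0] == "start" and ("stop", a[1], a[2]) not in s[i + 1:]:
            return False
    return True

good = [("start", "c1", "A", "L1"), ("stop", "c1", "A"),
        ("start", "c2", "A", "L2"), ("stop", "c2", "A")]
bad = [("start", "c1", "A", "L1"), ("start", "c2", "A", "L2")]
print(req2_safety(good), req2_safety(bad))
```

In the trace bad, pump A is started twice without an intermediate stop, so the difference between starts and stops reaches 2 and the safety part fails.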

Requirement 5:

"If a queue is full and a car arrives there, then it is rejected, otherwise it is accepted."

If you are analyzing this informal requirement in more detail, you may find several possibilities to interpret

it. This depends on the time when the response (i.e. an acknowledgement or a rejection) has to appear.

A reasonable, still informal interpretation of this would be: a response follows within at most 10 seconds.

There is no notion of time in our model though. So we have to find a substitute for this interpretation.

Possibility 1: The response happens immediately after a car arrives. As a consequence the pair (arrival,

response) must be considered an atomic action. This implies that concurrently to a response no start or stop

of a service may occur. Formally:

    ∀ act* t, line l, car c: t ∘ arrive(c,l) ⊑ s ⇒
        ∃ act* t': t' ⊑ s
            ∧ (is_full(t,l) ⇒ t' = t ∘ arrive(c,l) ∘ reject(c,l))
            ∧ (¬is_full(t,l) ⇒ t' = t ∘ arrive(c,l) ∘ accept(c,l))

This, however, is a very strict requirement and may not be implementable.

Possibility 2: A response by the system must happen before another car arrives at the same queue.

Disadvantage: Is this not an unrealistic modelling? The time when the system must react depends on the

length of the time intervals between arrivals.

    ( ∀ act* t, t', line l:
        s = t ∘ arrive(c,l) ∘ t' ⇒ accept(c,l) in t' ∨ reject(c,l) in t' )
      {i.e. the response of the system will happen at all}

    ∧ ∀ act* t, line l:
        t ⊑ s ⇒ 0 ≤ #({arrive(c,l) | car c} © t)
                    − #({accept(c,l) | car c} © t)
                    − #({reject(c,l) | car c} © t) ≤ 1
      {i.e. if a second car arrives at the same queue, then the response of the system
       for the first car has taken place between the arrival of the first car and the second
       one}

Possibility 3: There is no requirement on the time when the response of the system has to occur, but

arrivals within a queue get their responses according to the FIFO discipline. Disadvantage: A second queue

may result before the gas station. Moreover a response may be delayed until there is a free place in the

queue again.

    ( ∀ act* t, t', line l:
        s = t ∘ arrive(c,l) ∘ t' ⇒ accept(c,l) in t' ∨ reject(c,l) in t' )
      {i.e. the response of the system will happen at all}

    ∧ ∀ line l:
        get_car({arrive(c,l) | car c} © s) = get_car({accept(c,l), reject(c,l) | car c} © s)
      {i.e. the responses come in the correct order}

    ∧ ( ∀ act* t, car c, line l: (t ∘ accept(c,l) ⊑ s ⇒ ¬is_full(t,l))
                               ∧ (t ∘ reject(c,l) ⊑ s ⇒ is_full(t,l)) )
      {i.e. the correct responses happen}

Considering these three possibilities, none of them seems quite satisfactory. Our approach is to take the

most liberal of the three possibilities, number 3, as req3(s) and leave a further restriction to the design

phase.

Remark: Here the auxiliary predicate "is_full(t, l)" is used.

It is defined as:

    is_full(t, l) = if l = L1 then
            #({accept(c,L1) | car c} © t) − #({start(c,A,L1) | car c} © t) = N
        elseif l = L2 then
            #({accept(c,L2) | car c} © t) −
            #({start(c,A,L2), start(c,B,L2) | car c} © t) = N
        elseif l = L3 then
            #({accept(c,L3) | car c} © t) − #({start(c,B,L3) | car c} © t) = N
        fi

(Please remember: N denotes the maximum length of the queues.)

At least at this point we see that statements about a "state" of the system cannot be made directly. We first

have to "calculate" the state from the history of the system. Thus a state can be seen as a somewhat

condensed history.
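
The idea of calculating the state from the history can be made concrete for finite histories. The following Python sketch assumes the tuple encoding ("accept", car, line) and ("start", car, pump, line), and the value N = 2 is chosen only for the example. Since the type ACTION excludes start(c,A,L3) and start(c,B,L1), counting all starts that name a line agrees with the case distinction in is_full.

```python
N = 2  # maximum queue length; the value 2 is an assumption for this example

def queue_length(t, line):
    """Number of cars waiting in `line` after the finite history t:
    accepted cars minus cars whose service has already started."""
    accepted = sum(1 for a in t if a[0] == "accept" and a[2] == line)
    started = sum(1 for a in t if a[0] == "start" and a[3] == line)
    return accepted - started

def is_full(t, line, n=N):
    """is_full(t, l): the queue of line l holds exactly n cars after history t."""
    return queue_length(t, line) == n

t = [("accept", "c1", "L1"), ("accept", "c2", "L1")]
print(is_full(t, "L1"), is_full(t + [("start", "c1", "A", "L1")], "L1"))
```

The pair (queue length per line) is exactly the kind of condensed history that the design specification later carries along as an explicit state.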

3.3 Formalizing the Service Strategy

The formalization of the chosen service strategy 6) reads:

    ∀ act* t, line l, l', car c, c1, c2:
        t ∘ start(c, p, l) ⊑ s
        ∧ ((p = A ∧ l' = L1) ∨ (p = B ∧ l' = L3))
        ∧ c1 = next_car(l', t) ∧ c2 = next_car(L2, t)
        ⇒ (last_arrive(c1, l', t) < last_arrive(c2, L2, t) ⇒ c = c1 ∧ l = l')
        ∧ (last_arrive(c1, l', t) > last_arrive(c2, L2, t) ⇒ c = c2 ∧ l = L2)

where

    fct (car, line, act*) nat last_arrive

is defined by

    last_arrive(c, l, t) = n ⇔def
        ∃ act* u, v: t = u ∘ arrive(c, l) ∘ v ∧ ¬ arrive(c, l) in v ∧ rest^n(t) = v

and

    fct (line, act*) car next_car

is defined by

    next_car(l, t) = c ⇔def
        get_car({start(c,p,l) | car c, pump p} © t) ∘ c ⊑ get_car({accept(c,l) | car c} © t)
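
For finite histories the two auxiliary functions can be computed directly. The following Python sketch is an illustration under assumptions: actions are tuples such as ("arrive", car, line), the value None stands for "undefined", and next_car relies on the FIFO property that the cars started from a line form a prefix of the cars accepted at that line.

```python
def last_arrive(c, line, t):
    """1-based position of the last arrive(c, line) in the finite stream t,
    i.e. the n with rest^n(t) = v; None if the car never arrived at that line."""
    target = ("arrive", c, line)
    for i in range(len(t) - 1, -1, -1):
        if t[i] == target:
            return i + 1
    return None

def next_car(line, t):
    """First car accepted at `line` whose service has not started yet,
    or None if no such car exists."""
    accepted = [a[1] for a in t if a[0] == "accept" and a[2] == line]
    started = [a[1] for a in t if a[0] == "start" and a[3] == line]
    if len(started) < len(accepted):
        return accepted[len(started)]
    return None

t = [("arrive", "c1", "L1"), ("accept", "c1", "L1"),
     ("arrive", "c2", "L2"), ("accept", "c2", "L2"),
     ("start", "c1", "A", "L1"),
     ("arrive", "c3", "L1"), ("accept", "c3", "L1")]
print(next_car("L1", t), next_car("L2", t))  # c3 c2
```

With this history, c2 (position 3) arrived before c3 (position 6), so the strategy would serve c2 from L2 at pump A next.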

3.4 Splitting-up the Requirements: Towards a Design Specification

3.4.1 Why and How to Split-up

Up to now the requirements have been directed to the system as a whole. What systems do we take into

account? We consider systems that constitute a new or different automation in some organizational area.

This automation is a result of the cooperation of the software and/or hardware product and the environment

(i.e. the organizational area itself). Thus the requirements are directed both to the product and to the

environment.

Therefore it is necessary to decide which entity (either the product or the environment) is responsible for

which part of the requirements. This splitting-up of the requirements has to be carried out according to the

paradigm of "rely and guarantee". This notion was first mentioned in [Jones 83]. We only use Jones'

general idea about that topic, not his theory, because he has a shared variable approach. The idea is: if the

environment sticks to the rules (its part of the requirements), also the product (software and/or hardware

system) sticks to the rules, and vice versa.

In this way the requirements are structured. Having done this for two entities (product, environment), is it

reasonable to further split up these entities?

To answer this question you may think of the many cases where beside the decomposition into the product

and the environment other decompositions appear natural:

Examples:

1) The underlying hardware is distributed. For example think of a computer network and of the software

that should be produced for this distributed structure. If the computers of this network are identical, a

decomposition of the requirements is not absolutely necessary at this moment. It may take place during the

design specification.

However, if the computers are only well-suited for a particular task, a further decomposition is reasonable.

This may be so in the case of a network composed of high performance number crunchers,

microcomputers, database machines, etc.

The same applies to a lower level (cf. [Bode, Händler 83]): In one computer there may be several

processors (multiprocessor). These processors can be specialized (e.g. a co-processor for floating point

calculations) or not (homogeneous multiprocessor).

Also on the level of the processing elements there may be a replication of these. Again these may be

identical (array processors: parallel operation in lock step manner) or specialized (pipeline computers).

Even on the level of microprogramming pipelining is frequently used.

2) The application is distributed.

This means that there is an inherent parallelism in an application. For example, we may think of three units

of a large organisation where each of these units can nearly independently process large amounts of data.

From time to time, however, some communication and cooperation becomes necessary, for example an

access to remote data. A distributed database seems to be a reasonable choice.

Of course this also implies a distributed hardware. In contrast to the first example these systems are mostly

loosely coupled. Moreover the purpose here is not to run nearly arbitrary programs on a given hardware,

but we look for a distributed hardware and/or software system for a small class of applications only.

With respect to the gas station example this means:

1. All requirements will be split up into (conditional) requirements on the (hardware and/or software)

system and on the environment (conditional with respect to the "rely and guarantee" paradigm).

2. Furthermore the system can be decomposed, e.g. into a controller1, which supervises the access to the

three queues, and a controller2 assigning the cars from the queues to the pumps.

These two steps can be carried out sequentially (i.e. step 2 after step 1) or in parallel.

3.4.2 The Process of Decomposition in Detail

Any functional unit is only responsible for the actions which it actually can carry out. (Example: The arrival

of a car cannot be influenced by some controller of the gas station.)

In the first step the set of actions has to be partitioned. The parts have to correspond to the domain of

responsibility of the respective unit. It might happen that in the first version of the requirement specification

(without decomposition) an action which requires participation of several units has been considered

observable. Such actions have to be split up into several actions, each one being in the domain of

responsibility of exactly one unit.

In the second step requirements have to be formulated both for the system (e.g. the controller) and the

environment, each one according to the "rely and guarantee" paradigm.

In the case of the gas station specification this means:

Considering the set of actions there exists a clear separation:

- The environment is responsible for "arrive"- and "stop"-actions. This is because only the environment

"decides" when a car does arrive. With regard to "stop", let us interpret that it is the customer who has to

decide when he has got enough fuel. So the "stop" of a service depends only on this decision.

- The system is responsible for "accept", "reject" and "start". These are the messages which make up the

automation of the gas station.

So in this case no further splitting of actions is necessary, thus the first step (see above) can be dropped.
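
The separation of responsibilities described above amounts to a partition of the action names. A minimal Python sketch, assuming that actions are tuples whose first component is the action name (the function names are hypothetical):

```python
ENV_ACTIONS = {"arrive", "stop"}                # decided by the environment
SYSTEM_ACTIONS = {"accept", "reject", "start"}  # issued by the system

def responsible(action):
    """Return the unit that is responsible for the given action."""
    kind = action[0]
    if kind in ENV_ACTIONS:
        return "environment"
    if kind in SYSTEM_ACTIONS:
        return "system"
    raise ValueError(f"unknown action: {kind}")

def split(s):
    """Project a finite action stream onto the two domains of responsibility."""
    env_part = [a for a in s if a[0] in ENV_ACTIONS]
    system_part = [a for a in s if a[0] in SYSTEM_ACTIONS]
    return env_part, system_part

s = [("arrive", "c1", "L1"), ("accept", "c1", "L1"),
     ("start", "c1", "A", "L1"), ("stop", "c1", "A")]
print(split(s))
```

Because the two sets are disjoint and cover all action names of the type ACTION, every action lies in the domain of responsibility of exactly one unit.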

The system "gas station" can be decomposed into three parts:

[Figure 3: decomposition of the gas station into three parts; diagram not recoverable]

These parts interact in the following way:

[Figure 4: interaction of the three parts; diagram not recoverable]

1) C1: This is the access controller. It registers which cars have arrived at the gas station. Moreover it

knows the contents of the three queues and using this information it decides whether a car can be accepted

or must be rejected.

2) C2: pump controller. It gets a message from the environment when a pump becomes idle. So it always

knows if a service at one of the pumps is possible and gives the "next" car permission to use the pump.

This controller need not know the exact state of the queues. It suffices (at least for the strategies mentioned)

for this controller to recognize whether a queue is empty or not.

3) env: environment. It "decides" when cars do arrive and when after a start of a service (this is a "rely",

i.e. the start of the service must have been established by the system) a stop of a service happens. The

environment also receives the responses of the system (accept / reject).

As we see, with this decomposition of requirements we have already taken a step towards a design

specification where the whole system is described as a net of stream-processing functions. In contrast to

the design specification, in this section we have only described precisely the requirements on the individual

parts of the whole system, not in which way they have to be fulfilled.

For these individual parts here it is not yet necessary to determine by which part the "rely"-actions are

performed. It is not of any importance which part gets the results of the actions as well, the actions just

have to happen. This interaction will not be considered until we enter the design specification phase.

Example:

On the level of the requirement specification the following part from the picture above

[Figure 5: part of the network with the labels start and accept; diagram not recoverable]

is equivalent to:

[Figure 6: the same part with the labels stop, accept and start; diagram not recoverable]

That is to say: "If at certain times accepts and arrives happen, at certain (other) times starts have to happen."

Therefore in the case of the gas station specification it is also possible to regard C1 and env together as a

more complex environment env' modelling only the "external behaviour" relevant to C2. This

corresponds to a system where now and then the queue gets new cars, but it is not described how this

actually happens. Thus the process of arrival is no longer described.

Requirements on C1:

This is only the requirement 5). Considering the corresponding formulas we note that the form is in each of

the three cases:

If certain requirements on a prefix of a permitted behaviour w.r.t. arrive-actions are fulfilled, then certain

requirements on accept- and reject-actions must be ensured.

(Moreover these requirements depend on a predicate is_full (constituting another "rely"), which is a

shorthand for a formula expressing requirements on accept- and previous start-actions.)

Requirements on C2:

These are the requirements:

1) and 3): If certain requirements on a (permitted) behaviour w.r.t. accept-actions are fulfilled, then certain

requirements on the behaviour w.r.t. start-actions must be ensured.

4) At first sight you could consider this a requirement on the environment. However, the meaning is: Only

then must a start-action at a certain pump happen, if a stop-action has happened before.

Requirement on env:

This is a part of requirement 4): After a start of a service, exactly one stop must happen.

The splitting of requirements has not yet been much investigated. We would have to prove that the rely

condition is implied by the guarantee conditions of the other modules.

Moreover, the "rely and guarantee" paradigm is used here only in an informal way as a guideline to make

the transition from the requirement specification to the design specification easier.

4. Design Specification

4.1 Design with One Control Module

In this section the whole system is modelled by just two modules (some people call them also agents): a

controller and the environment. The requirement specification leaves it open whether to describe the system

by two modules or more. The two module case is included to demonstrate by a simpler example the general

description technique used in design specifications. Similarly the split-up requirements of section 3.4.2 will

be used to make a design with three modules (environment, access controller, pump controller) in section

4.2.

With two modules we have a simple reactive system: The controller gets requests (arrive-actions) from the

environment env and reacts to them with responses (accept or reject). Moreover it starts the services

(start). The environment receives these signals and reacts to them correspondingly.

Thus we get a feedback loop:

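[Figure 7: feedback loop between controller and env. The controller emits the stream of { accept, reject, start } actions, which is the input of env; env emits the stream of { arrive, stop } actions, which is the input of the controller. The bracketed labels [arrive] and [stop] mark where these actions come from.]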

In Fig. 7 the position of the actions in square brackets denotes where they come from: stop-actions depend

on the input actions of env, whereas arrive-actions do not. They are independent of what the controller

does, thus they could be thought of as springing from some source outside the system. Of course, this

indicates the possibility of decomposing the environment into two parts.

Starts are causal for stops; arrive-actions may be put into the stream of input actions for the controller at

arbitrary times.

Two new sorts are introduced:

- sort eact: These are the environment actions (arrive, stop).

- sort cact: These are the controller actions (accept, reject, start).

Deterministic modules are modelled in our formalism by stream processing functions, which are

continuous functions taking tuples of input streams and producing one output stream. The controller turns

out to be deterministic.

The module env acts nondeterministically, as it may issue arrive-actions at arbitrary times. In our

formalism nondeterministic modules are modelled by sets of deterministic stream processing functions.

Every function of a set corresponds to one particular behavior of the module. Another way of modelling

nondeterminism, using relations instead of functions, leads to an intricate semantic problem (the

merge-anomaly, cf. [Brock, Ackerman 81]).

Assume we have two functions

    fct (eact^ω) cact^ω controller,
    fct (cact^ω) eact^ω env

describing one behavior of the controller and the environment respectively. Then we are looking for the

least fixpoints of the following equations:

    env(controller(x)) = x
    ∧ controller(env(y)) = y.

The connection to the requirement specification is made by being able to show for the least fixpoint x that:

    trace(controller, x, t) ⇒ gas_station(t)    (*)

The left of the arrow is the subject of the design specification, whereas the right of the arrow is given by

the requirement specification.

"trace" can be defined:

    rel (fct (eact^ω) cact^ω, eact^ω, act^ω) trace

    trace(f,x,t) ⇔ x = (eact © t) ∧ f(x) = (cact © t) ∧
        ∀ act^ω r: r ⊑ t ⇒ (cact © r) ⊑ f(eact © r)
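
On finite streams the relation trace can be evaluated directly. The following Python sketch assumes that actions are tuples whose first component is the action name and that f is an ordinary function from lists to lists; toy_controller is a hypothetical stand-in, not the controller of this paper.

```python
EACT = {"arrive", "stop"}             # environment actions
CACT = {"accept", "reject", "start"}  # controller actions

def project(kinds, t):
    """kinds (c) t: keep only the actions whose name is in `kinds`."""
    return [a for a in t if a[0] in kinds]

def is_prefix(s, t):
    return s == t[:len(s)]

def trace(f, x, t):
    """t is an interleaving of input x and output f(x) in which no output
    appears before the input that causes it."""
    if x != project(EACT, t) or f(x) != project(CACT, t):
        return False
    return all(
        is_prefix(project(CACT, t[:k]), f(project(EACT, t[:k])))
        for k in range(len(t) + 1)
    )

def toy_controller(x):
    """Hypothetical stand-in: accepts every arrival and ignores stops."""
    return [("accept", a[1], a[2]) for a in x if a[0] == "arrive"]

x = [("arrive", "c1", "L1")]
print(trace(toy_controller, x, [("arrive", "c1", "L1"), ("accept", "c1", "L1")]))
print(trace(toy_controller, x, [("accept", "c1", "L1"), ("arrive", "c1", "L1")]))
```

The second call yields False: both projections are correct, but the accept occurs before the arrival that causes it, which the third conjunct of trace rules out.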

(*) is the goal that has to be established. There is no easy way to reach this goal yet. Proofs are often

difficult (cf. [Broy, Streicher 87], [Broy 88a]); they could be referred to as "ingenuous correctness proofs"

[Loeckx, Sieber 84] as opposed to a verification calculus. Still we think requirement specification and

design specification are close, because both are based on the formalism of streams.

The behavior of the controller and the environment can be modelled with the help of auxiliary functions

which do not only depend on the input stream, but also on a state. The state concept is introduced as a

means to make specifying stream processing functions easier. The state comprises the condensed input

history up to some moment.

Here we have:

    controller(s) = hc(s, emptyqs, { }, 0),
    env(s) = he(s, ec, ec).

The auxiliary functions hc and he have the functionalities:

    fct (eact^ω, queues, pumpset, nat) cact^ω hc,
    fct (cact^ω, cell, cell) eact^ω he.

We refer to algebraic specifications describing queues and pumpset.

pumpset simply is the sort that characterizes all subsets of the set {A,B}. The elements of ps (pump set)

denote which pumps are currently used.

The variable qs (queues) of sort queues describes the current state of the three queues corresponding to

lines L1, L2 and L3. Trivially queues can be seen as the cartesian product of queues corresponding to

the lines. We use projection functions queue(l,qs) to give the queue corresponding to line l. The usual

operations on queues are assumed to be given (first, rest, etc.). mk_queues(l,q,qs) gives a new

configuration of the queues where we take queue q as the new contents of the queue corresponding to l.

The queues used here consist of pairs of cars and natural numbers, the natural number giving a timestamp

for the arrival of the car. This timestamp is simply a sequence number. It is also the last parameter of the

function hc. This number is updated every time a new car is accepted and enters a queue.

We omit the algebraic specifications of the types corresponding to queues and pumpset for

brevity.

The sort cell springs from the type CELL, which is similar to the standard type QUEUE:

    type CELL =
        sort cell,
        fct cell ec,    { empty cell }
        fct (cell) bool is_empty,
        fct (car) cell put,
        fct (cell) car cont,    { contents }
        is_empty(ec) = true,
        cont(put(c)) = c,
        is_empty(put(c)) = false
    end of type
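
The type CELL describes a container for at most one car. One possible model of its three axioms is sketched below in Python; the representation by an optional field is an assumption of this illustration, and cont is left unspecified for the empty cell by the axioms, so the sketch raises an error in that case.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Cell:
    """A cell holds at most one car; None models the empty cell."""
    car: Optional[str] = None

ec = Cell()  # the empty cell

def put(c: str) -> Cell:
    return Cell(c)

def is_empty(cell: Cell) -> bool:
    return cell.car is None

def cont(cell: Cell) -> str:
    if cell.car is None:
        raise ValueError("cont is not specified for the empty cell")
    return cell.car

# The three axioms of CELL hold in this model:
print(is_empty(ec), cont(put("c1")), is_empty(put("c1")))  # True c1 False
```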

The auxiliary function hc is specified:

    hc(arrive(c,l) ∘ t, qs, ps, nr) =
        if A ∉ ps ∧ (l = L1 ∨ l = L2) then
            accept(c,l) ∘ start(c,A,l) ∘ hc(t, qs, ps ∪ {A}, nr+1)
        else if B ∉ ps ∧ (l = L2 ∨ l = L3) then
            accept(c,l) ∘ start(c,B,l) ∘ hc(t, qs, ps ∪ {B}, nr+1)
        else if is_full(pump(l)) then reject(c,l) ∘ hc(t, qs, ps, nr)
        else accept(c,l) ∘ hc(t, mk_queues(l, append((c,nr), queue(l,qs)), qs), ps, nr+1) fi,

    (p = A ∧ l = L1) ∨ (p = B ∧ l = L3) ⇒
        (¬is_empty(queue(l,qs)) ∧ ¬is_empty(queue(L2,qs))
            ∧ first(queue(l,qs)) = (c1,n1) ∧ first(queue(L2,qs)) = (c2,n2)
            ⇒ (n1 > n2 ⇒ hc(stop(c,p) ∘ t, qs, ps, nr) =
                    start(c1,p,l) ∘ hc(t, mk_queues(l, rest(queue(l,qs)), qs), ps, nr))
              ∧ (n1 < n2 ⇒ hc(stop(c,p) ∘ t, qs, ps, nr) =
                    start(c2,p,L2) ∘ hc(t, mk_queues(L2, rest(queue(L2,qs)), qs), ps, nr)))
        ∧ (¬is_empty(queue(l,qs)) ∧ is_empty(queue(L2,qs))
            ⇒ hc(stop(c,p) ∘ t, qs, ps, nr) =
                    start(first(queue(l,qs)), p, l) ∘ hc(t, mk_queues(l, rest(queue(l,qs)), qs), ps, nr))
        ∧ (is_empty(queue(l,qs)) ∧ ¬is_empty(queue(L2,qs))
            ⇒ hc(stop(c,p) ∘ t, qs, ps, nr) =
                    start(first(queue(L2,qs)), p, L2) ∘ hc(t, mk_queues(L2, rest(queue(L2,qs)), qs), ps, nr))
        ∧ (is_empty(queue(l,qs)) ∧ is_empty(queue(L2,qs))
            ⇒ hc(stop(c,p) ∘ t, qs, ps, nr) = hc(t, qs, ps − {p}, nr))
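
For finite input streams the behaviour described by hc can be simulated by an ordinary loop over the input with an explicit state. The following Python sketch is not the specification itself but one reading of it under stated assumptions: the full test is taken to mean that the queue of the line already holds n cars, the car with the smaller sequence number is served first (the strategy of section 3.3), and the maximum queue length n = 2 is an example value.

```python
from collections import deque

def controller(inputs, n=2):
    """Sketch of the one-module controller on a finite input stream.
    State: the three queues (car, sequence number), the set of busy pumps,
    and the next sequence number nr."""
    queues = {"L1": deque(), "L2": deque(), "L3": deque()}
    busy = set()
    nr = 0
    out = []
    for act in inputs:
        if act[0] == "arrive":
            _, c, l = act
            if "A" not in busy and l in ("L1", "L2"):
                out += [("accept", c, l), ("start", c, "A", l)]
                busy.add("A")
                nr += 1
            elif "B" not in busy and l in ("L2", "L3"):
                out += [("accept", c, l), ("start", c, "B", l)]
                busy.add("B")
                nr += 1
            elif len(queues[l]) >= n:          # assumed meaning of the full test
                out.append(("reject", c, l))
            else:
                out.append(("accept", c, l))
                queues[l].append((c, nr))
                nr += 1
        elif act[0] == "stop":
            _, _, p = act
            own = "L1" if p == "A" else "L3"   # the line served only by pump p
            q_own, q_mid = queues[own], queues["L2"]
            if q_own and q_mid:
                # assumption: the earlier arrival (smaller number) goes first
                line = own if q_own[0][1] < q_mid[0][1] else "L2"
            elif q_own:
                line = own
            elif q_mid:
                line = "L2"
            else:
                busy.discard(p)                # nobody waits: the pump becomes idle
                continue
            car, _ = queues[line].popleft()
            out.append(("start", car, p, line))
    return out

print(controller([("arrive", "c1", "L1"), ("arrive", "c2", "L1"),
                  ("arrive", "c3", "L2"), ("stop", "c1", "A")]))
```

In this run c1 is served at pump A immediately, c2 has to wait in L1, c3 is served at pump B, and the stop of c1 causes the start of c2 at pump A.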

ENV describes the auxiliary functions modelling the environment:

    ENV(f) :⇔ ∃ fct (cact^ω, cell, cell) eact^ω f': ENV(f')
        ∧ ∀ cact^ω t, cell cA, cB, car c, line l:
            (( f(ε,cA,cB) = ε
              ∧ f(accept(c,l) ∘ t, cA, cB) = f'(t,cA,cB)
              ∧ f(reject(c,l) ∘ t, cA, cB) = f'(t,cA,cB)
              ∧ (is_empty(cA) = true ⇒
                    f(start(c,A,l) ∘ t, cA, cB) = f'(t, put(c), cB))
              ∧ (is_empty(cB) = true ⇒
                    f(start(c,B,l) ∘ t, cA, cB) = f'(t, cA, put(c))) )
            ∨ (is_empty(cA) = false ∧
                    f(t,cA,cB) = stop(cont(cA), A) ∘ f'(t,ec,cB))
            ∨ (is_empty(cB) = false ∧
                    f(t,cA,cB) = stop(cont(cB), B) ∘ f'(t,cA,ec))
            ∨ f(t,cA,cB) = arrive(c,l) ∘ f'(t,cA,cB) )

4.2 Design with Two Control Modules

In the same way as shown in the last section we can also refer to the split-up requirements (cf. section 3.4)

and give a design with three modules: two control modules (C1 and C2) and the environment (env).

[Figure 8: network of the three modules C1, C2 and env; diagram not recoverable]

In contrast to the last section, not all of the output of one module will be the input to each of the other

modules. The input of one module does not necessarily consist of all of the outputs of the other modules as

well. For example, the input to C1 is only a stream of start- and arrive-actions, whereas its output consists of two

streams, one of accept- and reject-actions and one of accept-actions only.

We observe that the module C1 would have to merge its input streams and produce two output streams

(this could be modelled by two functions C1' and C1'' or by one function giving a tuple of streams as

output).

This, however, can also be expressed uniformly by a network of functions in the following way: the input

streams are merged to one input stream. (This technique is known from data transmission as multiplexing.) The function produces exactly one output stream, which is then split up into the number of output streams

needed (demultiplexing). These output streams are then filtered. The result is a network of stream

processing functions, which often are nondeterministic. (This idea of networks and typed channels goes

back to [Kahn 74]; there, however, only deterministic functions are used.)
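
The scheme of merging, processing, splitting and filtering can be sketched for finite streams in Python. All names below are hypothetical: multiplex uses a finite oracle of stream indices to fix one interleaving, demultiplex copies the single output stream and applies one filter per output channel, and toy_c1 is merely a stand-in for the function of a module.

```python
def multiplex(streams, oracle):
    """Merge several finite input streams into one, as directed by a finite
    oracle of stream indices (one oracle = one possible interleaving)."""
    pending = [list(s) for s in streams]
    merged = []
    for i in oracle:
        if pending[i]:
            merged.append(pending[i].pop(0))
    return merged

def demultiplex(stream, filters):
    """Split one output stream into named output streams; each output
    keeps only the actions that pass its filter."""
    return {name: [a for a in stream if keep(a)] for name, keep in filters.items()}

def toy_c1(merged):
    """Hypothetical stand-in for the function of C1:
    rejects car 'c2' and accepts every other arrival."""
    out = []
    for a in merged:
        if a[0] == "arrive":
            kind = "reject" if a[1] == "c2" else "accept"
            out.append((kind, a[1], a[2]))
    return out

arrivals = [("arrive", "c1", "L1"), ("arrive", "c2", "L1")]
starts = [("start", "c1", "A", "L1")]
merged = multiplex([arrivals, starts], [0, 1, 0])
outputs = demultiplex(toy_c1(merged), {
    "to_env": lambda a: a[0] in ("accept", "reject"),
    "to_C2": lambda a: a[0] == "accept",
})
print(outputs)
```

Here the output named to_env carries accept- and reject-actions, whereas to_C2 carries accept-actions only, as described for C1 in the text.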

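The following picture shows how this works in general. We use the symbols known from dataflow notation.

[Figure 9: general scheme of a module as a network: merge of the input streams, the function, splitting of the output stream, and filters; diagram not recoverable]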

Thus we get the following three modules, which can be decomposed in the way described above:

a) [Figure 10: module C1 with the input streams start and arrive and the output streams accept and (accept, reject)]

   C1 is the access controller. Its state is the contents of the three queues.

b) [Figure 11: module C2 with the input streams stop and accept and the output stream start]

   C2 is the pump controller. Its state is the pump contents and the contents of the three queues, every entry

   having a time stamp.

c) [Figure 12: module env with the input streams (accept, reject) and start and the output streams arrive and stop]

   env is the environment. Its state is the contents of the pumps.

These three modules are specified by the following functions (Notation: {start, arrive} is the sort which

comprises all permitted actions "start(c,p,l)" and "arrive(c,l)", etc.):

    fct ({start, arrive}^ω) {accept, reject}^ω C1

is defined with the help of function hc1 by:

    C1(t) = hc1(t, emptyqs').

    fct ({stop, accept}^ω) {start}^ω C2

is defined with the help of function hc2 by:

    C2(t) = hc2(t, emptyqs, { }, 0).

The function hc1 has a parameter of sort queues'. It is very similar to queues, the only difference

being that the elements of queues' do not have a timestamp. We can interpret this to mean that only the pump

controller C2 needs this information to fulfil its serving strategy (cf. section 3.3).

    fct ({start, arrive}^ω, queues') {accept, reject}^ω hc1

is defined by:

    hc1(arrive(c,l) ∘ t, qs) =
        if is_full(l,qs) then reject(c,l) ∘ hc1(t,qs)
        else accept(c,l) ∘ hc1(t, mk_queues(l, append(c, queue(l,qs)), qs)) fi,

    ((p = A ∧ (l = L1 ∨ l = L2)) ∨ (p = B ∧ (l = L2 ∨ l = L3))) ∧ ¬is_empty(l,qs) ⇒
        hc1(start(first(queue(l,qs)), p, l) ∘ t, qs) =
            hc1(t, mk_queues(l, rest(queue(l,qs)), qs)).
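
The access controller can be simulated on finite input streams by a loop with the three queues as its state. The following Python sketch rests on assumptions: is_full(l,qs) is taken to mean that the queue of line l holds n cars, n = 2 is an example value, and a start-action that does not name the first car of its queue (a case for which hc1 gives no equation) is reported as an error.

```python
from collections import deque

def access_controller(inputs, n=2):
    """Sketch of C1 on a finite stream of arrive- and start-actions.
    The state holds the queues without timestamps."""
    queues = {"L1": deque(), "L2": deque(), "L3": deque()}
    out = []
    for act in inputs:
        if act[0] == "arrive":
            _, c, l = act
            if len(queues[l]) >= n:      # assumed meaning of is_full(l, qs)
                out.append(("reject", c, l))
            else:
                out.append(("accept", c, l))
                queues[l].append(c)
        elif act[0] == "start":
            _, c, p, l = act
            # hc1 is only specified for a start of the first car of a non-empty queue
            if queues[l] and queues[l][0] == c:
                queues[l].popleft()
            else:
                raise ValueError(f"unexpected start of {c} at line {l}")
    return out

print(access_controller([("arrive", "c1", "L1"), ("arrive", "c2", "L1"),
                         ("start", "c1", "A", "L1"), ("arrive", "c3", "L1")], n=1))
```

With n = 1 the second arrival is rejected because c1 still occupies the queue; after the start of c1 the place is free again and c3 is accepted.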

    fct ({stop, accept}^ω, queues, pumpset, nat) {start}^ω hc2

is defined by:

    hc2(accept(c,l) ∘ t, qs, ps, nr) =
        if A ∉ ps ∧ (l = L1 ∨ l = L2) then
            start(c,A,l) ∘ hc2(t, qs, ps ∪ {A}, nr+1)
        else if B ∉ ps ∧ (l = L2 ∨ l = L3) then
            start(c,B,l) ∘ hc2(t, qs, ps ∪ {B}, nr+1)
        else if is_full(pump(l)) then hc2(t, qs, ps, nr)
        else hc2(t, mk_queues(l, append((c,nr), queue(l,qs)), qs), ps, nr+1) fi,

    (p = A ∧ l = L1) ∨ (p = B ∧ l = L3) ⇒
        (¬is_empty(queue(l,qs)) ∧ ¬is_empty(queue(L2,qs))
            ∧ first(queue(l,qs)) = (c1,n1) ∧ first(queue(L2,qs)) = (c2,n2)
            ⇒ (n1 ≥ n2 ⇒ hc2(stop(c,p) ∘ t, qs, ps, nr) =
                    start(c1,p,l) ∘ hc2(t, mk_queues(l, rest(queue(l,qs)), qs), ps, nr))
              ∧ (n1 < n2 ⇒ hc2(stop(c,p) ∘ t, qs, ps, nr) =
                    start(c2,p,L2) ∘ hc2(t, mk_queues(L2, rest(queue(L2,qs)), qs), ps, nr)))
        ∧ (¬is_empty(queue(l,qs)) ∧ is_empty(queue(L2,qs))
            ⇒ hc2(stop(c,p) ∘ t, qs, ps, nr) =
                    start(first(queue(l,qs)), p, l) ∘ hc2(t, mk_queues(l, rest(queue(l,qs)), qs), ps, nr))
        ∧ (is_empty(queue(l,qs)) ∧ ¬is_empty(queue(L2,qs))
            ⇒ hc2(stop(c,p) ∘ t, qs, ps, nr) =
                    start(first(queue(L2,qs)), p, L2) ∘ hc2(t, mk_queues(L2, rest(queue(L2,qs)), qs), ps, nr))
        ∧ (is_empty(queue(l,qs)) ∧ is_empty(queue(L2,qs))
            ⇒ hc2(stop(c,p) ∘ t, qs, ps, nr) = hc2(t, qs, ps − {p}, nr))

The predicate ENV describing the environment is the same as in the last section.

The proof obligation to relate the design specification to the requirement specification is similar to that of

section 4.1, but more complex, because more streams have to be considered.

5. Conclusion

In this paper I concentrated on the application of the first two phases of the design methodology by

Manfred Broy to a concrete example. The questions asked in section 1 can now be answered:

1. One may argue that the formulas presented here are difficult to read. But it must be objected that distributed systems are a complex area. Thinking in the formalism of streams and logical formulas is new to most programmers, so learning this new formalism is of course not easy for them. Some acquaintance is necessary to use the methodology effectively.

The readability, however, could be improved by macros for expressing typical and recurrent situations. (Some of them have been presented in section 3.1.) Thus the language used here can be regarded as a kernel language for specification. A pragmatic extension of it with shorthands will prove useful.


2. During the whole specification process a finer structuring is reasonable and helpful. An example is not to go directly from the requirements to the design, but, if necessary, to first split up the requirements. Ways of structuring a specification are:

a) the rely- and guarantee-decomposition,

b) the safety- and liveness-decomposition.

These decompositions are orthogonal. As yet this subject has only been touched on; further investigations are necessary.

3. Though the transition from one phase to the next can be formally verified (examples can be found in [Broy 88a]), there is still no concept to guide the proofs. What is needed is a calculus and further methodological means to overcome this difficulty. I think of something like the Hoare calculus for the verification of sequential programs. The decompositions mentioned under 2 may prove helpful.

The most difficult steps are those from the informal to the formal requirements and from the requirement specification to the design specification. Most naturally, the first one is just the gap between informality and formality.

We also notice that the concept of a feedback loop as demonstrated for two units (controller, environment) generalizes most naturally to an arbitrary number of units. Feedback, or more generally, communication takes place between these units. Considering the environment as (at least) one of these units allows for a uniform treatment of the whole system.

Not all of the ideas presented here are new. For example, in the ERAE approach to requirements engineering [Dubois et al. 88] the splitting up of the requirements and the integration of the environment appear as well, though there temporal logic serves as a basis. The difference lies in a methodology for all phases of specification and design. The transition from one phase to the next can be assisted by formal means. A verification that the transition is correct is possible.

Acknowledgement I would like to thank Manfred Broy and Martin Wirsing for their encouragement and helpful suggestions.

References

[Bode, Händler 83] A. Bode, W. Händler: Rechnerarchitektur II. Berlin-Heidelberg-New York-Tokyo, Springer, 1983

[Brock, Ackerman 81] J.D. Brock, W.B. Ackerman: Scenarios: A Model of Non-deterministic Computation. In: J. Diaz, I. Ramos (Eds.): Foundations of Programming Concepts, Intern. Coll. Peniscola, Spain, 1981, Springer Lecture Notes in Computer Science 107, pp. 252-259

[Broy, Streicher 87] M. Broy, T. Streicher: Specification and Design of Shared Resource Arbitration. MIP-8721, Universität Passau, 1987

[Broy 88a] M. Broy: An Example for the Design of Distributed Systems in a Formal Setting: The Lift Problem. MIP-8802, Universität Passau, 1988

[Broy 88b] M. Broy: Towards a Design Methodology for Distributed Systems. In: M. Broy (Ed.): Constructive Methods in Computing Science. Berlin-Heidelberg-New York-Tokyo, Springer, 1989, pp. 331-364

[Dubois et al. 88] E. Dubois, J. Hagelstein, A. Rifaut: Formal Requirements Engineering with ERAE. Draft version, to appear in the Philips Journal of Research, October 1988

[Jones 83] C.B. Jones: Tentative Steps Toward a Development Method for Interfering Programs. ACM Transactions on Programming Languages and Systems, Vol. 5, No. 4, October 1983, pp. 596-619

[Kahn 74] G. Kahn: The Semantics of a Simple Language for Parallel Programming. Information Processing 74, North-Holland Publishing Company, 1974

[Kröger 87] F. Kröger: Abstract modules: Combining algebraic and temporal logic specification means. Technique et Science Informatiques, Vol. 6, No. 6, 1987, pp. 559-573

[Lamport 89] L. Lamport: A Simple Approach to Specifying Concurrent Systems. CACM, Vol. 32, No. 1, January 1989, pp. 32-45

[Loeckx, Sieber 84] J. Loeckx, K. Sieber: The Foundations of Program Verification. Wiley-Teubner, 1984

[Wirsing et al. 83] M. Wirsing, P. Pepper, H. Partsch, W. Dosch, M. Broy: On Hierarchies of Abstract Data Types. Acta Informatica 20, 1983, pp. 1-33

Transformations of Designs

L.M.G. Feijs * Philips Research Laboratories Eindhoven

P.O. Box 80000, 5600 JA Eindhoven, The Netherlands

Abstract

In this paper we present a theory of correctness-preserving transformations of designs. The paper gives an informal introduction to both the structuring concept of a "design" and to certain dynamic aspects of the software development process. There is a focus on combining designs, strategies for growing designs and re-adapting them to external-context modifications. Although the notion of a design is part of the language COLD, the presentation in this paper is given independent of that, in a general setting.

1 Introduction and Motivation

This paper is about the component-wise construction and specification of complex systems, addressing issues of modularisation, abstraction and information hiding. Formal specification techniques have received much attention during the past decade and many specification formalisms have been proposed, ranging from algebraic specification languages (see [1] for an overview) to wide-spectrum languages (such as VDM [2],[3], Z [4] and COLD [5]). The underlying philosophy behind these languages, although sometimes implicit, is that software systems are becoming increasingly complex and that therefore it is necessary to have (formal) specifications of parts of a system, where these specifications are optimal with respect to understandability, rather than efficiency.

The basic idea underlying our formal notion of a "design" is an immediate consequence of this philosophy. In our approach a design is a hierarchically structured and component-wise specified software system. This is not the same as a hierarchically structured system denoted in e.g. a specification language based on an algebraic approach to module composition [6]. The essential point is that in a design, each component can have two descriptions associated with it. Furthermore each component can have a name and each such name can be used to refer to the corresponding component. A design contains a number of these components.

*the work reported in this paper has been performed in ESPRIT project 432: Meteor.


There are non-trivial issues of information hiding that arise in connection with these designs. This leads us to definitions of black-box correctness (based on the exclusive use of specifications) and glass-box correctness (using implementation knowledge). As it turns out, the λπ-calculus [8] can be used to give an interpretation of these designs and reduction relations can be used to give alternative characterisations of black-box correctness and glass-box correctness. The theory of designs has been worked out formally in [7] and [8]. This formal notion of a design is available in the language COLD-K [5]. There are also extensions of ASL [9] where it is possible to have components and in fact it is possible to add designs on top of any specification formalism which is based on an algebraic approach to module composition [6].

A 'design' is nothing but a static structure and from a methodological point of view it is important to study the dynamic aspects of the software development process as well. These dynamic aspects are the main topic of this paper. In particular we shall study some mechanisms for combining designs, strategies for growing designs and for re-adapting them to external-context modifications. Although the work presented in this paper has been performed with direct application to COLD in mind, the results are quite independent of COLD and they will be presented as such.

The structure of this paper is as follows. In section 2 we give a brief introduction to the theory of designs. In section 3 we show an example of a development process where we construct, somewhat artificially, a simple design in a top-down manner. The example is taken from the area of electronic digital hardware [11]. The remainder of the paper focusses more on the development process and we shall have a look at the following topics: modifying components, combining designs, design evolution, design creation and design partition. We shall treat these topics with a focus on the definitions and the associated intuition. For a much more formal treatment we refer to [10]. The proofs of the propositions stated in the current paper can also be found in [8] and [10]. For a large and realistic example we refer to [12] and [13].

2 Designs: definitions and properties

In this section we give a summary of the theory of designs. We adopt an algebraic approach to module composition, e.g. Bergstra's module algebra [6] or Jonkers' class algebra (CA) [5]. Let us summarise the basic idea of this algebraic approach to module composition first. Modules are (algebraic) terms built up from basic modules by means of module composition mechanisms which take the shape of algebraic operators. Although the underlying algebra can be many-sorted (with modules, signatures, renamings, etc. as sorts), we must assume that there is one sort of interest which is the sort of modules. The other sorts are called 'secondary sorts' and in order to keep things simple, we shall not discuss the secondary sorts any further in this paper.

The basic modules take the shape of a list of definitions (functions, predicates, axioms, etc.), usually enclosed in keywords (e.g. CLASS and END in COLD-K). Furthermore there are one or more operators and the most important example of these is the binary import-operator. It is used to fit two modules together to constitute a larger module and usually it takes the shape of a mixfix operator IMPORT ... INTO ... .

Furthermore we must adopt an implementation relation, denoted by $\sqsubseteq$, which is an important binary relation on modules. In particular, when $M_1$ and $M_2$ are two modules, with $M_1$ viewed as a specification and $M_2$ as an implementation module, we write $M_2 \sqsubseteq M_1$ if $M_2$ is an implementation of $M_1$. The precise definition of this relation may be non-trivial, but we adopt at least the following two properties, which are accepted on a-priori and intuitive grounds: (1) reflexivity and (2) transitivity (see e.g. [14]). In particular (1) says that each module implements itself and (2) says that if $M_1$ implements $M_2$ and $M_2$ implements $M_3$, then $M_1$ should implement $M_3$. Furthermore we shall adopt the rule that if $M_1 \sqsubseteq M_2$ and $M_2 \sqsubseteq M_1$, then $M_1 = M_2$. We also assume that some of the module-composition operators are monotonic and that for those operators the monotonicity can be used when reasoning about modules. In particular, when import happens to be a monotonic operator, then we conclude from $M_1 \sqsubseteq M_2$ that (IMPORT $M_1$ INTO $M_3$) $\sqsubseteq$ (IMPORT $M_2$ INTO $M_3$).

For the theory of designs we need hardly be interested in the precise details of the underlying algebraic approach to module composition. It is sufficient to know that there are terms, built from constants and a set of operators, and that there is a partial order $\sqsubseteq$. With respect to the terms denoting modules, we add the option to have variables ranging over modules. Thus $x$ may be used as a term denoting a yet unknown module. When reasoning about these terms we adopt a number of reasoning rules, viz. reflexivity of $\sqsubseteq$, transitivity of $\sqsubseteq$, antisymmetry of $\sqsubseteq$, substitutivity for $=$ and monotonicity for certain operators. Furthermore we assume that a sufficient number of facts with respect to $\sqsubseteq$ about basic modules are given. We could extend this to a full lambda-calculus, constructing parameterised modules as done in [7], but for the current presentation we do not need that. By the above reasoning rules, a software developer can also reason about the yet unknown modules. For example, when it is given that $x \sqsubseteq S$, i.e. a yet unknown module $x$ has $S$ as its specification, then it may be concluded that (IMPORT $x$ INTO $M$) $\sqsubseteq$ (IMPORT $S$ INTO $M$), provided import is monotonic. We shall use the symbol '$\vdash$' to denote this notion of 'reasoning with assumptions'. At the left-hand side of the '$\vdash$' we write the assumptions, put in square brackets, and at the right-hand side of the '$\vdash$' we put the conclusion. If we put the above example in this notation, then we get

$$[x \sqsubseteq S] \vdash (\mathrm{IMPORT}\ x\ \mathrm{INTO}\ M) \sqsubseteq (\mathrm{IMPORT}\ S\ \mathrm{INTO}\ M)$$
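As an illustration only, this style of reasoning with assumptions can be sketched in a few lines of Python. The term encoding (strings for module names and variables, tuples `("import", a, b)` for IMPORT a INTO b) and the function name `derivable` are assumptions of this sketch and are not part of COLD or of the theory of designs. The procedure uses only reflexivity, transitivity through the given assumptions, and monotonicity of import (assumed here in each argument), so it is sound for these rules but far from complete.

```python
def derivable(assumptions, lhs, rhs, seen=frozenset()):
    """Return True if lhs ⊑ rhs follows from the assumptions.

    assumptions: set of pairs (a, b), each read as the assumption [a ⊑ b].
    Terms are strings or tuples ("import", imported, into).
    """
    if lhs == rhs:                      # reflexivity
        return True
    if (lhs, rhs) in seen:              # goal already being tried: avoid looping
        return False
    seen = seen | {(lhs, rhs)}
    # monotonicity of import (assumed for both argument positions)
    if (isinstance(lhs, tuple) and isinstance(rhs, tuple)
            and lhs[0] == rhs[0] == "import"
            and derivable(assumptions, lhs[1], rhs[1], seen)
            and derivable(assumptions, lhs[2], rhs[2], seen)):
        return True
    # transitivity through an assumption [lhs ⊑ mid]
    return any(a == lhs and derivable(assumptions, mid, rhs, seen)
               for a, mid in assumptions)


goal_lhs = ("import", "x", "M")
goal_rhs = ("import", "S", "M")
with_assumption = derivable({("x", "S")}, goal_lhs, goal_rhs)
without_assumption = derivable(set(), goal_lhs, goal_rhs)
```

With the assumption [x ⊑ S] the goal is derivable; without it the same goal is not, which mirrors the role of the context at the left-hand side of the turnstile.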

This concludes our discussion of the algebraic approach to module composition and the reasoning rules used by developers manipulating modules. Now let us define the notion of a (software) component.

Definition 2.1 Formally, a component is a triple $(x, P, Q)$ where $x$ is a (logical) variable and $P$ and $Q$ are algebraic terms, possibly containing variables. The intuition is that $x$ serves as the name of the component, $P$ as its implementation and $Q$ as its specification (its black-box description). Because sometimes there is only a specification, we also allow the dummy term prim for $P$. We adopt a concrete syntax for components, writing $x := P \sqsubseteq Q$ for the triple $(x, P, Q)$. Now a design is nothing but a list of components, followed by the keyword system and a list of zero or more additional terms, the 'system'. □

The system indicates the actual terms (=modules) that are to be viewed as the product to be delivered to the customer. In most cases the system itself will consist of a few very simple terms containing component names. Thus a design looks as follows:

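$$
\begin{array}{l}
x_1 := P_1 \sqsubseteq Q_1 \\
x_2 := P_2 \sqsubseteq Q_2 \\
\quad\vdots \\
x_n := P_n \sqsubseteq Q_n \\
\mathbf{system}\ [\, S_1, \ldots, S_n \,]
\end{array}
$$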

Designs are somewhat similar to Automath "books" [15] where the components correspond to Automath "lines". However, there is no such thing as a "system" in Automath books.

Definition 2.2 A design $d$ is well-formed (abbreviated as wf) if all components of $d$ have distinct names and no variable occurring in $d$ is used before it has been introduced as the name of a component. □
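Definitions 2.1 and 2.2 can be made concrete with a small data structure. The following Python sketch is only one possible encoding: the class names `Var`, `Component` and `Design`, the use of `None` for the dummy term prim, and the tuple representation of operator applications are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Var:
    """A reference to a component name inside a term."""
    name: str


# A term is a constant (str), a variable reference, or an operator
# application written as a tuple, e.g. ("import", Var("x"), "M").
Term = Union[str, Var, tuple]


@dataclass(frozen=True)
class Component:
    name: str
    impl: Optional[Term]   # None stands for the dummy term prim
    spec: Term


@dataclass(frozen=True)
class Design:
    components: Tuple[Component, ...]
    system: Tuple[Term, ...] = ()


def variables(term):
    """Collect the names of all variables occurring in a term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, tuple):
        return set().union(*(variables(t) for t in term))
    return set()


def is_well_formed(design):
    """Distinct names, and no variable used before its component is introduced."""
    introduced = set()
    for c in design.components:
        used = variables(c.impl) | variables(c.spec)
        if c.name in introduced or not used <= introduced:
            return False
        introduced.add(c.name)
    return all(variables(t) <= introduced for t in design.system)


x = Component("x", None, "S")
y = Component("y", ("import", Var("x"), "M"), ("import", "S", "M"))
ok = is_well_formed(Design((x, y), (Var("y"),)))
bad = is_well_formed(Design((y, x), (Var("y"),)))   # x is used before it is introduced
```

Listing the components in the other order makes the design ill-formed, because the implementation of y then refers to x before x has been introduced.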

We shall define two notions of correctness for designs, which we shall call glass-box correctness and black-box correctness. Both forms of correctness are interesting from a methodological point of view.

Definition 2.3 A context is a set of assumptions. A component $x := P \sqsubseteq Q$ with $P$ not equal to prim is said to be correct in context $\Gamma$ if $\Gamma \vdash P \sqsubseteq Q$. A component of the form $x := \mathbf{prim} \sqsubseteq Q$ is simply correct in any context. □

As an example of this definition we could consider the component $y :=$ (IMPORT $x$ INTO $M$) $\sqsubseteq$ (IMPORT $S$ INTO $M$). Let us for the remainder of this paper assume that import is monotonic. Then this component is correct in context $[x \sqsubseteq S]$.

This defines what it means that a component is correct in a given context and now we want to define correctness for wf designs. Roughly speaking, we shall define this in such a way that a design is correct if each of its components is correct. In order to make this idea precise we must be explicit about the contexts in which the correctness of the components is to be derived. There are two reasonable possibilities for defining these contexts. Therefore we shall have two notions of correctness for designs.

The first notion of correctness corresponds to the possibility that there is no information hiding: if the developer reasons about a name $x_k$ for which there is a component $(x_k, P_k, Q_k)$ in the design, then the developer may use the fact that $x_k$ stands for $P_k$. If $P_k \equiv \mathbf{prim}$, then the only assumption he can make about $x_k$ is $[x_k \sqsubseteq Q_k]$. We shall call this "glass-box correctness".

The second notion of correctness corresponds to the possibility that the implementations are hidden: if the developer reasons about a name $x_k$ for which there is a component $(x_k, P_k, Q_k)$ in the design, then he may only use the fact that the term for which $x_k$ stands is specified by $Q_k$, i.e. he may use the assumption $[x_k \sqsubseteq Q_k]$. We shall call this "black-box correctness".

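Definition 2.4 Let the design $d$ be given as

$$
\begin{array}{l}
x_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
x_n := P_n \sqsubseteq Q_n \\
\mathbf{system}\ S.
\end{array}
$$

Assume that $d$ is wf. We say that $d$ is glass-box correct (abbreviated gbc) if for each component $(x_j, P_j, Q_j)$ in $d$ where $P_j \not\equiv \mathbf{prim}$ we have

$$\Gamma_j \vdash P_j \sqsubseteq Q_j$$

where $\Gamma_j = \varphi_1, \ldots, \varphi_{j-1}$ and for $1 \le k \le j-1$ the assumptions $\varphi_k$ are defined by

(i) $\varphi_k = [x_k = P_k]$ ($P_k \not\equiv \mathbf{prim}$),

(ii) $\varphi_k = [x_k \sqsubseteq Q_k]$ ($P_k \equiv \mathbf{prim}$).

We say that $d$ is black-box correct (abbreviated bbc) if for each component $(x_j, P_j, Q_j)$ in $d$ where $P_j \not\equiv \mathbf{prim}$ we have

$$\Gamma_j \vdash P_j \sqsubseteq Q_j$$

where $\Gamma_j = \varphi_1, \ldots, \varphi_{j-1}$ and for $1 \le k \le j-1$ the assumptions $\varphi_k$ are defined by

$$\varphi_k = [x_k \sqsubseteq Q_k],$$

i.e. if we have $[x_1 \sqsubseteq Q_1], \ldots, [x_{j-1} \sqsubseteq Q_{j-1}] \vdash P_j \sqsubseteq Q_j$. □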

We defined two notions of correctness, viz. gbc and bbc. Bbc is stronger than gbc, since the facts $P_j \sqsubseteq Q_j$ to be derived are the same for gbc and bbc, but for bbc these facts should be derived with less knowledge than for gbc.

Proposition 2.5 For a wf design $d$ we have

$$d \text{ is bbc} \Rightarrow d \text{ is gbc}.$$ □
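The difference between the two notions lies only in the contexts. A minimal Python sketch of how the contexts of Definition 2.4 could be assembled is given below; the triple representation `(name, impl, spec)`, the use of `None` for prim and the function names are assumptions of this sketch, and the derivation of the obligations themselves is left open.

```python
def context(design, j, glass_box):
    """Assumptions for the component at 0-based index j.

    design: list of (name, impl, spec) triples; impl is None for prim.
    """
    assumptions = []
    for name, impl, spec in design[:j]:
        if glass_box and impl is not None:
            assumptions.append((name, "=", impl))    # case (i) of gbc
        else:
            assumptions.append((name, "⊑", spec))    # case (ii) of gbc, and every case of bbc
    return assumptions


def proof_obligations(design, glass_box):
    """One triple (context, P_j, Q_j) for every non-prim component."""
    return [(context(design, j, glass_box), impl, spec)
            for j, (_, impl, spec) in enumerate(design)
            if impl is not None]


# Hypothetical three-component design; the terms are kept as opaque strings.
design = [("x", None, "Q1"), ("y", "P2", "Q2"), ("z", "P3", "Q3")]
gbc_context_z = context(design, 2, glass_box=True)
bbc_context_z = context(design, 2, glass_box=False)
```

For the component z the glass-box context contains the equation for y, whereas the black-box context contains only the specification of y, which is the sense in which bbc facts are derived with less knowledge.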

It is possible to translate each design $d$ into a lambda term. The lambda term resulting from this translation can be viewed as the meaning of $d$ and we shall denote it as $[d]$. For each component of $d$ for which the implementation is not prim, there is an abstraction-application pair in $[d]$. For each component of $d$ for which the implementation is prim, there is an abstraction in $[d]$. This technique of using abstraction-application pairs and abstractions to describe the role of names has been proposed by de Bruijn (in the context of Automath). We refer to [7] for the details.

From now on we restrict ourselves to well-formed designs. We make two simplifying assumptions about the designs we consider. The first assumption is that in each design all prim components come before the non-prim components. This is a kind of standard form and it can be shown that under certain conditions this is no real restriction [10]. As a second simplifying assumption we adopt the convention that component names occur only in implementations and not in black-box descriptions. From a practical point of view this is not a real restriction.

3 Top-down Example

Before embarking on a systematic study of correctness-preserving transformations of designs (including top-down and bottom-up developments), we first give an elaborate example of a top-down development which is taken from the area of electronic digital hardware [11]. We make no claim whatsoever that the hardware circuit of the example is efficient or fast. In fact our main interest is not in hardware design at all, and the purpose of the example is to illustrate the notion of design and some of the transformation steps operating on designs. Of course we could take examples using the class-algebra CA of COLD-K, but for illustrating the design concept it is beneficial to have a kind of 'stand-alone' example as well.

The example is about logical circuits and the composition mechanisms for these are interconnection wirings. We have two kinds of descriptions for logical circuits, viz. equations and interconnection diagrams. The equations are based on two-valued Boolean logic with constants 0, 1, multiplication (= and) such that $0 \cdot 0 = 0 \cdot 1 = 1 \cdot 0 = 0$, $1 \cdot 1 = 1$, addition (= or) such that $0 + 0 = 0$, $1 + 0 = 0 + 1 = 1 + 1 = 1$, inverting (= not) such that $\overline{0} = 1$ and $\overline{1} = 0$. We write $x \oplus y$ for the exclusive or $x \cdot \overline{y} + \overline{x} \cdot y$. Logical circuits have ports which act as a kind of variable ($a$, $b$, $c$, etc.) and which are grouped into two categories, viz. input ports and output ports. An equation is always written with the output ports occurring at the left-hand side and the input ports at the right-hand side of the equation. For example, $z = \overline{a \cdot b}$ is an equation which specifies a 'nand' logical circuit with input ports $a$, $b$ and output port $z$.
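The Boolean operations just listed are small enough to tabulate mechanically. The Python sketch below encodes them on the integers 0 and 1; the function names are chosen here for illustration only.

```python
from itertools import product


def and_(x, y):
    return x * y                 # multiplication


def or_(x, y):
    return min(x + y, 1)         # addition, with 1 + 1 = 1


def not_(x):
    return 1 - x                 # inverting


def xor(x, y):
    # exclusive or, written as x.(not y) + (not x).y
    return or_(and_(x, not_(y)), and_(not_(x), y))


def nand(a, b):
    return not_(and_(a, b))      # the equation z = not(a.b)


xor_table = {(x, y): xor(x, y) for x, y in product((0, 1), repeat=2)}
nand_table = {(a, b): nand(a, b) for a, b in product((0, 1), repeat=2)}
```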

The interconnection diagrams represent algebraic terms corresponding to some algebraic approach to logical-circuit composition. We do not provide a formalisation of these representation issues - although of course this could be done. The interconnection diagrams may contain component names and we assume that this is also the case for the terms represented by the interconnection diagrams. Intuitively, the interconnection diagrams will speak for themselves.

We shall use numbers such as 7400, 7404, ... to act as component names. The use of these names is consistent with the terminology of the well-known transistor-transistor logic (TTL) family of integrated logical circuits [11].

For the sake of the example, let us assume that it is our task to develop a four-bit adder. We need some notation to specify the four-bit adder. For a bit-sequence $b$ we write $\mathrm{int}(b)$ to denote the integer which is binary represented by $b$. In particular, $\mathrm{int}(0,0,0,0) = 0$, $\mathrm{int}(0,0,0,1) = 1$, $\mathrm{int}(0,0,1,0) = 2$ and $\mathrm{int}(0,0,1,1) = 3$. A four-bit adder is a logical circuit with input ports $a_3, a_2, a_1, a_0, b_3, b_2, b_1, b_0$, output ports $s_4, s_3, s_2, s_1, s_0$ and which is specified by the equation $\mathrm{int}(s_4, s_3, s_2, s_1, s_0) = \mathrm{int}(a_3, a_2, a_1, a_0) + \mathrm{int}(b_3, b_2, b_1, b_0)$ which we abbreviate as $\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})$. The component name of the four-bit adder is 74283. Therefore we can already put the information which is relevant for the system user of the design of the 74283 in the form of a very simple design. Such a design which has prim components only and where moreover all components occur in its system is called a top design.

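$$
\begin{array}{l}
74283 := \mathbf{prim} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})) \\
\mathbf{system}\ [\, 74283 \,]
\end{array}
$$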

This is the initial design of the top-down development! With respect to the available primitive building blocks, we adopt a minimalistic approach by restricting ourselves to two simple logical circuits called nand-gate and ground-connection. It is a well-known fact that these are sufficient to construct all other logical circuits. A ground-connection has no input ports and one output port g, specified by g = 0.

We can already put the information which is relevant for the supplier of the primitive building blocks in the form of a very simple special design. (Later we shall introduce the terminology that this is the bottom of the design to be made.)

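$$
\begin{array}{l}
7400 := \mathbf{prim} \sqsubseteq (z = \overline{a \cdot b}) \\
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
\mathbf{system}\ [\ ]
\end{array}
$$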

Now the top-down development can really begin. In order to decompose the 74283 four-bit adder we employ a component providing for a single-bit full adder. A single-bit full adder is a logical circuit with three input ports $a$, $b$, $c_i$ and two output ports $s$ and $c_o$. The ports $c_i$ and $c_o$ are usually known as 'carry-in' and 'carry-out' respectively. It is specified by the equations $s = a \oplus b \oplus c_i$ and $c_o = a \cdot b + c_i \cdot (a \oplus b)$. We shall introduce a new component named 74183 to provide the functionality of a single-bit full adder.

Using four instances of the 74183 and one ground-connection, the four-bit adder can be implemented. This is known as a ripple-carry configuration [11] (p. 87). We refer to the following interconnection diagram as 74283IMPL.

[Interconnection diagram not reproduced. Visible port labels: s4, a3, b3, s3, a2, b2, a1, b1, a0, b0, s2, s1, s0, GND.]

Fig 3.1. Interconnection diagram 74283IMPL (4-bit adder)
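Since the specification of the 74283 is an equation over finitely many bit patterns, the claim that a ripple-carry configuration implements it can be checked exhaustively. The Python sketch below does this under stated assumptions: the wiring (carry-in of the least significant stage tied to ground, carry-out of the most significant stage used as s4) is the usual ripple-carry arrangement and is assumed here rather than read off the diagram, and the function names are hypothetical.

```python
from itertools import product


def int_of(bits):
    """int(b): the integer represented by a bit sequence, most significant bit first."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


def full_adder(a, b, ci):
    """Specification of the 74183: s = a xor b xor ci, co = a.b + ci.(a xor b)."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a ^ b))
    return s, co


def ripple_carry_adder(a, b):
    """a, b: 4-bit tuples (a3, a2, a1, a0); returns (s4, s3, s2, s1, s0)."""
    carry = 0                                       # ground-connection feeds the first carry-in
    sums = []
    for ai, bi in zip(reversed(a), reversed(b)):    # least significant bit first
        s, carry = full_adder(ai, bi, carry)
        sums.append(s)
    return (carry, *reversed(sums))


meets_spec = all(
    int_of(ripple_carry_adder(a, b)) == int_of(a) + int_of(b)
    for a in product((0, 1), repeat=4)
    for b in product((0, 1), repeat=4)
)
```

The check ranges over all 256 input combinations, so `meets_spec` states that, for this assumed wiring, the implementation satisfies the equation of the specification.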

The design step to be taken involves two simultaneous modifications. First of all, the new components 74183 and GND must be added to the design. These new components can be viewed as a two-component 'mini-design', $d'$ say.

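$$
\begin{array}{l}
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
74183 := \mathbf{prim} \sqsubseteq (s = a \oplus b \oplus c_i,\ c_o = a \cdot b + c_i \cdot (a \oplus b)) \\
\mathbf{system}\ [\ ]
\end{array}
$$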

We must concatenate $d'$ with our initial design $d$. Secondly, the old 74283 must be updated by means of a kind of overwriting insert operation, inserting $74283 := \mathrm{74283IMPL} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b}))$. After this design step, our design $d$ is as follows:

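$$
\begin{array}{l}
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
74183 := \mathbf{prim} \sqsubseteq (s = a \oplus b \oplus c_i,\ c_o = a \cdot b + c_i \cdot (a \oplus b)) \\
74283 := \mathrm{74283IMPL} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})) \\
\mathbf{system}\ [\, 74283 \,]
\end{array}
$$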
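The two simultaneous modifications of this design step, concatenation and overwriting insert, can be sketched as operations on a list of components. The dictionary layout and the function names `concatenate` and `overwrite` are assumptions made for this illustration; specifications are kept as plain strings and the string "prim" marks a missing implementation.

```python
def concatenate(d_new, d):
    """Place the components of the mini-design d_new in front of those of d."""
    return {
        "components": d_new["components"] + d["components"],
        "system": d_new["system"] + d["system"],
    }


def overwrite(d, name, impl):
    """Overwriting insert: give the existing component `name` a new implementation."""
    if name not in [n for n, _, _ in d["components"]]:
        raise KeyError(name)
    return {
        "components": [(n, impl if n == name else p, q)
                       for n, p, q in d["components"]],
        "system": list(d["system"]),
    }


d = {"components": [("74283", "prim", "int(s) = int(a) + int(b)")],
     "system": ["74283"]}
d_prime = {"components": [("GND", "prim", "g = 0"),
                          ("74183", "prim", "s = a xor b xor ci, co = a.b + ci.(a xor b)")],
           "system": []}

d_next = overwrite(concatenate(d_prime, d), "74283", "74283IMPL")
component_names = [n for n, _, _ in d_next["components"]]
```

Both functions return new designs and leave their arguments untouched, so the initial design remains available after the step.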

In order to decompose the 74183 single-bit full adder, we employ and-gates, or-gates and xor-gates which are provided by the new components 7408, 7432 and 7486, respectively. These are specified by $z = x \cdot y$, $z = x + y$ and $z = x \oplus y$, respectively.

Using and-gates, or-gates and xor-gates the single-bit full adder can be implemented. We refer to the following interconnection diagram as 74183IMPL.

[Interconnection diagram not reproduced. Visible labels: a, b, ci, 7408, co.]

Fig 3.2. Interconnection diagram 74183IMPL (single-bit full adder)

As a matter of fact, verifying that this diagram satisfies the equations of its specification essentially means to verify a pair in the implementation relation $\sqsubseteq$.
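For circuits with a handful of ports such a verification can be carried out by enumerating all inputs. The Python sketch below treats "the implementation satisfies the specification equations for every input" as the pair to be verified. The netlist in `impl_74183` (two xor-gates, two and-gates, one or-gate) is a hypothetical wiring chosen to match the specification, not a transcription of Fig 3.2, and `faulty` is a deliberately broken variant added for contrast.

```python
from itertools import product


# Black-box descriptions of the three gate components.
def gate_7408(x, y):
    return x & y          # z = x.y


def gate_7432(x, y):
    return x | y          # z = x + y


def gate_7486(x, y):
    return x ^ y          # z = x xor y


def impl_74183(a, b, ci):
    """Hypothetical netlist for the single-bit full adder."""
    t = gate_7486(a, b)
    s = gate_7486(t, ci)
    co = gate_7432(gate_7408(a, b), gate_7408(ci, t))
    return s, co


def spec_74183(a, b, ci, s, co):
    """The specification equations, with + and . read as Boolean or and and."""
    return s == a ^ b ^ ci and co == (a & b) | (ci & (a ^ b))


def satisfies(impl, spec):
    """True if the outputs of impl fulfil spec for every input combination."""
    return all(spec(*inputs, *impl(*inputs))
               for inputs in product((0, 1), repeat=3))


def faulty(a, b, ci):
    s, _ = impl_74183(a, b, ci)
    return s, gate_7408(a, b)       # ignores the carry caused by ci


correct_ok = satisfies(impl_74183, spec_74183)
faulty_ok = satisfies(faulty, spec_74183)
```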

Again the design step to be taken involves two simultaneous modifications. First, the new components must be added to the design. These new components can be viewed as a three-component mini-design, d', say.

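$$
\begin{array}{l}
7408 := \mathbf{prim} \sqsubseteq (z = x \cdot y) \\
7432 := \mathbf{prim} \sqsubseteq (z = x + y) \\
7486 := \mathbf{prim} \sqsubseteq (z = x \oplus y) \\
\mathbf{system}\ [\ ]
\end{array}
$$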

We must concatenate $d'$ with our current design $d$. Secondly, we must insert a new version of 74183 which has 74183IMPL instead of prim. After this design step, our design $d$ is as follows:

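$$
\begin{array}{l}
7408 := \mathbf{prim} \sqsubseteq (z = x \cdot y) \\
7432 := \mathbf{prim} \sqsubseteq (z = x + y) \\
7486 := \mathbf{prim} \sqsubseteq (z = x \oplus y) \\
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
74183 := \mathrm{74183IMPL} \sqsubseteq (s = a \oplus b \oplus c_i,\ c_o = a \cdot b + c_i \cdot (a \oplus b)) \\
74283 := \mathrm{74283IMPL} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})) \\
\mathbf{system}\ [\, 74283 \,]
\end{array}
$$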

We could select the last prim-component as the next candidate to be implemented, which typically is a kind of default in top-down development. However, we have no intention to decompose GND, so we permute some prim components, putting GND first.

In order to decompose the 7486 xor-gate, we employ inverters, which are provided by a new component 7404. An inverter is a logical circuit with one input port $p$ and one output port $q$. It is specified by $q = \overline{p}$. Using and-gates, or-gates and inverters it is easy to implement an xor-gate. We refer to the following interconnection diagram as 7486IMPL.

[Interconnection diagram not reproduced. Visible labels: x, y, 7404, 7408, 7432, z.]

Fig 3.3. Interconnection diagram 7486IMPL (exclusive or)

Now the design step is the simultaneous addition of the 7404 and the insertion of the implemented 7486 with 7486IMPL instead of prim. Once more we put GND first. We do not show the resulting d and we proceed immediately with the next design step.

In order to decompose the 7432 or-gate, we shall employ the 7400 nand-gate, which is an available primitive. We also employ two 7404 inverters. We refer to the following interconnection diagram as 7432IMPL.

[Interconnection diagram not reproduced. Visible labels: two 7404 inverters with ports p and q.]

Fig 3.4. Interconnection diagram 7432IMPL (or circuit)

The design step is to add the 7400 component and to insert a modified 7432 component, with 7432IMPL instead of prim. To decompose the 7408 and-gate we need no new components. It can be done easily with one 7400 nand-gate and one 7404 inverter. We refer to the following interconnection diagram as 7408IMPL.

[Interconnection diagram not reproduced. Visible labels: 7400, 7404.]

Fig 3.5. Interconnection diagram 7408IMPL (and circuit)

Finally we implement the 7404 inverter. We employ one 7400 whose two inputs are connected. We refer to the following interconnection diagram as 7404IMPL.

[Interconnection diagram not reproduced. Visible labels: p, 7400, q.]

Fig 3.6. Interconnection diagram 7404IMPL (inverter)
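The decompositions described in the text (inverter from one nand-gate with connected inputs, and-gate from a nand-gate and an inverter, or-gate from a nand-gate and two inverters, xor-gate from and-gates, or-gates and inverters) can be checked against their specifications by enumeration. In the Python sketch below the exact wirings are assumptions that follow the prose and the identity $x \oplus y = x \cdot \overline{y} + \overline{x} \cdot y$; they are not transcriptions of the diagrams.

```python
from itertools import product


def nand_7400(a, b):
    return 1 - (a & b)                      # primitive: z = not(a.b)


def impl_7404(p):
    return nand_7400(p, p)                  # inverter: both inputs connected


def impl_7408(x, y):
    return impl_7404(nand_7400(x, y))       # and-gate: nand followed by inverter


def impl_7432(x, y):
    return nand_7400(impl_7404(x), impl_7404(y))   # or-gate: two inverters feeding a nand


def impl_7486(x, y):
    # xor-gate as x.(not y) + (not x).y
    return impl_7432(impl_7408(x, impl_7404(y)),
                     impl_7408(impl_7404(x), y))


pairs = list(product((0, 1), repeat=2))
checks = {
    "7404": all(impl_7404(p) == 1 - p for p in (0, 1)),
    "7408": all(impl_7408(x, y) == (x & y) for x, y in pairs),
    "7432": all(impl_7432(x, y) == (x | y) for x, y in pairs),
    "7486": all(impl_7486(x, y) == (x ^ y) for x, y in pairs),
}
```

Each entry of `checks` corresponds to one non-prim gate component and records whether the assumed implementation meets the equation in its black-box description.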

The resulting design is finished because its 'bottom' equals the agreed design with 7400 and GND. We show the resulting design d below.

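$$
\begin{array}{l}
7400 := \mathbf{prim} \sqsubseteq (z = \overline{a \cdot b}) \\
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
7404 := \mathrm{7404IMPL} \sqsubseteq (q = \overline{p}) \\
7408 := \mathrm{7408IMPL} \sqsubseteq (z = x \cdot y) \\
7432 := \mathrm{7432IMPL} \sqsubseteq (z = x + y) \\
7486 := \mathrm{7486IMPL} \sqsubseteq (z = x \oplus y) \\
74183 := \mathrm{74183IMPL} \sqsubseteq (s = a \oplus b \oplus c_i,\ c_o = a \cdot b + c_i \cdot (a \oplus b)) \\
74283 := \mathrm{74283IMPL} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})) \\
\mathbf{system}\ [\, 74283 \,]
\end{array}
$$

This concludes the example.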

4 Modifying Components

One of the ideas behind the notion of a design is that there is a kind of locality principle which, roughly speaking, can be stated as follows: "It should be possible to implement each component in a design without worrying about the implementation of the other components in the design". In order to make this idea somewhat more precise, we shall define several kinds of correctness-preserving modifications of designs and we shall investigate their properties. There are two (binary) criteria for classifying the modifications and therefore we shall have four kinds of modifications. The first criterion deals with "what is modified?". We consider two cases: either some implementation is modified, or some black-box description is modified. The second criterion deals with "which notion of correctness is adopted?". Again we consider two cases: glass-box correctness and black-box correctness. The modifications are defined to establish the adopted notion of correctness, at least locally for the modified component. Our general scheme is that we have X-preserving Y-modifications for $X \in \{\text{gbc}, \text{bbc}\}$ and $Y \in \{\text{glass-box}, \text{black-box}\}$. We study what happens if we take a correct design (in the X sense) and modify a Y-description. The interesting question is "is the result correct again in the X sense?", or alternatively "has X been preserved?".

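Definition 4.1 Let the wf design $d$ be given as

$$
\begin{array}{l}
x_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
x_n := P_n \sqsubseteq Q_n \\
\mathbf{system}\ S.
\end{array}
$$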

(i) We say that $d'$ is a gbc-preserving glass-box modification of $d$ (abbreviated as $d'$ is gbc-gb-mod of $d$) if $d'$ is obtained from $d$ via the replacement of some non-prim component $(x_j, P_j, Q_j) := (x_j, P_j', Q_j)$ such that $\Gamma_j \vdash P_j' \sqsubseteq Q_j$, taking $\Gamma_j$ as in the definition of gbc (Definition 2.4).

(ii) In the same way we define what it means that $d'$ is a bbc-preserving glass-box modification of $d$ (abbreviated as $d'$ is bbc-gb-mod of $d$) by taking $\Gamma_j$ as the context corresponding with black-box correctness. □


Let us illustrate this with the example design obtained at the end of section 3. For example, we could decide to invent an alternative implementation of the 7404 inverter. In particular, the fact that 7404IMPL puts a 'load' of two input-ports upon its 'driver' could be a disadvantage, at least from an electronic point of view. Now we could construct the alternative, 7404IMPL2 say, which contains two 7400 nand-gates. The first of these nand-gates has both inputs connected to GND, hence its output is always 1. Then the second 7400 nand-gate is connected to calculate $\overline{p \cdot 1}$ in the obvious way. The following design is a bbc-gb-mod of the design given at the very end of section 3.

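$$
\begin{array}{l}
7400 := \mathbf{prim} \sqsubseteq (z = \overline{a \cdot b}) \\
\mathrm{GND} := \mathbf{prim} \sqsubseteq (g = 0) \\
7404 := \mathrm{7404IMPL2} \sqsubseteq (q = \overline{p}) \\
7408 := \mathrm{7408IMPL} \sqsubseteq (z = x \cdot y) \\
7432 := \mathrm{7432IMPL} \sqsubseteq (z = x + y) \\
7486 := \mathrm{7486IMPL} \sqsubseteq (z = x \oplus y) \\
74183 := \mathrm{74183IMPL} \sqsubseteq (s = a \oplus b \oplus c_i,\ c_o = a \cdot b + c_i \cdot (a \oplus b)) \\
74283 := \mathrm{74283IMPL} \sqsubseteq (\mathrm{int}(\vec{s}) = \mathrm{int}(\vec{a}) + \mathrm{int}(\vec{b})) \\
\mathbf{system}\ [\, 74283 \,]
\end{array}
$$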
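That the alternative inverter meets the same black-box description as the original one can again be checked by enumeration. The following Python sketch models both implementations using only the specifications of 7400 and GND; the function names are hypothetical.

```python
def nand_7400(a, b):
    return 1 - (a & b)                   # z = not(a.b)


GND = 0                                   # g = 0


def impl_7404(p):
    """Original inverter: one nand-gate with both inputs connected to p."""
    return nand_7400(p, p)


def impl2_7404(p):
    """Alternative inverter built from two nand-gates."""
    one = nand_7400(GND, GND)             # both inputs grounded: output is always 1
    return nand_7400(p, one)              # computes not(p.1)


same_behaviour = all(impl_7404(p) == impl2_7404(p) == 1 - p for p in (0, 1))
```

Only the specifications of the components used are consulted here, which is the kind of local argument a bbc-preserving glass-box modification calls for.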

The intuition behind a gbc-preserving glass-box modification is that a modification takes place within one component while preserving the local correctness (in the gbc sense) of that component. The question whether preservation of local correctness implies preservation of correctness of the whole design is investigated below.

Remark 4.2 Let $d'$ be a gbc-gb-mod of $d$. Then the proposition that $d$ is gbc $\Rightarrow$ $d'$ is gbc is false. For a simple counterexample we refer to [7]. □

Proposition 4.3 If $d'$ is bbc-gb-mod of $d$, we have

$$d \text{ is bbc} \Rightarrow d' \text{ is bbc}.$$ □

The intuition behind the remark and the proposition given above is that gbc designs do not offer implementation freedom but bbc designs do.

Instead of considering correctness-preserving glass-box modifications, one can also consider correctness-preserving black-box modifications. The intuition behind the latter kind of modification is a change of specification for a component which has already been implemented. It does not come as a surprise that when glass-box correctness is adopted, these modifications preserve correctness for the whole design (since the black-box descriptions are in fact not used). It is also easy to see that when black-box correctness is adopted, these modifications may disturb the correctness of the whole design.

The following schema summarises these results. For the modifications which lead to a correct design, the corresponding entry in the schema contains a '+'. For the modifications which may lead to an incorrect design, the corresponding entry in the schema is '-'.


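$$\begin{array}{l|c|c}
\text{locally preserving} & \text{modifying implementation} & \text{modifying black-box description} \\ \hline
\text{black-box correctness} & + & - \\
\text{glass-box correctness} & - & +
\end{array}$$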

In the above schema we see two entries where the modification may lead to an incorrect design, although locally (i.e. for the replaced component) correctness is preserved. One is tempted to think that by restricting the modifications to those where a term is replaced by a term which is an implementation of that term, the resulting designs are still correct. This is indeed the case for bbc-preserving black-box modifications, but it does not hold for gbc-preserving glass-box modifications [8]. As a final remark we would like to state that in view of proposition 4.3 the methodologically preferred notion of correctness is black-box correctness.

5 Combining Designs

This section has a subtitle, which is 'algebraic operations on designs'. To explain this we begin by noting that a design is something like a 'closed area of reuse' with components serving as reusable building blocks. An important approach to constructing large designs is to have composition mechanisms at the level of entire designs. It is important to see that a design is a closed unit; in lambda calculus theory one would view it as a combinator. Therefore the composition mechanisms at this level take the shape of algebraic operators. In particular we shall have binary operators $*$ and $\circ$. Now when $d_1$ and $d_2$ are two designs, then we can fit them together by $*$ say, which means that we have a new and larger design denoted by $d_1 * d_2$.

In fact we will study an algebra of designs. Amongst the typical applications for this approach we expect the manipulation of component libraries. Typically a component library takes the shape of a design. Now two libraries can be joined to make one library: e.g. if $d_{dd}$ is a library of reusable components containing device-drivers and $d_{ct}$ is a reusable library with software modules about 3D-coordinate transformations, then we can imagine that $d_{dd} * d_{ct}$ is a large library providing a starting point when designing a robot-arm controller.

The first operation on designs (denoted by $*$) is called concatenation. The concatenation of $d_1$ and $d_2$ yields a design containing the components of $d_1$ and the components of $d_2$ and having as its system simply the concatenation of the systems of $d_1$ and $d_2$. We have the following intuition. The designs $d_1$ and $d_2$ are considered as 'disjoint' designs and by constructing $d_1 * d_2$, we simply take a kind of union of their component sets. Conversely, if we can split a design $d$ into $d_1$ and $d_2$, such that $d = d_1 * d_2$, then this means that $d$ in fact already consisted of two unrelated parts.


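Definition 5.1 ($*$). Assume designs $d_1$ and $d_2$ where these designs have no names in common. If they happen to have names in common, one can perform a systematic renaming. Often we even assume that such renamings are done implicitly (cf. $\alpha$-conversion in lambda calculus). Let $d_1$ and $d_2$ respectively be given by

$$\begin{array}{ll}
x_1 := \mathbf{prim} \sqsubseteq M_1 & u_1 := \mathbf{prim} \sqsubseteq N_1 \\
\quad\vdots & \quad\vdots \\
x_{n_1} := \mathbf{prim} \sqsubseteq M_{n_1} & u_{n_2} := \mathbf{prim} \sqsubseteq N_{n_2} \\
y_1 := P_1 \sqsubseteq Q_1 & v_1 := A_1 \sqsubseteq B_1 \\
\quad\vdots & \quad\vdots \\
y_{l_1} := P_{l_1} \sqsubseteq Q_{l_1} & v_{l_2} := A_{l_2} \sqsubseteq B_{l_2} \\
\mathbf{system}\ [S_1, \ldots, S_{m_1}], & \mathbf{system}\ [T_1, \ldots, T_{m_2}]
\end{array}$$

where $P_1, \ldots, P_{l_1}$ are not equal to prim and where $A_1, \ldots, A_{l_2}$ are not equal to prim. Then we define $d_1 * d_2$ as

$$\begin{array}{l}
x_1 := \mathbf{prim} \sqsubseteq M_1 \\
\quad\vdots \\
x_{n_1} := \mathbf{prim} \sqsubseteq M_{n_1} \\
u_1 := \mathbf{prim} \sqsubseteq N_1 \\
\quad\vdots \\
u_{n_2} := \mathbf{prim} \sqsubseteq N_{n_2} \\
y_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
y_{l_1} := P_{l_1} \sqsubseteq Q_{l_1} \\
v_1 := A_1 \sqsubseteq B_1 \\
\quad\vdots \\
v_{l_2} := A_{l_2} \sqsubseteq B_{l_2} \\
\mathbf{system}\ [S_1, \ldots, S_{m_1}, T_1, \ldots, T_{m_2}].
\end{array}$$

$\Box$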

In the introduction of this section we already announced an algebra of designs. We shall now show two simple algebraic laws which state that the set of designs together with the operation $*$ constitutes a monoid. Strictly speaking, it is a partial monoid, but when we take into account that we can always perform a systematic renaming, we can also view it as a non-partial monoid.

Proposition 5.2 (Algebraic properties of $*$). Let $e = \mathbf{system}\ [\,]$. Then $d * e$ and $e * d$ are always defined and we have

(i) $d * e = e * d = d$,

(ii) $(d_1 * d_2) * d_3 = d_1 * (d_2 * d_3)$

provided in (ii) everything is defined, i.e. we have no name-clashes at the component level. $\Box$
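Definition 5.1 and the two monoid laws can be tried out on a small model. The following is a sketch under simplifying assumptions: terms and black-box descriptions are opaque strings, a design keeps its prim components and its other components in two separate tuples, and name clashes are rejected instead of being renamed away. The classes `Component` and `Design`, the function `star` and all component names are illustrative inventions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    impl: Optional[str]  # None stands for prim, otherwise an opaque term
    spec: str            # opaque black-box description


@dataclass(frozen=True)
class Design:
    prims: Tuple[Component, ...] = ()
    others: Tuple[Component, ...] = ()   # the non-prim components
    system: Tuple[str, ...] = ()

    def names(self) -> set:
        return {c.name for c in self.prims + self.others}


def star(d1: Design, d2: Design) -> Design:
    """Concatenation: prims of d1, prims of d2, then the other components."""
    if d1.names() & d2.names():
        raise ValueError("name clash: perform a systematic renaming first")
    return Design(d1.prims + d2.prims,
                  d1.others + d2.others,
                  d1.system + d2.system)


E = Design()  # the empty design e = system []

# Two hypothetical libraries and a third design, all with disjoint names.
d_dd = Design(prims=(Component("uart", None, "Q_uart"),),
              others=(Component("driver", "D(uart)", "Q_driver"),),
              system=("driver",))
d_ct = Design(others=(Component("rotate", "R", "Q_rotate"),),
              system=("rotate",))
d_3 = Design(prims=(Component("clock", None, "Q_clock"),))

assert star(d_dd, E) == d_dd == star(E, d_dd)                        # law (i)
assert star(star(d_dd, d_ct), d_3) == star(d_dd, star(d_ct, d_3))    # law (ii)
```

Because both laws reduce to properties of tuple concatenation in this model, the checks succeed for any designs with pairwise disjoint names, not just for the three shown.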


Let us consider the behaviour of * with respect to glass-box correctness (gbc) and black-box correctness (bbc). Because * means to take the union of the component sets of 'disjoint' designs, the following should not come as a surprise.

Proposition 5.3 (Correctness-preserving properties of $*$).

(i) $d_1$ and $d_2$ are gbc $\Leftrightarrow$ $d_1 * d_2$ is gbc,

(ii) $d_1$ and $d_2$ are bbc $\Leftrightarrow$ $d_1 * d_2$ is bbc.

provided we have no name clashes at the component level. $\Box$

The next operation is composition, denoted by $\circ$. It is related to functional composition, usually also denoted by $\circ$. Roughly speaking, $d_1 \circ d_2$ is obtained by appending $d_1$ to $d_2$, while replacing the prims of $d_1$ by the elements of the system of $d_2$. There is a notion of validation, by which we mean that we shall define when the composition of $d_1$ with $d_2$ is valid. Our terms 'valid' and 'validation' are consistent with [16].

This binary operation $\circ$ will play a key role in our discussion of validation and hence in our description of design evolution (section 6). Furthermore $\circ$ will be used in our discussion of parallel development (section 8).

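Definition 5.4 ($\circ$). Assume designs $d_1$ and $d_2$ where these designs have no names in common. Let $d_1$ and $d_2$ respectively be given by

$$\begin{array}{ll}
x_1 := \mathbf{prim} \sqsubseteq M_1 & z_1 := A_1 \sqsubseteq B_1 \\
\quad\vdots & \quad\vdots \\
x_n := \mathbf{prim} \sqsubseteq M_n & z_l := A_l \sqsubseteq B_l \\
y_1 := P_1 \sqsubseteq Q_1 & \mathbf{system}\ [S_1, \ldots, S_n]. \\
\quad\vdots & \\
y_m := P_m \sqsubseteq Q_m & \\
\mathbf{system}\ L, &
\end{array}$$

We assume that $P_1, \ldots, P_m$ are not prim, whereas some of the $A_i$ may be prim ($1 \leq i \leq l$). We define $d_1 \circ d_2$ as the design given by

$$\begin{array}{l}
z_1 := A_1 \sqsubseteq B_1 \\
\quad\vdots \\
z_l := A_l \sqsubseteq B_l \\
x_1 := S_1 \sqsubseteq M_1 \\
\quad\vdots \\
x_n := S_n \sqsubseteq M_n \\
y_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
y_m := P_m \sqsubseteq Q_m \\
\mathbf{system}\ L.
\end{array}$$

$\Box$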


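The design $d_1 \circ d_2$ can be viewed as a 'layered' design with layers $d_1$ and $d_2$. We get a highly intuitive view of the construction of $d_1 \circ d_2$ if we omit the keyword system in $d_2$ and the keywords prim in $d_1$ and write the system of $d_2$ as a column vector. We show this below. Write $d_2$ as

$$\begin{array}{l}
z_1 := A_1 \sqsubseteq B_1 \\
\quad\vdots \\
z_l := A_l \sqsubseteq B_l \\
\left[\begin{array}{c} S_1 \\ \vdots \\ S_n \end{array}\right]
\end{array}$$

and write $d_1$ as

$$\begin{array}{l}
x_1 := \underline{\quad} \sqsubseteq M_1 \\
\quad\vdots \\
x_n := \underline{\quad} \sqsubseteq M_n \\
y_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
y_m := P_m \sqsubseteq Q_m \\
\mathbf{system}\ L.
\end{array}$$

Now one can view the construction of $d_1 \circ d_2$ as a matter of plugging the system of $d_2$ into the hole corresponding to the prims of $d_1$. We have one simple algebraic law for $\circ$.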

Proposition 5.5 (Algebraic properties of $\circ$).

$$(d_1 \circ d_2) \circ d_3 = d_1 \circ (d_2 \circ d_3)$$

provided we have no name clashes at the component level. $\Box$

The following example shows an application we have in mind for the operation $\circ$.

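Example 5.6 Let us assume that we have a library consisting of two implemented components and that we have a design $d$ which must use this library. Let the library be given as a design $d_{lib}$. Let $d_{lib}$ and $d$ respectively be given as

$$\begin{array}{ll}
x_1 := P_1 \sqsubseteq Q_1 & x'_1 := \mathbf{prim} \sqsubseteq Q_1 \\
x_2 := P_2 \sqsubseteq Q_2 & x'_2 := \mathbf{prim} \sqsubseteq Q_2 \\
\mathbf{system}\ [x_1, x_2], & x_3 := P_3(x'_1, x'_2) \sqsubseteq Q_3 \\
 & x_4 := P_4(x'_1, x'_2, x_3) \sqsubseteq Q_4 \\
 & \mathbf{system}\ S(x'_1, x'_2, x_3, x_4).
\end{array}$$

The instantiation of $d$ with $d_{lib}$ can be described by the composition $d \circ d_{lib}$. We can simplify $d \circ d_{lib}$ to a design $d'$ which in a certain (semantical) sense is equivalent. The designs $d \circ d_{lib}$ and $d'$ are respectively given by

$$\begin{array}{ll}
x_1 := P_1 \sqsubseteq Q_1 & x_1 := P_1 \sqsubseteq Q_1 \\
x_2 := P_2 \sqsubseteq Q_2 & x_2 := P_2 \sqsubseteq Q_2 \\
x'_1 := x_1 \sqsubseteq Q_1 & x_3 := P_3(x_1, x_2) \sqsubseteq Q_3 \\
x'_2 := x_2 \sqsubseteq Q_2 & x_4 := P_4(x_1, x_2, x_3) \sqsubseteq Q_4 \\
x_3 := P_3(x'_1, x'_2) \sqsubseteq Q_3 & \mathbf{system}\ S(x_1, x_2, x_3, x_4). \\
x_4 := P_4(x'_1, x'_2, x_3) \sqsubseteq Q_4 & \\
\mathbf{system}\ S(x'_1, x'_2, x_3, x_4), &
\end{array}$$

$\Box$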

In the above example we see that the facts that the library is modelled as a design and that the components using it are modelled as a design introduce some overhead. In the example this overhead consists of the components $(x'_1 := x_1 \sqsubseteq Q_1)$ and $(x'_2 := x_2 \sqsubseteq Q_2)$. However the additional components can easily be removed, as shown by the example. The feasibility of this approach may depend on the existence of an abbreviation facility which should provide for 'global' abbreviations, just like the LET constructs of COLD-K. If there is no such abbreviation facility, then the overhead due to the duplication of black-box descriptions (the $Q_1$ and $Q_2$ in the example) may become too large.
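The construction of definition 5.4, the instantiation of example 5.6 and the associativity law of proposition 5.5 can be replayed on a small model. This is a sketch only: terms and black-box descriptions are opaque strings, prim components are kept apart from the other components, and the names `Component`, `Design` and `compose` as well as the three designs used for the associativity check are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    impl: Optional[str]  # None stands for prim, otherwise an opaque term
    spec: str            # opaque black-box description


@dataclass(frozen=True)
class Design:
    prims: Tuple[Component, ...] = ()
    others: Tuple[Component, ...] = ()
    system: Tuple[str, ...] = ()

    def names(self) -> set:
        return {c.name for c in self.prims + self.others}


def compose(d1: Design, d2: Design) -> Design:
    """d1 o d2: plug the system elements of d2 into the prims of d1."""
    if d1.names() & d2.names():
        raise ValueError("name clash: perform a systematic renaming first")
    if len(d1.prims) != len(d2.system):
        raise ValueError("need one system element of d2 per prim of d1")
    plugged = tuple(Component(x.name, s, x.spec)
                    for x, s in zip(d1.prims, d2.system))
    return Design(d2.prims, d2.others + plugged + d1.others, d1.system)


# Example 5.6: a library d_lib and a design d that uses it.
d_lib = Design(others=(Component("x1", "P1", "Q1"),
                       Component("x2", "P2", "Q2")),
               system=("x1", "x2"))
d = Design(prims=(Component("x1'", None, "Q1"),
                  Component("x2'", None, "Q2")),
           others=(Component("x3", "P3(x1',x2')", "Q3"),
                   Component("x4", "P4(x1',x2',x3)", "Q4")),
           system=("S(x1',x2',x3,x4)",))

inst = compose(d, d_lib)
# The overhead components (x1' := x1 with Q1) and (x2' := x2 with Q2) appear.
assert Component("x1'", "x1", "Q1") in inst.others
assert Component("x2'", "x2", "Q2") in inst.others

# Associativity on three hypothetical one-layer designs.
low = Design(prims=(Component("g", None, "G"),),
             others=(Component("a", "A(g)", "QA"),), system=("a",))
mid = Design(prims=(Component("b", None, "QA"),),
             others=(Component("c", "C(b)", "QC"),), system=("c",))
high = Design(prims=(Component("e", None, "QC"),),
              others=(Component("f", "F(e)", "QF"),), system=("f",))
assert compose(compose(high, mid), low) == compose(high, compose(mid, low))
```

The sketch does not perform the simplification of $d \circ d_{lib}$ to $d'$; that step relies on the semantical equivalence mentioned in the example.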

In [10] it is shown that at a semantical level the so-called interchange law holds, i.e. $[\![(d_1 * d_2) \circ (d_3 * d_4)]\!] = [\![(d_1 \circ d_3) * (d_2 \circ d_4)]\!]$. We shall not go any further into this equation since the interpretation $[\![\ ]\!]$ requires the theory of the $\lambda\pi$-calculus which is outside the scope of this paper.

Very much in the same way as we have two notions of correctness for designs, we shall have two different definitions of 'valid', viz. glass-box valid and black-box valid, written as $\mathit{gbv}(d_1, d_2)$ and $\mathit{bbv}(d_1, d_2)$, respectively.

Definition 5.7 Assume designs $d_1$ and $d_2$ where these designs have no names in common and let $d_1$ and $d_2$ be given as in 5.4.

(i) the pair $(d_1, d_2)$ is glass-box valid (notation $\mathit{gbv}(d_1, d_2)$) if for all $i$

$$\Gamma \vdash S_i \sqsubseteq M_i$$

where $\Gamma := [z_1 \sqsubseteq B_1], \ldots, [z_h \sqsubseteq B_h], [z_{h+1} = A_{h+1}], \ldots, [z_l = A_l]$, assuming that $h$ is the number of prim components of $d_2$.

(ii) the pair $(d_1, d_2)$ is black-box valid (notation $\mathit{bbv}(d_1, d_2)$) if for all $i$

$$\Delta \vdash S_i \sqsubseteq M_i$$

where $\Delta := [z_1 \sqsubseteq B_1], \ldots, [z_l \sqsubseteq B_l]$. $\Box$

Before actually employing these notions gbv and bbv for formulating the correctness-preserving properties of $\circ$, we shall take a look at a few simple properties.

The proposition that for all $d_1$, $d_2$ we have $\mathit{bbv}(d_1, d_2) \Rightarrow$ ($d_1$, $d_2$ are bbc) just does not hold in general. This is because bbv is about the 'plug-ability' of $d_2$ with respect to $d_1$ rather than about the internals of $d_1$ and $d_2$. Similarly glass-box validation does not imply glass-box correctness. As a non-trivial property we have $\mathit{bbv}(d_1, d_2) \Rightarrow \mathit{gbv}(d_1, d_2)$, provided $d_2$ is glass-box correct.

Next we investigate the behaviour of $\circ$ with respect to gbc and bbc. Intuitively it is clear that the validation conditions gbv and bbv will play a role here.

Proposition 5.8 (Correctness-preserving properties of $\circ$).

(i) ($d_1$, $d_2$ are gbc $\wedge$ $\mathit{gbv}(d_1, d_2)$) $\Rightarrow$ $d_1 \circ d_2$ is gbc,

(ii) ($d_1$, $d_2$ are bbc $\wedge$ $\mathit{bbv}(d_1, d_2)$) $\Leftrightarrow$ $d_1 \circ d_2$ is bbc.

provided $d_1 \circ d_2$ is defined. $\Box$

The proposition that we have $d_1 \circ d_2$ is gbc $\Rightarrow$ $d_1$, $d_2$ are gbc just does not hold in general [10].

Next we present two simple unary operations called bot and top. We would like to isolate the parts of $d_1$ which for given $d_2$ play a role in $\mathit{bbv}(d_1, d_2)$ and the parts of $d_2$ which for given $d_1$ play a role in $\mathit{bbv}(d_1, d_2)$. We cast these parts in the form of designs.

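Definition 5.9 (bot, top). Let $d$ be given as

$$\begin{array}{l}
x_1 := \mathbf{prim} \sqsubseteq M_1 \\
\quad\vdots \\
x_n := \mathbf{prim} \sqsubseteq M_n \\
y_1 := P_1 \sqsubseteq Q_1 \\
\quad\vdots \\
y_m := P_m \sqsubseteq Q_m \\
\mathbf{system}\ [S_1, \ldots, S_l]
\end{array}$$

where $P_1, \ldots, P_m$ are not equal to prim. We define $\mathit{bot}(d)$ and $\mathit{top}(d)$ respectively as the designs

$$\begin{array}{ll}
x_1 := \mathbf{prim} \sqsubseteq M_1 & x_1 := \mathbf{prim} \sqsubseteq M_1 \\
\quad\vdots & \quad\vdots \\
x_n := \mathbf{prim} \sqsubseteq M_n & x_n := \mathbf{prim} \sqsubseteq M_n \\
\mathbf{system}\ [\,], & y_1 := \mathbf{prim} \sqsubseteq Q_1 \\
 & \quad\vdots \\
 & y_m := \mathbf{prim} \sqsubseteq Q_m \\
 & \mathbf{system}\ [S_1, \ldots, S_l]
\end{array}$$

where it is understood that in $\mathit{top}(d)$ only those components are retained whose name ($x_i$ for some $i$ with $1 \leq i \leq n$ or $y_j$ for some $j$ with $1 \leq j \leq m$) occurs in the system $[S_1, \ldots, S_l]$. We call $\mathit{bot}(d)$ the bottom of $d$ and we call $\mathit{top}(d)$ the top of $d$. $\Box$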

$\mathit{bot}(d)$ and $\mathit{top}(d)$ are not related to the idea of having terms $\bot$ and $\top$ which are minimal and maximal with respect to $\sqsubseteq$ in $\lambda\pi$-calculus. A motivation of the terms bottom and top will be given below.


Before actually employing these new operations in connection with validation and correctness issues, we shall take a look at a few simple properties. Let $e = \mathbf{system}\ [\,]$, then for all $d$, $\mathit{bot}(\mathit{bot}(d)) = \mathit{bot}(d)$, $\mathit{top}(\mathit{top}(d)) = \mathit{top}(d)$ and $\mathit{top}(\mathit{bot}(d)) = e$. The following properties relate the unary operations bot and top with the binary operations $*$ and $\circ$. For all designs $d_1$ and $d_2$ we have

(i) $\mathit{bot}(d_1 * d_2) = \mathit{bot}(d_1) * \mathit{bot}(d_2)$,

(ii) $\mathit{top}(d_1 * d_2) = \mathit{top}(d_1) * \mathit{top}(d_2)$,

(iii) $\mathit{bot}(d_1 \circ d_2) = \mathit{bot}(d_2)$,

(iv) $\mathit{top}(d_1 \circ d_2) = \mathit{top}(d_1)$.

provided everything is defined. A design $d$ for which $\mathit{bot}(d) = d$ is called a bottom design and a design $d$ for which $\mathit{top}(d) = d$ is called a top design. We have chosen the terms bottom and top because in our view these notions are related to the top-down and bottom-up models of the software development process. For example in a top-down development process one starts with a given top design and then adds components and implements components, until the remaining prim components correspond to the primitives which are actually available. In a bottom-up development process one starts with a given bottom design and then adds components and adds system elements until the resulting design meets the actual requirements of the user of the design. We shall take a closer look at the top-down and bottom-up models of the software development process in section 7.

The following proposition confirms our intuition that $\mathit{bot}(d)$ is precisely the part of $d$ which is relevant for $\mathit{bbv}(d, d_1)$ and that $\mathit{top}(d)$ is precisely the part of $d$ which is relevant for $\mathit{bbv}(d_2, d)$.

Proposition 5.10 Consider designs $d$, $d_1$ and $d_2$.

(i) $\mathit{bbv}(d, d_1) \Leftrightarrow \mathit{bbv}(\mathit{bot}(d), d_1)$,

(ii) $\mathit{bbv}(d_2, d) \Leftrightarrow \mathit{bbv}(d_2, \mathit{top}(d))$. $\Box$

The following fallacy is essentially due to the fact that glass-box correct designs do not offer implementation freedom.

Remark 5.11 The proposition that for designs $d_1$ and $d_2$ we have

$$\mathit{gbv}(d_1, d_2) \Rightarrow \mathit{gbv}(d_1, \mathit{top}(d_2))$$

just does not hold in general. $\Box$

Sometimes it is convenient to introduce an equivalence relation, $=_{pp}$ say, on designs where we say that $d'$ is a prim permutation of $d$, notation $d =_{pp} d'$, if $d'$ can be obtained from $d$ by permuting the order of the prim components. See [10].


We shall end this section with a summary of the algebra of designs. The signature of the algebra of designs is shown by the picture given below. We did not include all predicates. For example, the predicate =pp is not shown. The collection of all designs is shown as a circle. For the sake of the picture we view predicates as functions to Boolean values. The set of Boolean values is shown as a circle named Bool. The operations on designs are shown as arrows. The arrow which has been labelled ( ) corresponds with the possibility of constructing a design directly, without using algebraic operations on designs. The constant e is given in proposition 5.2.

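[Figure: signature of the algebra of designs; the arrows on the collection of designs are labelled $*$, $\circ$, bot and top.]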

6 Design Evolution

Now we turn our attention to strategies ('design programs') for design manipulation. We shall use a very simple design-development language to denote these design programs. The design programs will be highly non-deterministic. The data that are manipulated by the developer(s) during the execution of a design program may include designs, components, terms (denoting modules) and names. In this section we view these data as belonging to given data types which bring with them certain predicates and operations. In particular, we have predicates bbc, bbv, = etc. and operations $*$, $\circ$, bot, top, etc.

We shall briefly sketch the main ingredients of the design-development language. For the details we refer to [10]. In the design-development language we have assignment statements (an example of an assignment statement is $d', d'' := \mathit{bot}(d), \mathit{top}(d);$). Statements can be composed with sequential composition, a non-deterministic choice construct (symbol $\Box$) and a repetition construct (keywords while, do and od). We have expressions and expression lists. Expressions may contain operation symbols and variables (an example of an expression is $\mathit{top}(d)$). Expression lists may contain operation symbols, variables and procedure calls. Assertions may contain expressions, predicate symbols, logical connectives and quantifiers (an example of an assertion is forall $d$ ($\mathit{bbc}(d)$)). Procedures have a list of input parameters and sometimes a list of result parameters. A procedure is either given axiomatically (keywords pre and post) or it is defined explicitly (keyword def). If a procedure is intended to be executed by a developer, it is called a technique (keyword technique). If a procedure is meant as a description of an event which is not performed by a developer, it is called an event (keyword event).

We assume that in the design-development language we have variables of distinct sorts: $d, \ldots$ for well-formed designs, $c, \ldots$ for components, $v, w, \ldots$ for names and $P, Q, \ldots$ for terms (denoting modules).

In order to have a systematic approach for deriving design programs we shall use methods which come from the field of classical sequential programming [19]. In particular, if we want to derive a design program with a repetition construct, then we shall first look for a suitable invariant.

We shall consider the partial correctness of these design programs and we shall have partial correctness formulae of the form $\{A_1\}\, s\, \{A_2\}$. We shall not use an axiomatisation in the style of Hoare's logic [18] for reasoning about these formulae $\{A_1\}\, s\, \{A_2\}$. Of course this could be done, but we think it would push the level of formalisation too far.

We are interested in design-programs which describe the manipulation of designs which are valid in a given context. Therefore we must start with a formalisation of what it means if a design is valid in a given context.

Definition 6.1 (machine&user context, $d$ valid in $w$.)

(i) A machine&user context $w$ is a pair $(d_m, d_s)$ of designs where $d_m$ is called the machine of $w$ and $d_s$ is called the system user of $w$.

(ii) Let $w = (d_m, d_s)$, then we say that $d$ is valid in $w$ if $\mathit{bbv}(d, d_m) \wedge \mathit{bbv}(d_s, d)$. $\Box$

In a certain way, a machine&user context $w$ constitutes a (simplified) view of the external world, at least from the viewpoint of a developer who has to create a design which is valid in $w$. The definition can be motivated as follows. The prim components of $d$ can be viewed as a specification of all building blocks that are needed by $d$. When the product described by the design $d$ becomes somehow operational, then the actual building blocks are provided. We view such a collection of actual building blocks as an underlying 'machine'. The design $d$ itself can be viewed as a description of a product to be delivered to the user of $d$. In general this user has certain requirements with regard to the product described by the design. In our view both providing a machine for a design and providing a product to its user can be described by the binary operation $\circ$. In particular, the components provided by the machine can (via $\circ$) be plugged into the prims of the design. Similarly we can imagine that the system user is assuming a number of primitives (also prims) which become available by means of the system of the design.

Of course, in many practical situations it is not the case that the system user and the machine are formalised as designs. Often the system user is not formalised at all and validation becomes a matter of informal reasoning and negotiating. Nevertheless, we believe that our abstraction might provide some insight for such situations also.


If we call the activity of showing that a design d is (black-box) correct verification and if we call the activity of showing that a design d is valid in a machine&user context w validation, then our terminology is consistent with the usual terminology [16]: verification = 'are we building the product right?' and validation = 'are we building the right product?'.

Let us illustrate this with the example of the four-bit adder of section 3. In this example the machine&user context consists of two 'designs'. The first is the design of the 7400 and GND; typically this is a design where transistors and resistors occur as components. This 7400 + GND design is the underlying machine. The second is the design of a higher layer, where the 74283 four-bit adder is just a primitive building block; typically this could be the design of a digital computer. This digital-computer design is the system-user.

We see that the validity of a design $d$ in a given machine&user context $w$ depends only on $\mathit{bot}(d)$, $\mathit{top}(d)$ and on $w$. From an intuitive point of view, this observation can be explained as follows. First of all $\mathit{bot}(d)$ contains precisely all prim components of $d$. The prim components of $d$ can be viewed as a summary of all building blocks that are needed by $d$. Secondly $\mathit{top}(d)$ has the same system as $d$ and $\mathit{top}(d)$ contains a number of prim components which can be viewed as specifications of the names occurring in the system of $d$. The system of $d$ (with these names eliminated) can be viewed as the top-level product to be delivered to the user of $d$.

In general, it is the task of the developer to (re)establish an invariant, INV say, which depends both on the design $d$ and on the machine&user context $(d_m, d_s)$. This invariant should consist of two parts where the first part deals with validation and the second part deals with verification. Thus INV must express that $d$ is valid in the machine&user context $(d_m, d_s)$ and that $d$ is black-box correct and therefore we define it as follows:

INV := $\mathit{bbv}(d, d_m)$ and $\mathit{bbv}(d_s, d)$ and $\mathit{bbc}(d)$.

In section 7 we shall take a look at design creation, but it would be wrong to assume that in realistic software development the 'machine&user-context', in which the design is supposed to be valid, is always a constant machine&user context. In current software development practice it may very well be the case that about 50% of the development costs of a product are spent on 'maintenance' [16]. Traditionally, maintenance was classified into software update and software repair, where software repair includes a corrective aspect (see e.g. [16] page 536). In this paper we shall not investigate this corrective aspect. Furthermore, from now on we shall use the term 'design evolution' (because the term 'maintenance' suggests that there might be something like 'wear', which of course is not the case for software products).

We consider design evolution to be the evolution of a design in a changing machine&user context. The developer operates on a variable design which is part of a global state. Also part of this global state is a variable machine&user context $w = (d_m, d_s)$. The machine&user context is modified, let us say at certain points in time. We formalise this view by assuming that there are three variables $d_m$, $d_s$ and $d$. We have the following intuitions for these variables: $d_m$ = 'current machine', $d_s$ = 'current system user', $d$ = 'current design'. It is the task of the developer to re-establish the invariant INV.

The following scenario is adopted. We assume a state in which $d_m$, $d_s$ and $d$ are such that INV holds. We now assume that the machine&user context of the next state has been modified and we say that an external event has happened. The developer must find a design $d'$ such that after establishing the state modification $d := d';$ the invariant INV holds again. The developer may do this by acting according to some technique. Let us assume that the change of the machine&user context is such that either the machine is modified or the system-user is modified, but not both. We shall discuss both kinds of machine&user-context-change separately. It is possible to devise many techniques addressing the problem of design evolution; we only indicate some of the simplest techniques. We first discuss a technique which deals with a changing machine.

The following procedure can be used for modelling an external event:

change := event $d \mapsto d'$
    pre true
    post true

After the external event $d_m := \mathit{change}(d_m);$ has happened, there is a new machine, but there is still the old system user. The condition $\mathit{bbv}(d, d_m)$ may be false, but $\mathit{bbv}(d_s, d)$ and $\mathit{bbc}(d)$ hold. The developer must restore the invariant and therefore we look for a suitable statement $s_m$ such that

$$\{\mathrm{INV}\}\ d_m := \mathit{change}(d_m);\ s_m\ \{\mathrm{INV}\}.$$

One possible technique is based on the idea of an emulator, which is nothing but a 'layer' interpolating between the new machine and the old design. Finding an emulator is described by the following technique which takes designs $d_m$ (the new machine) and $d$ (the old design) and yields an emulator $d_{em}$.

emulator := technique $d_m, d \mapsto d_{em}$
    pre true
    post $\mathit{bbc}(d_{em})$ and $\mathit{bbv}(d_{em}, d_m)$ and $\mathit{bbv}(d, d_{em})$

Now if $\mathit{emulator}(d_m, d)$ yields $d_{em}$, then we might say that $d_{em} \circ d_m$ is '$d$-equivalent' with the old machine, by which we mean that $\mathit{bbv}(d, d_{em} \circ d_m)$ holds. Thus it is possible to make the old design 'run' upon $d_{em} \circ d_m$ by constructing the composition $d \circ (d_{em} \circ d_m)$. By proposition 5.5 (associativity of $\circ$) this is the same as $(d \circ d_{em}) \circ d_m$. If we assume that the developer has modification rights with respect to $d$ but not with respect to the machine, he should replace the old design $d$ by $(d \circ d_{em})$. This indicates that we can take the following statement for $s_m$:

$d := d \circ \mathit{emulator}(d_m, d);$

Let us illustrate this with the example of the four-bit adder of section 3. Suppose that the supply of 7400 nand-gates becomes exhausted, whereas there is a rich supply of 7402 nor-gates, say. Then we could have an emulator design $d_{em}$ as follows:

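$$\begin{array}{lll}
7402 & := \mathbf{prim} & \sqsubseteq (z = \overline{a + b}) \\
\mathrm{GND} & := \mathbf{prim} & \sqsubseteq (g = 0) \\
7400 & := 7400\mathrm{IMPL} & \sqsubseteq (z = \overline{a.b}) \\
\mathbf{system}\ [\,7400, \mathrm{GND}\,] & &
\end{array}$$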

where 7400IMPL implements the functionality of the nand-gate using nor-gates and ground-connections only.
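One way to realise such a 7400IMPL can be checked by exhausting the truth table. The wiring below (four nor-gates, three of them with one input grounded) is an assumption of this sketch; the text does not say how 7400IMPL is actually built, and the function names are illustrative.

```python
from itertools import product


def nor(a: int, b: int) -> int:
    """Behaviour of one 7402 nor-gate: z is the complement of a+b."""
    return 1 - (a | b)


GND = 0  # black-box description of GND: g = 0


def nand_from_nor(a: int, b: int) -> int:
    """Hypothetical 7400IMPL: nor-gates and ground-connections only."""
    not_a = nor(a, GND)            # complement of a
    not_b = nor(b, GND)            # complement of b
    a_and_b = nor(not_a, not_b)    # De Morgan: this equals a.b
    return nor(a_and_b, GND)       # complement of a.b


def nand_spec(a: int, b: int) -> int:
    """Black-box description of the 7400: z is the complement of a.b."""
    return 1 - (a & b)


# Exhaustive check over all four input combinations.
for a, b in product((0, 1), repeat=2):
    assert nand_from_nor(a, b) == nand_spec(a, b)
```

Such a check corresponds to the local correctness of the component 7400 in the emulator design; it says nothing about the validation conditions, which concern how the emulator fits between the old design and the new machine.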

Proposition 6.2 $\{\mathrm{INV}\}\ d_m := \mathit{change}(d_m);\ d := d \circ \mathit{emulator}(d_m, d);\ \{\mathrm{INV}\}$ $\Box$

The case of a changing system user can be treated very much along the same lines as the case of a changing machine. One possible technique is based on the idea of a simulator.

7 Design Creation

Now we turn our attention to the problem of design creation, which in its most general form is to create a design $d$ such that INV is established. In order to keep things simple, we shall study the problem of design creation in a restricted setting where we focus on the verification aspect. Therefore we assume that before the actual execution of a design program starts, the boundaries of the design to be created are already fixed, by which we mean that somehow the bottom and the top of the design to be created are determined. Since the bottom and the top of a design are designs themselves, we can model this situation by assuming that there are two given designs $d_b$ and $d_t$. We consider the top as given up to permutation of prim components. Therefore we have the following postcondition of the design creation.

POST := $\mathit{bot}(d) = d_b$ and $\mathit{top}(d) =_{pp} d_t$ and $\mathit{bbc}(d)$.

It is possible to give a criterion for the selection of $d_b$ and $d_t$. We see that if the machine&user context is $(d_m, d_s)$, then $d_b$ and $d_t$ should be chosen such that $\mathit{bbv}(d_b, d_m)$ and $\mathit{bbv}(d_s, d_t)$ hold. Of course it is always possible to derive a $d_b$ and $d_t$ from $(d_m, d_s)$ mechanically, but our approach is somewhat more general.

Let us illustrate this with the example of the four-bit adder of section 3. Indeed, in this example we have chosen a certain $d_t$, viz. the design with one prim component 74283 and with system [ 74283 ]. Also we have chosen a certain $d_b$, viz. the design with two prim components 7400 and GND and with system [].


It is not reasonable to assume that the developer can create a large design in one step; instead he adds components and modifies existing components, one at a time. Therefore we assume that there is one variable d which always contains the 'current design' and we shall focus on design programs which contain a repetition construct.

There are many (loop) invariants which could be derived from POST. We shall investigate two possibilities. The first possibility will be investigated below and leads to the derivation of a design-program which corresponds to the top-down approach. The second possibility will be investigated after that, leading to the derivation of a design program which corresponds to the bottom-up approach. These design programs are related to the models of the development process given in [17].

We begin with top-down development. In order to obtain an invariant, we take the postcondition POST as a starting point. POST consists of three conjuncts and a candidate invariant is obtained by simply omitting the first conjunct (as suggested in [19] section 16.2). This yields $\mathit{top}(d) =_{pp} d_t$ and $\mathit{bbc}(d)$, i.e. during the development process the top of the design remains constant (up to permutation of prim components) and furthermore black-box correctness is adopted as a methodological principle. We strengthen this assertion by requiring that all components (except possibly those in $d_b$) play a role in the system of the design $d$. In order to formulate this precisely we need an auxiliary definition:

Definition 7.1 Let $d$ be a design.

(i) $\mathit{cset}(d)$ denotes the set of component names of $d$ and $\mathit{sys}(d)$ denotes the set of component names that occur in the system of $d$.

(ii) The binary relation $<^1_d$ on the set of component names of $d$ is defined by $x_1 <^1_d x_2 :\Leftrightarrow x_1$ occurs in the implementation of the component named $x_2$.

(iii) The binary relation $<_d$ is defined as the transitive closure of $<^1_d$.

(iv) Let $\gamma$ be some subset of $\mathit{cset}(d)$, then $x_1 \leq_d \gamma :\Leftrightarrow x_1 \in \gamma \vee \exists x_2 \in \gamma \cdot x_1 <_d x_2$. $\Box$

Roughly speaking, we can view a design d as a set of components, identified by names, where some components are part of other components. Therefore we shall sometimes refer to <d as the 'part of' relation.

Now we can express the requirement that all components (except possibly those in $d_b$) play a role in the system of the design $d$ as a condition forall $v$ ($v \in \mathit{cset}(d) \rightarrow (v \in \mathit{cset}(d_b)$ or $v \leq_d \mathit{sys}(d))$). This condition guarantees that no implementation effort is spent on components which will not be used. This yields the following invariant:

TD_INV := $\mathit{top}(d) =_{pp} d_t$ and
    $\mathit{bbc}(d)$ and
    forall $v$ ($v \in \mathit{cset}(d) \rightarrow (v \in \mathit{cset}(d_b)$ or $v \leq_d \mathit{sys}(d))$).
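The third conjunct of the invariant is a reachability condition and can be computed directly from definition 7.1. The following sketch abbreviates a design to a dictionary that maps each component name to the names occurring in its implementation (an empty tuple for a prim component). That representation, the function names and the dependencies listed for 74283 are assumptions of the sketch; only the dependencies of 7404 and 7408 follow the text.

```python
from typing import Dict, Set, Tuple

Impl = Dict[str, Tuple[str, ...]]   # name -> names used by its implementation
Pairs = Set[Tuple[str, str]]


def direct_part_of(impl: Impl) -> Pairs:
    """Pairs (x1, x2) such that x1 occurs in the implementation of x2."""
    return {(x1, x2) for x2, used in impl.items() for x1 in used}


def transitive_closure(rel: Pairs) -> Pairs:
    closure = set(rel)
    while True:
        extra = {(a, d) for (a, b) in closure
                 for (c, d) in closure if b == c} - closure
        if not extra:
            return closure
        closure |= extra


def leq(x1: str, gamma: Set[str], part_of: Pairs) -> bool:
    """x1 is in gamma, or x1 is 'part of' some member of gamma."""
    return x1 in gamma or any((x1, x2) in part_of for x2 in gamma)


def all_components_used(impl: Impl, sys_names: Set[str],
                        bottom_names: Set[str]) -> bool:
    """Third conjunct of TD_INV for the abbreviated design."""
    part_of = transitive_closure(direct_part_of(impl))
    return all(v in bottom_names or leq(v, sys_names, part_of) for v in impl)


adder: Impl = {
    "7400": (), "GND": (),
    "7404": ("7400", "GND"),      # 7404IMPL2: nand-gates and GND
    "7408": ("7400", "7404"),     # 7408IMPL: one nand-gate, one inverter
    "74283": ("7408",),           # hypothetical, heavily abbreviated
}
bottom = {"7400", "GND"}

assert all_components_used(adder, {"74283"}, bottom)

# A component that nothing in the system depends on violates the condition.
with_spare = dict(adder, **{"7432": ("7400",)})
assert not all_components_used(with_spare, {"74283"}, bottom)
```

The closure is computed by naive iteration, which is adequate for designs of this size.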


Now we can develop a design program based on this invariant. The technique td given below has two input parameters (db and dt). It is given by an explicit definition and it uses one variable (d). After execution of an initialisation statement and a repetition construct, it yields the value of d as its result.

td := technique $d_b, d_t$
    def $d := d_t;$
        while not $\mathit{bot}(d) = d_b$ do $d := \mathit{td\_step}(d);$ od;
        $d$

where td_step satisfies the partial-correctness assumption $\{\mathrm{TD\_INV} \wedge \mathit{bot}(d) \neq d_b\}\ d := \mathit{td\_step}(d);\ \{\mathrm{TD\_INV}\}$.
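The control structure of td can be mimicked in a few lines. The sketch below is a toy under loud assumptions: a design is a dictionary from component names to the names used by the implementation (None for a prim component), the bottom is abbreviated to the set of prim names, black-box descriptions and correctness are not modelled at all, and the step function draws implementations from a hypothetical catalogue whose entry for 74283 is invented for brevity.

```python
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Design = Dict[str, Optional[Tuple[str, ...]]]   # None stands for prim


def bot(d: Design) -> FrozenSet[str]:
    """Toy bottom: just the names of the prim components."""
    return frozenset(name for name, impl in d.items() if impl is None)


def td(d_b: FrozenSet[str], d_t: Design,
       td_step: Callable[[Design], Design]) -> Design:
    d = dict(d_t)                  # d := d_t;
    while not bot(d) == d_b:       # while not bot(d) = d_b do
        d = td_step(d)             #     d := td_step(d);
    return d                       # od; d


# Hypothetical catalogue of implementations, keyed by component name.
CATALOGUE = {
    "74283": ("7408",),
    "7408": ("7400", "7404"),
    "7404": ("7400", "GND"),
}


def catalogue_step(d: Design) -> Design:
    """Implement one prim component and add the prims it newly needs."""
    target = next(n for n, impl in d.items()
                  if impl is None and n in CATALOGUE)
    new = dict(d)
    new[target] = CATALOGUE[target]
    for used in CATALOGUE[target]:
        new.setdefault(used, None)   # new names enter as prim components
    return new


d_t: Design = {"74283": None}
d_b = frozenset({"7400", "GND"})
result = td(d_b, d_t, catalogue_step)
assert bot(result) == d_b
```

As in the text, only partial correctness is at stake: if the catalogue cannot reduce the prims to the agreed bottom, `catalogue_step` fails instead of returning a design.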

Let us illustrate this with the example of the four-bit adder of section 3. Recall that in the example the assignment $d := d_t$ was in fact executed, viz. at the point where we said "This is the initial design of the top-down development." Recall also that in the example we had several points where we said "In order to decompose the . . . , we shall employ the . . . " etc.; each such point marks the beginning of the execution of $d := \mathit{td\_step}(d);$. Finally recall also that in the example we reached a point where we could say: "The resulting design is finished because its bottom equals the agreed bottom design with 7400 and GND." This corresponds with a positive outcome of the stop-criterion $\mathit{bot}(d) = d_b$.

Remark 7.2 Under the given assumption for td_step we have

$$\{\mathit{bot}(d_b) = d_b \text{ and } \mathit{top}(d_t) = d_t\}\ d := \mathit{td}(d_b, d_t);\ \{\mathrm{POST}\}$$

(where the precondition simply expresses that $d_b$ is a bottom design and that $d_t$ is a top design). $\Box$

We now turn our attention to td_step. It turns out that there exist several techniques which satisfy the assumption for td_step. Therefore we shall investigate techniques which we shall call td_step0, td_step1 etc.

In [10] a formal treatment of techniques td_step0 and td_step1 is given. Due to space limitations, we shall restrict ourselves to an informal sketch of these.

We assume that there is an operation 'insert' which serves for inserting a component into a design by overwriting an existing component (which has the same name as the component to be inserted).

A first technique td_step0 could follow a naive approach. The idea is as follows: use an auxiliary technique, td_impl say, which takes a design d, selects its last prim component and transforms it into a black-box correct non-prim component.

td_step0 := technique d
  def insert(d, td_impl(d))

Let us illustrate this with the example of the four-bit adder of section 3. Indeed, at a certain point we said: "To decompose the 7408 and-gate we need no new components. It can be done easily with one 7400 nand-gate and one 7404 inverter. We refer to the following interconnection diagram as 7408IMPL", etc. Formally this corresponds to a last prim component (7408 := prim ⊑ (z = x.y)) which is transformed into (7408 := 7408IMPL ⊑ (z = x.y)). The latter component must be shown to be correct in a context with three prim components, viz. GND, 7400 and 7404.

The approach of td_step0 is naive because it does not allow for the creation and insertion of new prim components. We shall now describe an improvement with respect to td_step0. We could define a technique which describes the creation and insertion of one new prim component; however, we shall not do this because it is essential for the top-down approach as expressed by TD_INV that no new prim component is introduced unless it is used immediately. We prefer a technique which describes both the creation of new prim components and the transformation of an existing prim component into a non-prim component whose implementation uses all new prim components. This leads us to td_step1, which is an improved version of td_step0. It uses an auxiliary technique called td_spec_impl which performs the selection of the last prim component, the creation of a set of new prim components and the transformation of the selected prim component into a non-prim component. The set of new prim components is represented as a bottom design (d').

We can use the binary operation * of section 5 for describing the addition of the new prim components to the current design.

td_step1 := technique d
  def d', c' := td_spec_impl(d);
      insert(d' * d, c')

Let us illustrate this with the example of the four-bit adder of section 3. At some point of that development we said: "In order to decompose the 74183 single-bit full adder, we employ . . . which are provided by new components 7408, 7432 and 7486, respectively." Then these new components were viewed as a mini-design d' and we concatenated d' with d and furthermore we said: "Secondly, we must insert a new version of 74183 which has 74183IMPL instead of prim." In fact this was an example of the execution of d', c' := td_spec_impl(d); insert(d' * d, c').
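The 74183 step just described can be replayed in a small Python sketch. It rests on several assumptions that are not in the formal treatment of [10]: designs are modelled as tuples of (name, implementation) pairs with None standing for prim, the operation * is modelled by tuple concatenation, and the developer's design decision is replaced by a hypothetical lookup table called catalogue. Freshness of the new prim names is assumed rather than checked.

```python
from typing import Dict, List, Optional, Tuple

Component = Tuple[str, Optional[str]]   # (name, implementation); None stands for prim
Design = Tuple[Component, ...]
Catalogue = Dict[str, Tuple[str, List[str]]]  # hypothetical: name -> (impl, new prims)


def insert(d: Design, c: Component) -> Design:
    """Overwrite the component of d that has the same name as c."""
    if c[0] not in [name for name, _ in d]:
        raise KeyError("no component named " + c[0] + " to overwrite")
    return tuple(c if name == c[0] else (name, impl) for name, impl in d)


def td_spec_impl(d: Design, catalogue: Catalogue) -> Tuple[Design, Component]:
    """Select the last prim component of d; return the new prim components d'
    and the implemented version c' of the selected component."""
    name = next(n for n, impl in reversed(d) if impl is None)
    impl_name, new_prims = catalogue[name]
    d_new = tuple((p, None) for p in new_prims)
    return d_new, (name, impl_name)


def td_step1(d: Design, catalogue: Catalogue) -> Design:
    d_new, c_new = td_spec_impl(d, catalogue)
    return insert(d_new + d, c_new)   # '+' on tuples models d' * d
```

With d = (("GND", None), ("74183", None)) and a catalogue entry mapping "74183" to ("74183IMPL", ["7408", "7432", "7486"]), one call of td_step1 puts the three new prim components in front of d and replaces the prim version of 74183 by the implemented one.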

td_step1 raises two related problems. The first problem is that the developer has no choice in selecting the prim component to be implemented (although he can influence later choices by thinking 'in advance'). For the prim components which are in the top design dt he has no influence at all on the order in which they are selected. The second problem is that it may be hard to make sure that the order of the prim components will match the order of the prim components in the bottom design db. We shall remedy these problems by describing the possibility that the developer modifies the order of the prim components. We formalise this by defining a technique td_step2.

td_step2 := technique d → d'
  pre true
  post d' =pp d

Let us illustrate this with the example of the four-bit adder of section 3. Indeed, at a certain point we said: "We could select the last prim component as the next candidate to be implemented - which typically is a kind of default in top-down development. However, we have no intention of decomposing GND, so we permute some prim components, putting GND first." This was a typical execution of td_step2.
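A permutation step of this kind can be sketched as follows. The code is illustrative only: the relation eq_pp is one assumed reading of =pp (the same non-prim components in the same order and the same prim components up to their order), and moving one named prim component to the front is just one hypothetical way in which a developer might permute.

```python
from collections import Counter
from typing import Optional, Tuple

Component = Tuple[str, Optional[str]]   # (name, implementation); None stands for prim
Design = Tuple[Component, ...]


def eq_pp(d1: Design, d2: Design) -> bool:
    """Assumed reading of =pp: equal up to the order of the prim components."""
    non_prim1 = [c for c in d1 if c[1] is not None]
    non_prim2 = [c for c in d2 if c[1] is not None]
    prims1 = Counter(c for c in d1 if c[1] is None)
    prims2 = Counter(c for c in d2 if c[1] is None)
    return non_prim1 == non_prim2 and prims1 == prims2


def td_step2(d: Design, name: str) -> Design:
    """Move the prim component called name to the front of the design."""
    chosen = (name, None)
    if chosen not in d:
        raise KeyError("no prim component named " + name)
    d_new = (chosen,) + tuple(c for c in d if c != chosen)
    assert eq_pp(d_new, d)   # postcondition of the technique
    return d_new
```

Applied to a design whose prim components are 7400, GND and 7404 (in that order), td_step2(d, "GND") puts GND first and leaves everything else in place.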

It is possible to execute td_step by choosing between td_step1 and td_step2. Using the non-deterministic choice construct [] from the design-development language, we define td_step as

td_step := technique d
  def d' := td_step1(d); [] d' := td_step2(d);
      d'

Let us consider the total-correctness question for td, which is as follows: can every execution sequence which is according to the top-down model but which is not ready yet, be completed to a full execution sequence? In other words: is it the case that a top-down development process essentially never can get stuck? The answer is positive: every execution sequence can be completed. But the main reason for this is the reflexivity of ⊑. From a practical point of view, this answer is of little interest. As soon as executability considerations or efficiency considerations (at the product level) play a role, either 'thinking in advance' or 'backtracking' is needed.

It is interesting to note that as long as the developer works according to the top-down technique, it does not matter if he knows the difference between black-box correctness and glass-box correctness. This is formulated more precisely in [10].

Our description of the top-down technique should be considered as open-ended. One can think of techniques td_step3, td_step4, etc. For example, td_step3 could describe the possibility of back-tracking where components are removed and where non-prim components are transformed into prim components; td_step4 could describe the possibility of adding prim components which need not be 'part of' the system of the current design, but which happen to be present in db. A general form of the top-down technique could be based on a technique td_step given as follows.

td_step := technique d
  def [] i=1,2,...,n  d' := td_stepi(d);

d' []

Finally we consider a kind of completeness question: can every bbc design d be obtained by means of a top-down development? The obvious answer is no, because of the restriction in the top-down invariant that all components (except those of the given bottom) must be used in the system. This negative answer is not a disadvantage of td.

Now we turn our attention to bottom-up development. In a completely dual manner we could also formalise the bottom-up approach. Again in [10] this has been done completely formally. Here we restrict ourselves to showing the invariant and two remarks. We obtain the bottom-up invariant by taking again the postcondition POST as a starting point. POST consists of three conjuncts and a candidate invariant is obtained by simply omitting the second conjunct (as suggested in [19] section 16.2). This yields bot(d) = db and bbc(d), i.e. during the development process the bottom of the design remains constant and black-box correctness is adopted as a methodological principle. We need not strengthen this assertion by requiring that all components are built in terms of primitive components, since this is taken care of by the fact that we consider only well-formed designs.

Remark: the difference between (1) adopting glass-box correctness for the newly inserted components and (2) adopting black-box correctness for them is relevant. This means that there are two versions of the bottom-up strategy, where one yields a gbc design and the other a bbc design. This is a difference with respect to the top-down strategy.

As a second remark we consider a kind of completeness question: can every bbc design d be obtained by means of a bottom-up development? The answer is positive, for a given d is easily built up from a bottom design by means of a number of steps, adding components and system elements.

8 Design Partition

In this section we want to consider a number of possibilities for introducing parallel developments, where two or more designers work simultaneously on (parts of) a design. First of all, the locality principle of components can be exploited for parallel development. There is a very simple splitting to be done before the actual parallelism can start. Two distinct component names must be selected such that the corresponding components are non-prim. Then for each of these components a bbc-gb-mod can be performed, and the resulting components can be re-inserted into the original design.

A second possibility is to split a design d into two designs d1 and d2 such that d = d1 * d2. There is a splitting to be done before the actual parallelism can start. This is described by the technique 'split*' given below.

split* := technique d → d1, d2
  pre true
  post d1 * d2 = d
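As a minimal illustration of such a splitting, the following Python sketch models a design as a tuple of components and the operation * as tuple concatenation. Both modelling choices, as well as the cut position k standing for the designers' decision, are assumptions made for this example only.

```python
from typing import Optional, Tuple

Component = Tuple[str, Optional[str]]   # (name, implementation); None stands for prim
Design = Tuple[Component, ...]


def split_star(d: Design, k: int) -> Tuple[Design, Design]:
    """Split d into d1 and d2 at the (hypothetical) cut position k."""
    d1, d2 = d[:k], d[k:]
    assert d1 + d2 == d   # postcondition d1 * d2 = d, with '+' modelling '*'
    return d1, d2
```

Any cut position satisfies the postcondition, which mirrors the fact that the technique has precondition true and leaves the choice of d1 and d2 open.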

Based on the splitting of d into d1 and d2, it is possible to indicate which parts of the machine&user context are relevant for d1 and which parts of the machine&user context are relevant for d2. Of course in many practical situations it is not the case that the system user and the machine are formalised as designs. Nevertheless, in such situations it is probably still the case that certain parts of the machine&user context are relevant for d1 and that other parts of the machine&user context are relevant for d2.

A third possibility is to perform a splitting according to ∘. There is a splitting to be done before the actual parallelism can start. This is described by the technique 'split∘' given below.

split∘ := technique d → d1, d2
  pre true
  post d1 ∘ d2 = d

As before it is possible to indicate which parts of the machine&user context are relevant for d1 and which parts of the machine&user context are relevant for d2. This is relatively simple, due to the fact that we can view d1 ∘ d2 as a layered design with layers d1 and d2. We see directly that d1 must be validated with respect to top(d2) and the bottom of the system user, i.e. bot(ds). Similarly we see directly that d2 must be validated with respect to top(dm) and bot(d1). For a formalisation of the three ideas presented in this section we refer to [10].

9 Conclusions

It is remarkable that the top-down and bottom-up models of the development process arise in a systematic manner by applying (at the design program level) an approach which comes from the field of classical sequential programming. The possibilities for deriving invariants and design-programs describing design creation by this approach have by no means been exhausted in section 7. It is interesting to investigate other possibilities.

As an important point with respect to the applicability of the approach, we would like to explain here how designs are made available in COLD-K. There it is possible to denote designs, but the concrete syntax differs slightly from the syntax used in this paper. In particular, a COLD-K design begins with the keyword DESIGN and furthermore each component (except for a so-called abbreviation-type component) begins with the keyword COMP, where there are two options. The first option is that the component is of the form COMP x : Q, which corresponds to our x := prim ⊑ Q. The second option is that the component is of the form COMP x : Q := P, which corresponds to our x := P ⊑ Q. In COLD-K one must use a semicolon as a component separator and there is an obvious keyword SYSTEM. In this way it is clear that the approach worked out in this paper applies directly to designing in COLD-K.

With respect to future work we think that, first of all, the concepts investigated in this paper should be applied to more case studies. We can use the language COLD-K for such case studies. This is possible because COLD-K is based on an algebraic system (viz. the algebra CA of class descriptions) and has λπ for parameterisation and because it has components and designs as built-in language constructs [5]. A large example is given in [12], [13].

The approach taken in this paper belongs to the area of "programming-in-the-large". When applying the concepts presented here in practice, the amount of formal texts involved (designs with their associated specifications, implementations, abbreviations) tends to become very large. Therefore it is important to have tool support for the design transformations of this paper. The feasibility of this has been investigated in [20] where a configuration management system for COLD-K is developed.

10 Acknowledgements

The author wants to thank F.E.J. Kruseman Aretz, J.A. Bergstra, H.B.M. Jonkers, C.P.J. Koymans, J.H. Obbink, and G.R. Renardel de Lavalette for their contributions, their help and their cooperation on the subject of this paper. Special thanks go to R.J. Bril who carefully read an earlier version of this paper.

References

[1] M. Wirsing. Algebraic Specification. Report MIP 8914, Universität Passau, Fakultät für Mathematik und Informatik, Innstrasse 33, 8390 Passau.

[2] D. Bjørner, C.B. Jones (eds.). The Vienna development method: the meta-language. Springer-Verlag LNCS 61, ISBN 3-540-08760-4 (1978).

[3] C.B. Jones. Systematic software development using VDM. Prentice-Hall International, ISBN 0-13-880725-6 (1986).

[4] J.M. Spivey. Understanding Z, a specification language and its formal semantics. Cambridge Tracts in Theoretical Computer Science 3, ISBN 0-521-33429-2 (1988).

[5] L.M.G. Feijs, H.B.M. Jonkers, C.P.J. Koymans, G.R. Renardel de Lavalette. Formal definition of the design language COLD-K. Preliminary Edition, April 1987, ESPRIT document METEOR/t7/PRLE/7.

[6] J.A. Bergstra, J. Heering, P. Klint. Module algebra. CWI Report CS-R8617, May 1986.

[7] L.M.G. Feijs. A formalisation of design structures. Proceedings of Comp Euro 88 - system design: concepts, methods and tools pp. 214-229, Brussels, Belgium, April 11-14, 1988. IEEE Computer Society Press.

[8] L.M.G. Feijs. A formalisation of design structures. ESPRIT document METEOR/t7/PRLE/4.

[9] M. Wirsing. Algebraic description of reusable software components. Proceedings of Comp Euro 88 - system design: concepts, methods and tools, Brussels, Belgium, April 11-14, 1988, pp. 300-312, IEEE Computer Society Press.

[10] L.M.G. Feijs. Correctness-preserving transformations of designs. ESPRIT document METEOR/t8/PRLE/6.

[11] D. Winkel, F. Prosser. The art of digital design, an introduction to top-down design. Prentice Hall, Inc. ISBN 0-13-046607-7 (1980).

[12] L.M.G. Feijs. Formal specification of a text editor. ESPRIT document METEOR/t9/PRLE/3 (May 1989).

[13] L.M.G. Feijs. Systematic design of a text editor. ESPRIT document METEOR/t9/PRLE/4 (May 1989).

[14] M. Broy, P. Pepper. Program Development as a Formal Activity. IEEE Transactions on Software Engineering, Vol. SE-7, No 1, January 1981, 14-22.

[15] N.G. de Bruijn. Generalizing Automath by means of Lambda-typed Lambda Calculus. Proceedings of the Maryland 1984-1985 Special Year in Mathematical Logic and Theoretical Computer Science.

[16] B.W. Boehm. Software engineering economics. Prentice-Hall, INC., Englewood Cliffs, New Jersey 07632. ISBN 0-13-822122-7

[17] L.M.G. Feijs, J.H. Obbink. Process models: methods as programs. ESPRIT '85, Status report of continuing work, The commission of the European Communities (Editors), Elsevier Science Publishers B.V. (North-Holland), 577-591. (Nat. Lab. Manuscript NL 13.249).

[18] C.A.R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, Vol 12, Number 10, pp. 576-580 and p. 583, October 1969.

[19] D. Gries. The science of programming. Springer-Verlag New York, Heidelberg, Berlin. ISBN 0-387-90641-X.

[20] E.C. van Oijen. Configuration management for COLD-K. Master's thesis, Eindhoven University of Technology, Department of mathematics and computing science (August 1989).

PART III

COLD

Introduction

Norman's Database Modularised in COLD-K
  1 Introduction
  2 Norman's database in VDM
  3 Norman's database in COLD-K
  4 The 'challenge'
  5 Solution in COLD-K
  6 Discussion
  7 Acknowledgements
  References
  Appendix A. COLD-K standard library of data types

POLAR: A Picture-Oriented Language for Abstract Representations
  1 Introduction
  2 Informal presentation of POLAR
  3 The formal language COLD-K
  4 Formal definition of POLAR
  5 Tools, experiences and conclusions
  References
  Appendix A. POLAR in BNF
  Appendix B. POLAR in COLD
  Appendix B. POLAR in POLAR

Inheritance in COLD
  1 Introduction
  2 Concepts
  3 The basic mechanism
  4 Refinement of the mechanism
  5 Conclusion
  References

A Process Specification Formalism Based on Static COLD
  1 Introduction
  2 The COLD-S language
  3 PSF/C
  4 Semantics
  5 Examples
  6 Extensions
  7 Comparison of PSF/C with similar languages
  8 Conclusion
  9 References

Introduction

Part III is about COLD, which is the main outcome of METEOR. It embodies the idea of a "single linguistic framework", which has been one of the primary goals of the project. As it turned out, the project was much richer in nature than just aiming at a single linguistic framework. As shown in Parts II and IV, much research on requirements & design and on algebraic specification has been conducted which has a value of its own. Yet this richness has contributed to a setting where the goal of a single linguistic framework could be realised. Of course the idea of a single language that has absorbed all useful concepts is a myth. Instead, a wide-spectrum language must be based on a selection amongst the possible concepts. The key issue in the design of COLD is the fact that it integrates algebraic specification, modularisation, parameterisation and formalised stepwise design. During the METEOR project the development of COLD has been influenced by M. Wirsing's ASL and by J. Bergstra's module algebra. Numerous case studies have been done to compare the expressive power of COLD with other METEOR formalisms: PLUSS, RAP, ALGRES, ASF, PSF, ERAE. At the end of METEOR there is a fixed and stable kernel version COLD-K. Although it is primarily meant as a kernel language, it is a useful specification language in its own right. This has been shown in many case studies and two of them are contained in this volume. The first is Feijs' contribution "Norman's Database Modularised in COLD-K" which has been triggered by a challenge put forward by C.B. Jones during the workshop. Feijs explains Jones' challenge and as it turns out the METEOR project has answered it; COLD-K can be used to describe the NDB satisfying the meta requirements that were put forward by Jones. For the paper corresponding to Jones' presentation of the challenge and solution in a VDM style we refer to VDM '90.

The second case study using COLD-K concerns R. van den Bos' pictorial language POLAR. Graphical representations are a useful add-on to the support for formal methods. The POLAR approach aims at the graphical representation of module structures. Since the entire METEOR project was driven by the conviction that formal specification is useful, a formal specification of this new language is presented. In the paper "POLAR, A Picture-Oriented Language for Abstract Representations" by R. van den Bos, L. Feijs and R. van Ommering this bootstrapping process is described. COLD-K is used as a meta-language to formally interpret POLAR pictures as COLD-K texts.

This idea of using COLD-K to cope with new concepts reappears in the paper "Inheritance in COLD" by H. Jonkers. Rather than proposing a language extension, COLD-K is used to give an algebraic characterisation of the phenomenon of inheritance. In this way a precise comparison between COLD-K and traditional object-oriented languages (TOL) is made. At the same time it paves the way for extending COLD to include inheritance.

To add concurrency to COLD-K is quite non-trivial and there are several routes towards that goal. One route is explored in the paper "A Process Specification Formalism Based on Static COLD" by J. Baeten, J. Bergstra, S. Mauw and G. Veltink. An integration takes place of the concurrency theory ACP and the language COLD-K. The result is a formal specification language PSF/C. From COLD it gets data type specifications and modular structure with imports and exports. From ACP it gets the processes and their interaction.

Norman's Database Modularised in COLD-K

L.M.G. Feijs *

Abstract

In this paper we present a COLD specification of a database, originally presented in VDM, discussing various aspects of modularisation.


1 Introduction

At the METEOR workshop in September 1989 in Mierlo (that led to these proceedings) C.B. Jones presented a lecture entitled "Modularizing the Formal Description of a Database". Its abstract was as follows:

A simple database system will be introduced by a "flat" formal description. Various reasons for wanting modularised descriptions will be reviewed with re-use being identified as the most pressing need. A formal model will be sketched to raise the question of whether this can actually be written in any particular specification language.

Moreover, C.B. Jones put forward that he was wondering whether the modularisation could be done at all. In particular, the question was raised whether this could be done in COLD-K.

The author decided to accept the challenge and C.B. Jones provided copies of his slides which the author took as a starting point. This paper presents a solution of the problem, following closely Jones' starting points - however avoiding some of the apparent problems. Then a discussion by e-mail between C.B. Jones, J.S. Fitzgerald and the author led to the current paper. The original work of C.B. Jones and J.S. Fitzgerald has been presented in VDM '90 [1].

2 Norman's Database in VDM

The starting point for the specification exercise of this paper is a specification, originally formulated in VDM [2] and first presented by C.B. Jones. It can be found in [3], where a database system called NDB is described in various ways; the first of these descriptions is a 'flat' or single-level description. 'NDB' stood for 'Norman's Database' and was described in [4]; it eventually became an IBM product, named the 'Non-programmer Database' [5]. We reproduce the essentials of the flat VDM specification below. See also appendix A of [3]. We have split the flat VDM specification into three parts; the first part consists of the module header with its parameter list and export list together with a few type definitions. The second part defines the state space and includes the definition of invariants and initialisation conditions. The third part presents the definitions of four operations. Each part will be shown as a block of formal text which is preceded by an introductory text and followed by technical explanations, again as informal text.

*Philips Research Laboratories Eindhoven

We begin with the first part of the flat VDM specification which is called FLAT-NDB. It is parameterised with respect to the types Eid, Esetnm and Rnm (for Entity identifier, Entity set name, and Relation name respectively) which are left unspecified. There is an export list followed by the definition of four types called Maptp, Tuple, Rinf and Rkey.

Module FLAT-NDB
Parameters
  types Eid, Esetnm, Rnm : Triv
Exports
  operations ADDES, ADDENT, ADDREL, ADDTUP, ...
Definitions
  types
    Maptp = {1:1, 1:M, M:1, M:M}
    Tuple :: fv : Eid
             tv : Eid
    Rinf  :: tp : Maptp
             r  : Tuple-set
    Rkey  :: nm : [Rnm]
             fs : Esetnm
             ts : Esetnm

We add some informal explanation and intuition to the above formal description. Eid is a set of entity identifiers. In order to introduce typing constraints, entities are grouped into 'Entity Sets' whose names are in Esetnm. Entity sets do not have to be disjoint. The same entity identifier can belong to more than one set (e.g. an individual could be both a customer and an employee): later we shall see that the grouping therefore is stored as an entity-set map which is Esetnm -m-> Eid-set. Many relations can be stored in a database and, as one might expect, they can be named (Rnm). Next, let us explain the definitions of Maptp, Tuple, Rinf and Rkey in detail. The type Maptp (for Map type) is an enumerated data type taking the values 1:1, 1:M, M:1 and M:M. The type Tuple contains all pairs of Eid values and for a Tuple value t, its first field can be denoted as fv(t) (the 'from value') and similarly its second field can be denoted as tv(t) (the 'to value'). The type Rinf is about so-called Relation-information. Most of the typing information will be captured in the relation key and therefore the 'Relation Information' (Rinf) is little more than a set of tuples. What is extra is information about the 'Mapping type' (Maptp) which records whether the relation is restricted to be one-to-one (1:1), one-to-many (1:M), etc. An Rinf value r has a mapping type denoted as tp(r) and if for example tp(r) = 1:1, the relation must be one-to-one. An Rinf value r has a set of tuples denoted as r(r). The type Rkey (for Relation key) contains triples with an optional Rnm field (for Relation name) and two entity-set names. For an Rkey value rk, its relation name is denoted as nm(rk) and its entity-set names are denoted as fs(rk) (for 'from set') and ts(rk) (for 'to set'). For example the triple mk-Rkey(OWNS, PERSON, CAR) could be the Rkey value for a car pool administration where cars are owned by persons. This concludes the technical explanations of the above block of formal VDM text.
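For readers more familiar with programming languages than with VDM, the four types can be approximated by the following Python sketch. It is not part of the VDM specification: the parameter types Eid, Esetnm and Rnm are left unspecified in FLAT-NDB and are represented here by strings purely as an assumption, and the class name NdbTuple is a hypothetical name chosen to avoid a clash with Python's own tuple type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Maptp(Enum):
    """Mapping type of a relation."""
    ONE_ONE = "1:1"
    ONE_MANY = "1:M"
    MANY_ONE = "M:1"
    MANY_MANY = "M:M"


@dataclass(frozen=True)
class NdbTuple:
    fv: str   # 'from value' (an Eid, assumed to be a string here)
    tv: str   # 'to value'


@dataclass(frozen=True)
class Rinf:
    tp: Maptp                 # mapping type
    r: FrozenSet[NdbTuple]    # the tuples of the relation


@dataclass(frozen=True)
class Rkey:
    nm: Optional[str]   # optional relation name (Rnm)
    fs: str             # 'from set' (Esetnm)
    ts: str             # 'to set' (Esetnm)
```

In this rendering the car pool example reads Rkey("OWNS", "PERSON", "CAR"), and the field selectors fv, tv, tp, r, nm, fs and ts become attribute accesses.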

Next we proceed with the second part of the flat VDM specification covering the definition of the state space (keyword 'state'), the definition of an invariant (keyword 'inv') and the initialisation condition (keyword 'init').

state Ndb :: esm : Esetnm -m-> Eid-set
             rm  : Rkey -m-> Rinf

inv(mk-Ndb(esm, rm)) ≜
  ∀rk ∈ dom rm ·
    let mk-Rkey(nm, fs, ts) = rk in
    let mk-Rinf(tp, r) = rm(rk) in
    {fs, ts} ⊆ dom esm ∧
    (tp = 1:M ⇒ ∀t1,t2 ∈ r · tv(t1) = tv(t2) ⇒ fv(t1) = fv(t2)) ∧
    (tp = M:1 ⇒ ∀t1,t2 ∈ r · fv(t1) = fv(t2) ⇒ tv(t1) = tv(t2)) ∧
    (tp = 1:1 ⇒ ∀t1,t2 ∈ r · fv(t1) = fv(t2) ⇔ tv(t1) = tv(t2)) ∧
    ∀mk-Tuple(fv, tv) ∈ r · fv ∈ esm(fs) ∧ tv ∈ esm(ts)

init(ndb) ≜ ndb = mk-Ndb({}, {})

Let us explain the definition of the state space, the invariant and the initialisation condition. As usual, the VDM specification is built around a state; the state space of the database system consists of two variables called esm and rm. The esm (for entity set map) part of the state space is a mapping from entity-set names to sets of entity identifiers. This can be viewed as a kind of dynamic type system where the entity-set names act as 'types'. The rm (for relation map) part of the state space is a mapping from relation keys to Rinf values. Note that there is a kind of overloading in the sense that a name can be used for more than one relation (e.g. the name 'OWNS' might relate departments to machinery and people to cars). The invariant essentially consists of five clauses, with common quantification and 'let' constructs. The first clause is {fs, ts} ⊆ dom esm which states that the entity-set names (acting as types) of the relation keys involved must be known to the dynamic type system esm. The second, third and fourth clauses state that the contents of the actual relation r must satisfy the constraints of the map type tp stored for it. For example if tp = 1:1, the relation r must be a one-to-one relation. The last clause of the invariant is ∀mk-Tuple(fv, tv) ∈ r · fv ∈ esm(fs) ∧ tv ∈ esm(ts) which states that all values fv and tv stored in some relation r must satisfy the type check induced by the entity-set names of the 'from set' and the 'to set' in the relation key of r. Finally let us explain the initialisation condition: initially the database system does not know any entity-set names and there are no relations stored yet.
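The five clauses of the invariant can be checked mechanically. The following Python function is a sketch of such a check, not a translation sanctioned by the VDM text: it assumes that esm is represented as a dictionary from entity-set names to sets of identifiers, that relation keys are triples (nm, fs, ts), and that Rinf values are pairs of a mapping type (one of the strings "1:1", "1:M", "M:1", "M:M") and a frozen set of (fv, tv) pairs.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Rkey = Tuple[Optional[str], str, str]             # (nm, fs, ts)
Rinf = Tuple[str, FrozenSet[Tuple[str, str]]]     # (tp, set of (fv, tv))


def inv_ndb(esm: Dict[str, Set[str]], rm: Dict[Rkey, Rinf]) -> bool:
    """Return True if the state (esm, rm) satisfies the five invariant clauses."""
    for (_nm, fs, ts), (tp, r) in rm.items():
        # Clause 1: both entity-set names are known to the type system esm.
        if not {fs, ts} <= set(esm):
            return False
        # Clauses 2-4: the tuples respect the mapping type.
        for fv1, tv1 in r:
            for fv2, tv2 in r:
                if tp == "1:M" and tv1 == tv2 and fv1 != fv2:
                    return False
                if tp == "M:1" and fv1 == fv2 and tv1 != tv2:
                    return False
                if tp == "1:1" and (fv1 == fv2) != (tv1 == tv2):
                    return False
        # Clause 5: every stored value belongs to the right entity set.
        if not all(fv in esm[fs] and tv in esm[ts] for fv, tv in r):
            return False
    return True
```

The initial state, inv_ndb({}, {}), trivially satisfies the invariant because there are no relation keys to check.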

Next we proceed with the third and last part of the flat VDM specification introducing the operations which add entity-sets (ADDES), add entities (ADDENT), add relations (ADDREL) or add tuples (ADDTUP).

operations

ADDES(es: Esetnm)
ext wr esm : Esetnm -m-> Eid-set
pre es ∉ dom esm
post esm = ↼esm ∪ {es ↦ {}}

ADDENT(es: Esetnm, eid: Eid)
ext wr esm : Esetnm -m-> Eid-set
pre es ∈ dom esm
post esm = ↼esm † {es ↦ ↼esm(es) ∪ {eid}}

ADDREL(rk: Rkey, tp: Maptp)
ext rd esm : Esetnm -m-> Eid-set
    wr rm  : Rkey -m-> Rinf
pre {fs(rk), ts(rk)} ⊆ dom esm ∧
    rk ∉ dom rm
post rm = ↼rm ∪ {rk ↦ mk-Rinf(tp, {})}

ADDTUP(fval, tval: Eid, rk: Rkey)
ext wr rm  : Rkey -m-> Rinf
    rd esm : Esetnm -m-> Eid-set
pre rk ∈ dom rm ∧
    let ri = μ(rm(rk), r ↦ r(rm(rk)) ∪ {mk-Tuple(fval, tval)}) in
    inv-Ndb(mk-Ndb(esm, rm † {rk ↦ ri}))
post let ri = μ(↼rm(rk), r ↦ r(↼rm(rk)) ∪ {mk-Tuple(fval, tval)}) in
     rm = ↼rm † {rk ↦ ri}

End FLAT-NDB

Let us explain the above operations now. ADDES has modification rights (keyword 'wr' for write) with respect to the esm part of the state space. There is a precondition which states that the entity-set name es to be added may not be known to the dynamic type system esm yet. The postcondition states that after execution of ADDES(es) the new esm value equals the old esm value (denoted by ↼esm) where the maplet {es ↦ {}} has been added. So the newly added entity set name es has the empty set associated with it. The ADDENT operation again has modification rights with respect to the esm part of the state space. There is a precondition which states that the entity-set name es for which an entity eid is to be added must be known to the dynamic type system esm. The postcondition states that after execution of ADDENT(es,eid) the new esm value equals the old esm value (denoted by ↼esm) where the entry for esm(es) is modified by means of a kind of 'overwriting insert' operation (denoted by †). The new value for esm(es) is ↼esm(es) ∪ {eid}, i.e. eid has been added. The ADDREL operation has read access (keyword 'rd' for read) to the esm part of the state space and it has modification rights with respect to the rm part of the state space. There is a precondition which states that the entity-set names (acting as types) in the proposed relation key rk must be known to the dynamic type system esm and that this rk must be fresh. ADDREL(rk,tp) stores the map type tp and makes the newly created relation empty. Finally ADDTUP serves to add a tuple to an existing relation. To add a tuple mk-Tuple(fval,tval) to a relation with key rk involves a complicated precondition check. The first part of the precondition is rk ∈ dom rm, stating that the relation key must be known. The second part of the precondition is let ri = . . . in inv-Ndb(mk-Ndb(esm, rm † {rk ↦ ri})), stating that addition of the new tuple may not violate the invariant. Here ri denotes the new Rinf value that would be obtained by modifying (operator μ) the tuple-set part of rm(rk) - the tuple-set part being addressed by field selector r. Note that r(rm(rk)) is the set of tuples for rk and that r(rm(rk)) ∪ {mk-Tuple(fval, tval)} is the set of tuples obtained by adding the new tuple. Of course the postcondition of ADDTUP states that the new ri value (containing the newly added tuple) is stored in the database system.
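The pre- and postconditions of the four operations can be read operationally. The following Python sketch does so under assumptions of its own: each operation is written as a function that takes the old state components and returns the new one (so the hooked 'old' values are simply the arguments), esm and rm are dictionaries, relation keys are triples (nm, fs, ts), Rinf values are pairs (tp, frozen set of (fv, tv) pairs), and the invariant is passed in as a parameter inv. The helper require and all function names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Optional, Set, Tuple

Rkey = Tuple[Optional[str], str, str]             # (nm, fs, ts)
Rinf = Tuple[str, FrozenSet[Tuple[str, str]]]     # (tp, set of (fv, tv))
Esm = Dict[str, Set[str]]
Rm = Dict[Rkey, Rinf]


def require(condition: bool, message: str) -> None:
    """Raise ValueError when a precondition does not hold."""
    if not condition:
        raise ValueError("precondition violated: " + message)


def add_es(esm: Esm, es: str) -> Esm:
    require(es not in esm, "es must not be in dom esm")
    return {**esm, es: set()}                 # old esm plus maplet es -> {}


def add_ent(esm: Esm, es: str, eid: str) -> Esm:
    require(es in esm, "es must be in dom esm")
    return {**esm, es: esm[es] | {eid}}       # overwrite the entry for es


def add_rel(esm: Esm, rm: Rm, rk: Rkey, tp: str) -> Rm:
    _nm, fs, ts = rk
    require({fs, ts} <= set(esm), "fs and ts must be in dom esm")
    require(rk not in rm, "rk must be fresh")
    return {**rm, rk: (tp, frozenset())}      # new, empty relation


def add_tup(esm: Esm, rm: Rm, fval: str, tval: str, rk: Rkey,
            inv: Callable[[Esm, Rm], bool]) -> Rm:
    require(rk in rm, "rk must be in dom rm")
    tp, r = rm[rk]
    ri = (tp, r | {(fval, tval)})             # Rinf with the tuple added
    new_rm = {**rm, rk: ri}
    require(inv(esm, new_rm), "the new state must satisfy the invariant")
    return new_rm
```

Passing the invariant as a parameter keeps the sketch short; any predicate over (esm, rm) can be supplied, for instance one that checks only the typing clause.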

We left out some of the deletion operations DELES, DELENT, DELREL and DELTUP, which could be added easily but which add no additional relevant information to serve the discussion of this paper. For similar reasons we have simplified matters by leaving out the association of values to entity identifiers.

3 Norman's Database in COLD-K

Before addressing the modularisation issues raised by Jones, we first give an almost straightforward version of FLAT-NDB in COLD-K. We have split this COLD-K specification into three parts. The first part consists of the specification of the parameters, a number of module instantiations to introduce data types like Tuple, Tuple_set etc., as well as an import and export list. The second part consists of the definition of the state space, including the definition of invariants and initialisation conditions. The third part presents the definitions of the operations, here called 'procedures'. Just as before, each part will be shown as a block of formal text which is preceded by an introduction and followed by technical explanations.

We begin with some general explanations concerning the language. The specification below is in COLD-K [6], which is a kernel language and which has no built-in data types and no syntactic sugar. Therefore the specification below should not simply be judged on its length (indeed it is not extremely compact) nor on the absence of nice symbols. Nor should it be judged on the clumsy module-instantiation expressions (APPLY APPLY RENAME etc.) which are needed to get maps and sets. User-oriented versions which are better in that respect are on their way.


It should be emphasised that essentially complete descriptions of the case study in VDM were provided by C.B. Jones and J.S. Fitzgerald. From a scientific point of view it was very important that a number of questions on it were asked (by C.B. Jones). In fact the example is a very attractive one for this kind of investigation and exploring it in more than one formalism is certainly worthwhile. For most parts of the COLD-K text below there is no claim of originality, because the VDM specifications have been used; things have simply been copied mutatis mutandis to COLD-K.

The modularisation constructs employed below take the form of algebraic operators at the level of modules, see [7] and [8]. In particular, 'import' is a binary operator on modules; to import M1 into M2, one writes IMPORT M1 INTO M2. Also 'export' is a binary operator taking a signature and a module to yield another module; to export $\Sigma$ from M, one writes EXPORT $\Sigma$ FROM M. Finally 'renaming' is an operator taking a mapping from names to names and a module to yield another module; to apply the mapping $\rho$ to a module M one writes RENAME $\rho$ IN M.

Parameterisation is based on the $\lambda\pi$-calculus [9]; this approach is uniform in the sense that module expressions have formal parameters ranging over modules. Each $\lambda$-abstraction has a parameter restriction which is a module again. To abstract from M(x) with parameter restriction R one writes LAMBDA x : R OF M(x). To apply a parameterised module P to an actual parameter A one writes APPLY P TO A, where of course A must satisfy the parameter restriction of P.
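To give some intuition for these module operators, the following Python sketch models a module very crudely as a dictionary from names to kinds (SORT, FUNC, PRED). This is a toy model under our own assumptions, not the semantics of COLD-K: axioms are ignored, 'satisfying the parameter restriction' is reduced to providing the required names, and the class Param and the functions import_into, export_from, rename_in and apply_to are hypothetical names chosen for the illustration.

```python
class Param:
    """Toy stand-in for LAMBDA x : R OF M(x): a restriction plus a body."""
    def __init__(self, restriction, body):
        self.restriction = set(restriction)  # names the actual parameter must provide
        self.body = body                     # function from actual module to module

def import_into(m1, m2):
    """IMPORT M1 INTO M2: the names of both modules are available."""
    return {**m1, **m2}

def export_from(sig, m):
    """EXPORT sig FROM M: keep only the names listed in the signature."""
    missing = set(sig) - set(m)
    if missing:
        raise ValueError(f"signature mentions unknown names: {sorted(missing)}")
    return {name: m[name] for name in sig}

def apply_to(p, a):
    """APPLY P TO A: A must satisfy (here: provide the names of) P's restriction."""
    missing = p.restriction - set(a)
    if missing:
        raise ValueError(f"actual parameter lacks: {sorted(missing)}")
    return p.body(a)

def rename_in(rho, m):
    """RENAME rho IN M, for plain and for parameterised modules (rho assumed injective)."""
    if isinstance(m, Param):
        inverse = {new: old for old, new in rho.items()}
        return Param({rho.get(n, n) for n in m.restriction},
                     lambda a: rename_in(rho, apply_to(m, rename_in(inverse, a))))
    return {rho.get(name, name): kind for name, kind in m.items()}

# A set-like standard module, parameterised over a module offering a sort Item.
SET_SPEC = Param({"Item"},
                 lambda a: import_into(a, {"Set": "SORT", "ins": "FUNC", "is_in": "PRED"}))

X = {"Eid": "SORT", "Esetnm": "SORT", "Rnm": "SORT"}
EID_SET = apply_to(rename_in({"Item": "Eid", "Set": "Eid_set"}, SET_SPEC), X)
```

The last line follows the shape APPLY RENAME ... IN SET_SPEC TO X: the renaming is applied to the parameterised module first, so that its restriction asks for Eid instead of Item, and only then is the actual parameter supplied.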

The first part of the COLD-K text below begins with an auxiliary class description called PARAMS, introducing the sorts (= types) Eid, Esetnm and Rnm. After that the definition of FLAT_NDB begins. There are a number of local module instantiations called TUPLE, TUPLE_SET etc. to introduce data types like Tuple and Tuple_set. After that there is an export list and an import list and we conclude the first part with the keyword CLASS which opens the main module, the details of which will follow in the second and third part.

LET PARAMS :=
CLASS
    SORT Eid
    SORT Esetnm
    SORT Rnm
END;

LET FLAT_NDB := LAMBDA X : PARAMS OF

LET TUPLE := APPLY APPLY RENAME
    SORT Item1 TO Eid,
    SORT Item2 TO Eid,
    SORT Tup TO Tuple,
    FUNC proj1: Tup -> Item1 TO fv,
    FUNC proj2: Tup -> Item2 TO tv,
    FUNC tup: Item1 # Item2 -> Tup TO mk_Tuple
  IN TUP2_SPEC TO X TO X;


LET TUPLE_SET := APPLY RENAME
    SORT Item TO Tuple,
    SORT Set TO Tuple_set
  IN SET_SPEC TO TUPLE;

LET MAPTP := RENAME
    SORT Item TO Maptp,
    FUNC a: -> Item TO I_I,
    FUNC b: -> Item TO I_M,
    FUNC c: -> Item TO M_I,
    FUNC d: -> Item TO M_M
  IN ENUM4;

LET EID_SET := APPLY RENAME
    SORT Item TO Eid,
    SORT Set TO Eid_set
  IN SET_SPEC TO X;

LET RKEY := APPLY APPLY APPLY RENAME
    SORT Item1 TO Rnm,
    SORT Item2 TO Esetnm,
    SORT Item3 TO Esetnm,
    SORT Tup TO Rkey,
    FUNC proj1: Tup -> Item1 TO nm,
    FUNC proj2: Tup -> Item2 TO fs,
    FUNC proj3: Tup -> Item3 TO ts,
    FUNC tup : Item1 # Item2 # Item3 -> Tup TO mk_Rkey
  IN TUP3_SPEC TO X TO X TO X;

EXPORT
    SORT Eid,
    SORT Rkey,
    SORT Maptp,
    SORT Esetnm,
    PROC ADDES: Esetnm ->,
    PROC ADDENT: Esetnm # Eid ->,
    PROC ADDREL: Rkey # Maptp ->,
    PROC ADDTUP: Eid # Eid # Rkey ->
FROM
IMPORT X INTO
IMPORT TUPLE INTO
IMPORT TUPLE_SET INTO
IMPORT MAPTP INTO
IMPORT EID_SET INTO
IMPORT RKEY INTO
CLASS

We add some informal explanation and intuition to the above formal description. First of all there is a basic module called PARAMS which is of the form CLASS ... END. It introduces the sorts Eid, Esetnm and Rnm, which are left unspecified. This basic module serves as a parameter restriction. In particular, it occurs in a clause LAMBDA X : PARAMS OF ... which means that any actual parameter module to be provided for X must at least provide three sorts with the names Eid, Esetnm and Rnm.

FLAT_NDB begins immediately after the definition of PARAMS and it can be seen that FLAT_NDB is a parameterised description: the LAMBDA X : PARAMS OF clause opens a scope where X is assumed to provide the sorts Eid etc. After that there are five local module instantiations called TUPLE, TUPLE_SET, MAPTP, EID_SET and RKEY. They use standard modules which are not worked out in this paper but which are summarised in the appendix. The standard modules come from a standard library and they cover data types which in VDM are built-in. For example TUP2_SPEC is such a standard module introducing the sort Tup of 2-tuples (pairs) with constructor function tup: Item1 # Item2 -> Tup and projection functions proj1 and proj2. It is a parameterised class description, with two LAMBDA parameters. Here we need an instantiation of this TUP2_SPEC and since it has two parameters we also need two actual parameters. This explains why we must write LET TUPLE := APPLY APPLY ... TO X TO X. Furthermore we need a RENAME clause to make the names of TUP2_SPEC fit with respect to the names in X. In particular TUP2_SPEC expects sorts Item1 and Item2 whereas X offers (amongst others) the sort Eid. Therefore things in TUP2_SPEC get renamed, which is written as RENAME ... IN TUP2_SPEC where at the place of the dots we have the actual renaming (= mapping from names to names). Item1 is mapped to Eid and Item2 is mapped to Eid, which is needed by way of fitting morphism. Furthermore Tup is mapped to Tuple, proj1 is mapped to fv, proj2 is mapped to tv and finally tup is mapped to mk_Tuple, following the VDM tradition. The effect of all this is that TUPLE is a module introducing the sort Tuple of 2-tuples (pairs) with constructor mk_Tuple: Eid # Eid -> Tuple and projection functions fv and tv. Similarly TUPLE_SET introduces the sort Tuple_set whose objects are finite sets of tuples. MAPTP introduces the sort Maptp with values I_I, I_M, M_I and M_M. EID_SET introduces the sort Eid_set of finite sets of entity identifiers. RKEY introduces the sort Rkey whose objects are triples consisting of an Rnm object and two Esetnm values; its projection functions are called nm, fs and ts whereas the constructor function is called mk_Rkey, again following the VDM tradition. We did not model the first field of Rkey as optional; modelling optionals is easier in VDM than in COLD-K. All these special modules are imported into the main module and so is the formal module parameter X. The export list mentions the procedures ADDES, ADDENT, ADDREL and ADDTUP; because an export list must be a complete signature it is necessary to mention the sorts involved as well: Eid, Rkey, Maptp and Esetnm.

Next we proceed with the second part of the COLD-K specification, covering the definition of the variable functions (keywords VAR and FUNC) spanning the state space, the definition of the invariant assertions, which are just predicates (keyword PRED) called INV0, INV1 etc., and the initialisation conditions (keyword INIT).

FUNC esm: Esetnm -> Eid_set VAR

FUNC tp_rm: Rkey -> Maptp VAR

FUNC r_rm: Rkey -> Tuple_set VAR


PRED INV0: DEF FORALL rk:Rkey
    ( tp_rm(rk)! <=> r_rm(rk)! )

PRED INV1: DEF FORALL rk:Rkey
    ( tp_rm(rk)! => ( esm(fs(rk))! AND esm(ts(rk))! ) )

PRED INV2: DEF FORALL rk:Rkey
    ( tp_rm(rk)! AND r_rm(rk)! =>
      ( LET tp: Maptp; tp := tp_rm(rk);
        LET r: Tuple_set; r := r_rm(rk);
        tp = I_M =>
          FORALL t1:Tuple,t2:Tuple (is_in(t1,r) AND is_in(t2,r) =>
            (tv(t1) = tv(t2) => fv(t1) = fv(t2) ) );
        tp = M_I =>
          FORALL t1:Tuple,t2:Tuple (is_in(t1,r) AND is_in(t2,r) =>
            (fv(t1) = fv(t2) => tv(t1) = tv(t2) ) );
        tp = I_I =>
          FORALL t1:Tuple,t2:Tuple (is_in(t1,r) AND is_in(t2,r) =>
            (fv(t1) = fv(t2) <=> tv(t1) = tv(t2) ) ) ) )

PRED INV3: DEF FORALL rk:Rkey
    ( r_rm(rk)! =>
      ( LET r: Tuple_set; r := r_rm(rk);
        FORALL fv:Eid, tv:Eid (is_in(mk_Tuple(fv,tv),r) =>
          (is_in(fv,esm(fs(rk))) AND is_in(tv,esm(ts(rk))) ) ) ) )

AXIOM INIT =>
    ( FORALL s:Esetnm ( NOT esm(s)! );
      FORALL rk:Rkey ( NOT tp_rm(rk)! );
      FORALL rk:Rkey ( NOT r_rm(rk)! ) )

Let us explain the definition of the state space, the invariant and the initialisation condition. Just as in VDM, a COLD-K specification can be built around a state; the state space of the database system consists of three variables called esm, tp_rm and r_rm. These functions have been marked VAR which means that they may be subject to modifications due to procedure calls. The function esm maps entity-set names to sets of entity identifiers; as before, it embodies a kind of dynamic type system. We have split the rm variable that was present in the VDM specification. This splitting gives rise to two variables called tp_rm and r_rm. The splitting is not a direct consequence of changing from VDM to COLD-K but it has been done because we feel that the mapping type and the actual relation are conceptually distinct things. The distinction is related to the distinction between a database schema and the database contents. Therefore we like to store them in distinct parts of the state space. The function tp_rm maps keys to mapping types and the function r_rm maps keys to sets of tuples. This splitting gives rise to an additional invariant stating that tp_rm and r_rm have the same domains. This invariant is described as a predicate INV0, which is just a predicate whose definition refers to tp_rm and r_rm. COLD-K has no built-in notion of an invariant; instead, its built-in dynamic logic is powerful enough to express the invariance of INV0 under ADDES, ADDENT etc. Let us have a closer look at the definition of INV0. Its defining assertion (DEF clause) is that for all relation keys the following holds: ( tp_rm(rk)! <=> r_rm(rk)! ). The exclamation mark ('!') denotes definedness, which is a built-in notion of COLD-K. So INV0 just says: 'tp_rm(rk) is defined iff r_rm(rk) is'. Next we discuss INV1. It is similar to the first clause ($\{fs, ts\} \subseteq \mathbf{dom}\ esm$) of the invariant in VDM. The main difference is that we use definedness rather than explicit reference to 'dom esm'. Of course the latter could also be done in COLD-K, but we employ a slightly different style; an advantage of this style is that it shows how COLD-K can have variables which are functions, rather than just nullary variables of complex data types (maps). To explain INV2 it suffices to say that it essentially consists of three clauses, corresponding with the second, third and fourth clause of the VDM invariant. It states that the contents of each actual relation r_rm(rk) must satisfy the constraints of the map type tp_rm(rk) stored for it. INV3 states that all values fv and tv must satisfy the type check induced by the entity-set names of the corresponding 'from set' and 'to set'. Finally let us explain the initialisation condition: COLD-K has a built-in predicate INIT which holds in the initial state only. The initialisation axiom contains three clauses, one for each variable.
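The reading of the four invariants can be made concrete with a small executable sketch. In the Python fragment below, a variable function such as esm is modelled as a dictionary, and definedness ('!') as membership of the dictionary's key set; both choices, as well as the function names inv0 to inv3 and the encoding of Maptp values as strings, are assumptions made for the illustration only.

```python
def inv0(tp_rm, r_rm):
    """tp_rm(rk) is defined iff r_rm(rk) is: both maps have the same keys."""
    return tp_rm.keys() == r_rm.keys()

def inv1(esm, tp_rm):
    """For every known relation key, the from-set and to-set names are known."""
    return all(fs in esm and ts in esm for (_, fs, ts) in tp_rm)

def inv2(tp_rm, r_rm):
    """The tuples of each relation respect the map type stored for it."""
    for rk in tp_rm.keys() & r_rm.keys():
        tp, r = tp_rm[rk], r_rm[rk]
        for (f1, t1) in r:
            for (f2, t2) in r:
                if tp == "I_M" and t1 == t2 and f1 != f2:
                    return False  # equal to-values must have equal from-values
                if tp == "M_I" and f1 == f2 and t1 != t2:
                    return False  # equal from-values must have equal to-values
                if tp == "I_I" and (f1 == f2) != (t1 == t2):
                    return False  # from-values agree iff to-values agree
    return True

def inv3(esm, r_rm):
    """Every stored value belongs to the entity set named in the relation key."""
    return all(fv in esm.get(fs, frozenset()) and tv in esm.get(ts, frozenset())
               for (_, fs, ts), r in r_rm.items()
               for (fv, tv) in r)
```

In the initial state all three dictionaries are empty, so each of the four functions returns True; this corresponds to the observation that the initialisation axiom establishes the invariants.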

Below we give the third part of FLAT_NDB presenting the definitions of the procedures. The main procedures are ADDES, ADDENT, ADDREL and ADDTUP and furthermore there are a number of auxiliaries called ADDREL', ADDREL'' and ADDTUP' (the quotes are just part of the identifiers).

PROC ADDES: Esetnm ->
MOD esm

PRED pre_ADDES: Esetnm
PAR es:Esetnm
DEF NOT esm(es)!

AXIOM FORALL es:Esetnm
    (pre_ADDES(es) =>
      [ ADDES(es) ]
      (esm(es) = empty AND
       FORALL other:Esetnm,e:Eid_set
       ( NOT other = es =>
         (esm(other) = e <=> PREV esm(other) = e ))))

PROC ADDENT: Esetnm # Eid ->
MOD esm

PRED pre_ADDENT: Esetnm # Eid
PAR es:Esetnm, eid:Eid
DEF esm(es)!


AXIOM FORALL es:Esetnm, eid:Eid
    (pre_ADDENT(es,eid) =>
      [ ADDENT(es,eid) ]
      (esm(es) = ins(eid,PREV esm(es)) AND
       FORALL other:Esetnm,e:Eid_set
       ( NOT other = es =>
         (esm(other) = e <=> PREV esm(other) = e ))))

PROC ADDREL': Rkey # Maptp ->
MOD tp_rm

PRED pre_ADDREL': Rkey # Maptp
PAR rk:Rkey, mtp:Maptp
DEF esm(fs(rk))! AND esm(ts(rk))! AND NOT tp_rm(rk)!

AXIOM FORALL rk:Rkey, mtp:Maptp
    (pre_ADDREL'(rk,mtp) =>
      [ ADDREL'(rk,mtp) ]
      (tp_rm(rk) = mtp AND
       FORALL other:Rkey, m:Maptp
       ( NOT other = rk =>
         (tp_rm(other) = m <=> PREV tp_rm(other) = m ) ) ) )

PROC ADDREL'': Rkey ->
MOD r_rm

PRED pre_ADDREL'': Rkey
PAR rk:Rkey
DEF NOT r_rm(rk)!

AXIOM FORALL rk:Rkey
    (pre_ADDREL''(rk) =>
      [ ADDREL''(rk) ]
      (r_rm(rk) = empty AND
       FORALL other:Rkey, r:Tuple_set
       ( NOT other = rk =>
         (r_rm(other) = r <=> PREV r_rm(other) = r )) ) )

PROC ADDREL: Rkey # Maptp ->
PAR rk:Rkey, mtp:Maptp
DEF ( ADDREL'(rk,mtp)
    ; ADDREL''(rk) )

PRED pre_ADDREL: Rkey # Maptp
PAR rk:Rkey, mtp:Maptp
DEF pre_ADDREL'(rk,mtp)

PROC ADDTUP': Eid # Eid # Rkey ->
MOD r_rm

AXIOM FORALL fval:Eid, tval:Eid, rk:Rkey
    ( r_rm(rk)! =>
      [ ADDTUP'(fval,tval,rk) ]
      (r_rm(rk) = ins(mk_Tuple(fval,tval),PREV r_rm(rk)) AND
       FORALL other:Rkey, r:Tuple_set
       ( NOT other = rk =>
         (r_rm(other) = r <=> PREV r_rm(other) = r ) ) ) )

PRED pre_ADDTUP: Eid # Eid # Rkey
PAR fval:Eid, tval:Eid, rk:Rkey
DEF tp_rm(rk)! AND r_rm(rk)! AND
    [ ADDTUP'(fval,tval,rk) ] INV2 AND INV3

PROC ADDTUP: Eid # Eid # Rkey ->
PAR fval:Eid, tval:Eid, rk:Rkey
DEF pre_ADDTUP(fval,tval,rk)?;
    ADDTUP'(fval,tval,rk)

END {FLAT_NDB}

Let us explain the above operations now. The ADDES operation has modification rights (keyword MOD) to the esm part of the state space. There is a precondition predicate pre_ADDES. After that there is an axiom which uses the always assertion, also called box assertion, which is derived from Harel's dynamic logic. The main notational device used for this is a pair of square brackets. Although it may look quite innocent, this notation embodies a very powerful and non-trivial expression mechanism. It is called an 'always assertion' because [ ADDES(es) ] A can be read as 'always after ADDES(es) we have that A holds'. The assertion after the square brackets should be viewed as a postcondition. We restrict ourselves to statements of partial correctness here, although COLD-K has expression means for certain forms of termination as well. The postcondition of ADDES(es) states that esm(es) = empty and that all other entity-set names (other) are known to the type system iff they were known in the previous state and, if they were known, they have the same set (e). Reading the clause FORALL other:Esetnm,e:Eid_set ... is rather subtle and depends on the built-in rules relating equality and definedness ('=' is strict with respect to '!'). It should be mentioned that the PREV operator is similar to the hooking operator of VDM, except that it may be applied to complete assertions and expressions. ADDENT has modification rights with respect to esm. There is a precondition predicate pre_ADDENT. The postcondition of ADDENT(es,eid) states that eid is inserted into the set PREV esm(es), where we use the ins operator on sets, which comes from the standard library (via EID_SET which is imported). The definition of ADDREL proceeds in two steps. The first step concerns two auxiliaries called ADDREL' and ADDREL'' which serve to update tp_rm and r_rm respectively. The second step is to define ADDREL(rk,mtp) as the sequential composition of ADDREL'(rk,mtp) and ADDREL''(rk). The precondition of ADDREL is just the precondition of ADDREL'. Also the definition of ADDTUP proceeds along the lines of first defining an auxiliary. The auxiliary is called ADDTUP' and the effect of ADDTUP'(fval,tval,rk) is to add mk_Tuple(fval,tval) to the relation addressed by rk without worrying about any constraints. It has modification rights with respect to the database contents (the actual relation), which is modelled by r_rm. This ADDTUP' is used to specify the precondition of the real ADDTUP, which of course may not violate the invariant. ADDTUP' does not change the domains of tp_rm and r_rm (cf. INV0) and because of the allocation of modification rights, there is no danger of ADDTUP violating INV1. pre_ADDTUP(fval,tval,rk) is defined to hold for all states where ADDTUP' can be done without violating INV2 and INV3; Harel's box assertion here expresses the 'weakest liberal precondition' of the program ADDTUP'(fval,tval,rk) with respect to INV2 and INV3. Of course, operationally viewed, pre_ADDTUP does not really execute ADDTUP'; in fact predicates cannot even have side effects! Instead, pre_ADDTUP characterises the states where a (hypothetical) execution of ADDTUP' would preserve the invariants. Finally ADDTUP is defined as the sequential composition of a guard based on pre_ADDTUP and the invocation of ADDTUP'. It is possible to verify properties of this specification (cf. the proof obligations of the VDM method). In particular it is derivable that INIT => INV0 AND INV1 AND INV2 AND INV3. It is also derivable that the operations do not violate the invariant; for example if we define a FLAT-NDB operation as an arbitrary invocation of one of the exported procedures (taking care of the preconditions), then the invariance of the predicates INV0, INV1 etc. can be expressed in the language COLD-K, as shown below. Because invariance can be stated in the language, we do not need built-in proof obligations.
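The idea of a precondition that is phrased as a hypothetical execution can be imitated in an executable setting. The Python sketch below is an illustration under our own assumptions: the state is a dictionary with entries esm, tp_rm and r_rm, the box assertion is evaluated for a deterministic procedure by running it on a copy of the state, and a failing guard is modelled by raising an exception. The names box, addtup_prime, pre_addtup and addtup are hypothetical.

```python
import copy

def inv2(s):
    """Map-type constraints on every stored relation."""
    for rk, r in s["r_rm"].items():
        tp = s["tp_rm"].get(rk)
        for (f1, t1) in r:
            for (f2, t2) in r:
                if tp in ("I_M", "I_I") and t1 == t2 and f1 != f2:
                    return False
                if tp in ("M_I", "I_I") and f1 == f2 and t1 != t2:
                    return False
    return True

def inv3(s):
    """Type check of stored values against the from-set and to-set of the key."""
    for (_, fs, ts), r in s["r_rm"].items():
        for fv, tv in r:
            if fv not in s["esm"].get(fs, ()) or tv not in s["esm"].get(ts, ()):
                return False
    return True

def addtup_prime(s, fval, tval, rk):
    """ADDTUP': insert the tuple without checking any constraint."""
    s["r_rm"][rk] = s["r_rm"][rk] | {(fval, tval)}

def box(proc, args, post, s):
    """[ proc(args) ] post for a deterministic proc: test post in the successor of a copy."""
    trial = copy.deepcopy(s)
    proc(trial, *args)
    return post(trial)

def pre_addtup(s, fval, tval, rk):
    """Holds where a hypothetical ADDTUP' would preserve INV2 and INV3; s is not changed."""
    return (rk in s["tp_rm"] and rk in s["r_rm"]
            and box(addtup_prime, (fval, tval, rk),
                    lambda t: inv2(t) and inv3(t), s))

def addtup(s, fval, tval, rk):
    """Guard followed by ADDTUP' (a failing guard is modelled as an exception)."""
    if not pre_addtup(s, fval, tval, rk):
        raise ValueError("guard pre_ADDTUP does not hold")
    addtup_prime(s, fval, tval, rk)
```

Because box works on a copy, evaluating pre_addtup leaves the actual state untouched, in line with the remark that predicates cannot have side effects.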

PROC FLAT_NDB_op: ->
DEF ( ADDES(SOME es:Esetnm (pre_ADDES(es)))
    | ADDENT(SOME es:Esetnm,eid:Eid (pre_ADDENT(es,eid)))
    | ADDREL(SOME rk:Rkey, mtp:Maptp (pre_ADDREL(rk,mtp)))
    | ADDTUP(SOME fval:Eid, tval:Eid, rk:Rkey (pre_ADDTUP(fval,tval,rk)))
    )

PRED INV:
DEF INV0 AND INV1 AND INV2 AND INV3

PRED PO1:
DEF INIT => INV

PRED PO2:
DEF INV => [ FLAT_NDB_op ] INV

Now the proof obligations PO1 and PO2 hold in all states. This concludes the first COLD-K version of Norman's database.
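The two proof obligations (INIT => INV, and INV => [ FLAT_NDB_op ] INV) are statements to be proved, but a bounded experiment can at least make them plausible. The Python sketch below enumerates all states reachable from the initial state within a few steps, over a tiny universe, and checks the invariant in each of them. This is weaker than the second obligation, which speaks about all states satisfying INV and not only about reachable ones; the universe, the state encoding and all function names are assumptions made for the illustration.

```python
from itertools import product

ESETNM, EID, RNM = ["A", "B"], [1, 2], ["r"]   # tiny, hypothetical universe
MAPTP = ["I_I", "I_M", "M_I", "M_M"]
RKEYS = list(product(RNM, ESETNM, ESETNM))      # keys (nm, fs, ts)

def inv01(s):
    esm, tp_rm, r_rm = s
    return (tp_rm.keys() == r_rm.keys()
            and all(fs in esm and ts in esm for _, fs, ts in tp_rm))

def inv23(s):
    esm, tp_rm, r_rm = s
    for rk, r in r_rm.items():
        _, fs, ts = rk
        tp = tp_rm.get(rk)
        for (f1, t1), (f2, t2) in product(r, r):
            if tp in ("I_M", "I_I") and t1 == t2 and f1 != f2:
                return False
            if tp in ("M_I", "I_I") and f1 == f2 and t1 != t2:
                return False
        if any(f not in esm.get(fs, ()) or t not in esm.get(ts, ()) for f, t in r):
            return False
    return True

def inv(s):
    return inv01(s) and inv23(s)

def successors(s):
    """All states reachable by one exported procedure whose precondition holds."""
    esm, tp_rm, r_rm = s
    for es in ESETNM:
        if es not in esm:                                   # pre_ADDES
            yield ({**esm, es: frozenset()}, tp_rm, r_rm)
        else:                                               # pre_ADDENT
            for eid in EID:
                yield ({**esm, es: esm[es] | {eid}}, tp_rm, r_rm)
    for rk in RKEYS:
        _, fs, ts = rk
        if rk not in tp_rm:
            if fs in esm and ts in esm:                     # pre_ADDREL
                for tp in MAPTP:
                    yield (esm, {**tp_rm, rk: tp}, {**r_rm, rk: frozenset()})
        elif rk in r_rm:
            for fval, tval in product(EID, EID):
                t = (esm, tp_rm, {**r_rm, rk: r_rm[rk] | {(fval, tval)}})
                if inv23(t):                                # pre_ADDTUP
                    yield t

def check(depth):
    """True iff INV holds initially and in every state reached within `depth` steps."""
    frontier = [({}, {}, {})]
    if not inv(frontier[0]):
        return False
    for _ in range(depth):
        nxt = []
        for s in frontier:
            for t in successors(s):
                if not inv(t):
                    return False
                nxt.append(t)
        frontier = nxt
    return True
```

Note that the guard for ADDTUP tests only INV2 and INV3 on the candidate state, as in pre_ADDTUP; that INV0 and INV1 survive as well is then observed by the check rather than assumed.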

4 The 'Challenge'

As pointed out by C.B. Jones it should be possible to generalise the approach of FLAT-NDB, abstracting away from certain rather specific choices, such as the restriction to binary relations. This should be done by introducing a parameterised module TYPED-RELATION in such a way that the functionality of FLAT-NDB can be obtained by a particular instantiation of TYPED-RELATION. The essential ideas behind the generalisation step are proposed by C.B. Jones and they are summarised in the following table.


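FLAT-NDB                                        TYPED-RELATION
----------------------------------------------  ----------------------------------------------
type checking                                   type checking
$esm = Esetnm \xrightarrow{m} Eid\text{-}set$             $tpc : Eid \times Etp \to \mathbb{B}$
binary relation                                 n-ary
fs / ts, fv / tv                                $Tpm = Fsel \xrightarrow{m} Etp$
map types                                       normalisation
$Maptp = \{1{:}1, 1{:}M, M{:}1, M{:}M\}$                  $(Fsel\text{-}set \times Fsel)\text{-}set$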

We add some explanation to this table. The left column refers to FLAT-NDB. The right column refers to TYPED-RELATION, i.e. the proposed generalisation. The type checking which in FLAT-NDB is governed by the esm mapping ($Esetnm \xrightarrow{m} Eid\text{-}set$) can be generalised to a type-checking function tpc. Assuming some elementary set of entity types (Etp), tpc must be of type $Eid \times Etp \to \mathbb{B}$, where $\mathbb{B}$ denotes the set of Booleans. The generalisation step from binary relations to n-ary relations can be viewed as the transition from a mapping $\{fv, tv\} \to Eid$ towards a mapping $Fsel \xrightarrow{m} Eid$, where Fsel is an elementary set of 'field selectors'. In this view, a map from Fsel to Eid is a 'generalised tuple'. When dealing with n-ary relations, the typing information must be expressed in a mapping of type Tpm which is defined as $Fsel \xrightarrow{m} Etp$. Such a mapping $tpm : Fsel \xrightarrow{m} Etp$ can be viewed as a 'generalised product type'. There is also a generalisation of Maptp. A wide class of 'normalisations' (also called 'functional dependencies') of sets of tuples can be described by specifying a set of constraints, each of which requires that the value under one Fsel is determined by the values under a specified set of Fsels. Thus if Fsel = {Fs, Ts}, a many-to-one relation might have a single constraint ({Fs}, Ts).
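The notion of a normalisation as a set of constraints can be illustrated by a small check. In the following Python sketch a generalised tuple is a dictionary from field selectors to values, a constraint is a pair (set of selectors, selector), and a relation satisfies a normalisation when any two tuples that agree on the selector set also agree on the determined selector. The function name satisfies and the encoding are assumptions made for the illustration.

```python
def satisfies(relation, norm):
    """True iff every constraint (selectors, target) in norm holds in the relation."""
    for selectors, target in norm:
        for t1 in relation:
            for t2 in relation:
                agree = all(t1[s] == t2[s] for s in selectors)
                if agree and t1[target] != t2[target]:
                    return False  # same values under `selectors`, different `target`
    return True

# The many-to-one example from the text: the value under Ts is determined by Fs.
many_to_one = {(frozenset({"Fs"}), "Ts")}

ok = [{"Fs": 1, "Ts": 7}, {"Fs": 2, "Ts": 7}]    # two from-values share a to-value
bad = [{"Fs": 1, "Ts": 7}, {"Fs": 1, "Ts": 8}]   # one from-value, two to-values
```

Here satisfies(ok, many_to_one) holds while satisfies(bad, many_to_one) does not; with the empty set of constraints (the many-to-many case) every relation is accepted.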

In [3] this is worked out further, using a yet-to-be-fully-defined module formalism. C.B. Jones has put forward this example as a 'challenge problem' and below we give a solution using the modularisation and parameterisation constructs of COLD-K. This solution does not require the Relation[,] construct used by C.B. Jones and J.S. Fitzgerald, which roughly speaking denotes

$$\bigcup_{p} \text{TYPED-RELATION}[p].Relation$$

where p ranges over all possible instantiation parameters of TYPED-RELATION. The main reason why the solution below works is that we split the state space of the database into two parts, where one part deals with the dynamic typing mechanism and where the other part is about the mapping types and actual data to be stored in the database. It can be argued that it is a good idea to separate these two parts anyhow. When interpreting the challenge in the sense that the Ndb state components

Ndb :: esm : $Esetnm \xrightarrow{m} Eid\text{-}set$
       rm  : $Rkey \xrightarrow{m} Rinf$

may not be separated, the solution below does not meet the challenge. On the other hand, a TYPED-RELATION module of the required generality will be constructed and used in Section 5; in this sense, the challenge is met.


5 Solution in COLD-K

The COLD-K text below begins with two auxiliary class descriptions called FSEL and TYPESYSTEM. The first auxiliary just mentions the sort Fsel of field selectors. The second auxiliary introduces the sorts Eid, Etp and Rnm and two predicates called tpc and known. After that the definition of TYPED_RELATION begins. It is parameterised with two module parameters X and Y. The first parameter has FSEL as its parameter restriction, whereas the second parameter has TYPESYSTEM as its parameter restriction. In the scope of the LAMBDA X : FSEL OF LAMBDA Y : TYPESYSTEM OF clause there are a number of local module instantiations called TPM, FSEL_SET etc. to introduce data types like Tpm and Fsel_set. After that there is an import list followed by the CLASS ... END part of TYPED_RELATION.

LET FSEL :=
CLASS
    SORT Fsel
END;

LET TYPESYSTEM :=
CLASS
    SORT Eid
    SORT Etp
    SORT Rnm
    PRED tpc: Eid # Etp VAR
    PRED known: Etp VAR
END;

LET TYPED_RELATION :=
LAMBDA X : FSEL OF
LAMBDA Y : TYPESYSTEM OF

LET TPM := APPLY APPLY RENAME
    SORT Item1 TO Fsel,
    SORT Item2 TO Etp,
    SORT Set1 TO Fsel_set,
    SORT Map TO Tpm,
    FUNC app: Map # Item1 -> Item2 TO type
  IN MAP_SPEC TO X TO Y;

LET FSEL_SET := APPLY RENAME
    SORT Item TO Fsel,
    SORT Set TO Fsel_set
  IN SET_SPEC TO X;

LET FSEL_SET_X_FSEL := APPLY APPLY RENAME
    SORT Item1 TO Fsel_set,
    SORT Item2 TO Fsel,
    SORT Tup TO Fsel_set_x_Fsel,
    FUNC tup: Item1 # Item2 -> Tup TO mk_Tuple
  IN TUP2_SPEC TO FSEL_SET TO X;

LET NORM := APPLY RENAME
    SORT Item TO Fsel_set_x_Fsel,
    SORT Set TO Norm
  IN SET_SPEC TO FSEL_SET_X_FSEL;

LET TUPLE := APPLY APPLY RENAME
    SORT Item1 TO Fsel,
    SORT Item2 TO Eid,
    SORT Set1 TO Fsel_set,
    SORT Map TO Tuple,
    FUNC app: Map # Item1 -> Item2 TO value
  IN MAP_SPEC TO X TO Y;

LET TUPLE_SET := APPLY RENAME
    SORT Item TO Tuple,
    SORT Set TO Relation
  IN SET_SPEC TO TUPLE;

LET GRKEY := APPLY APPLY RENAME
    SORT Item1 TO Rnm,
    SORT Item2 TO Tpm,
    SORT Tup TO GRkey,
    FUNC proj1: Tup -> Item1 TO nm,
    FUNC proj2: Tup -> Item2 TO tpm,
    FUNC tup : Item1 # Item2 -> Tup TO mk_GRkey
  IN TUP2_SPEC TO Y TO TPM;

IMPORT X INTO
IMPORT Y INTO
IMPORT TPM INTO
IMPORT FSEL_SET INTO
IMPORT FSEL_SET_X_FSEL INTO
IMPORT NORM INTO
IMPORT TUPLE INTO
IMPORT TUPLE_SET INTO
IMPORT GRKEY INTO
CLASS

FUNC tp_rm: GRkey -> Norm VAR

AXIOM INIT => FORALL rk:GRkey ( NOT tp_rm(rk)! )

PROC ADDREL': GRkey # Norm ->
MOD tp_rm

PRED pre_ADDREL': GRkey # Norm
PAR rk:GRkey, norm:Norm
DEF NOT tp_rm(rk)!;
    FORALL fs:Fsel
    (is_in(fs,dom(tpm(rk))) => known(type(tpm(rk),fs)) )


AXIOM FORALL rk:GRkey, norm:Norm
    (pre_ADDREL'(rk,norm) =>
      [ ADDREL'(rk,norm) ]
      (tp_rm(rk) = norm AND
       FORALL other:GRkey, m:Norm
       ( NOT other = rk =>
         (tp_rm(other) = m <=> PREV tp_rm(other) = m ) ) ) )

FUNC r_rm: GRkey -> Relation VAR

AXIOM INIT => FORALL rk:GRkey ( NOT r_rm(rk)! )

PROC ADDREL'': GRkey ->
MOD r_rm

PRED pre_ADDREL'': GRkey
PAR rk:GRkey
DEF NOT r_rm(rk)!

AXIOM FORALL rk:GRkey
    (pre_ADDREL''(rk) =>
      [ ADDREL''(rk) ]
      (r_rm(rk) = empty AND
       FORALL other:GRkey, r:Relation
       ( NOT other = rk =>
         (r_rm(other) = r <=> PREV r_rm(other) = r ) ) ) )

PROC ADDREL: GRkey # Norm ->
PAR rk:GRkey, norm:Norm
DEF ( ADDREL'(rk,norm)
    ; ADDREL''(rk) )

PRED pre_ADDREL: GRkey # Norm
PAR rk:GRkey, norm:Norm
DEF pre_ADDREL'(rk,norm) AND pre_ADDREL''(rk)

PRED INV0:
DEF FORALL rk:GRkey ( tp_rm(rk)! <=> r_rm(rk)! )

PRED INV1:
DEF FORALL rk:GRkey
    ( tp_rm(rk)! =>
      ( FORALL fs:Fsel
        (is_in(fs,dom(tpm(rk))) => known(type(tpm(rk),fs)) ) ) )

PROC ADDTUP': Tuple # GRkey ->
MOD r_rm

AXIOM FORALL t:Tuple, rk:GRkey
    ( r_rm(rk)! =>
      [ ADDTUP'(t,rk) ]
      (r_rm(rk) = ins(t,PREV r_rm(rk)) AND
       FORALL other:GRkey, r:Relation
       ( NOT other = rk =>
         (r_rm(other) = r <=> PREV r_rm(other) = r ))))

PRED pre_ADDTUP: Tuple # GRkey
PAR t:Tuple, rk:GRkey
DEF tp_rm(rk)! AND r_rm(rk)! AND
    [ ADDTUP'(t,rk) ] INV2 AND INV3

PROC ADDTUP: Tuple # GRkey ->
PAR t:Tuple, rk:GRkey
DEF pre_ADDTUP(t,rk)?;
    ADDTUP'(t,rk)

PRED INV2:
DEF FORALL rk:GRkey
    ( tp_rm(rk)! =>
      ( FORALL s:Fsel_set, f:Fsel
        (is_in(mk_Tuple(s,f),tp_rm(rk)) =>
          ( FORALL t1:Tuple, t2:Tuple
            (is_in(t1,r_rm(rk)) AND is_in(t2,r_rm(rk)) =>
              (restrict(s,t1) = restrict(s,t2) =>
                (value(t1,f) = value(t2,f) ) ) ) ) ) ) )

PRED INV3:
DEF FORALL rk:GRkey
    ( r_rm(rk)! =>
      FORALL t:Tuple
      (is_in(t,r_rm(rk)) =>
        (dom(t) = dom(tpm(rk)) AND
         FORALL fs:Fsel
         (is_in(fs,dom(t)) =>
           (tpc(value(t,fs),type(tpm(rk),fs)) ) ) ) ) )

PROC TYPED_RELATION_op: ->
DEF ( ADDREL(SOME rk:GRkey, norm:Norm (pre_ADDREL(rk,norm)))
    | ADDTUP(SOME t:Tuple, rk:GRkey (pre_ADDTUP(t,rk)))
    )

PRED INV:
DEF INV0 AND INV1 AND INV2 AND INV3

PRED PO1:
DEF INIT => INV

PRED PO2:
DEF INV => [ TYPED_RELATION_op ] INV

END {TYPED_RELATION};


We add some explanation to the above definition of TYPED_RELATION. The local modules serve to introduce specific instantiations of standard data types. They introduce Tpm whose elements are maps from Fsel to Etp, serving as generalised product types; Fsel_set whose elements are sets of Fsel, serving as candidate keys; Fsel_set_x_Fsel whose elements are pairs (s:Fsel_set, f:Fsel), serving as functional-dependency facts; Norm whose elements are sets of functional-dependency facts, serving as normalisation principles; Tuple whose elements are maps from Fsel to Eid, serving as generalised tuples; Relation whose elements are sets of tuples, serving as relations. Finally there is a sort GRkey (for Generalised Relation key) whose elements are pairs consisting of a relation name and a generalised product type. In the CLASS ... END part of TYPED_RELATION we see the two variables spanning the state space, tp_rm and r_rm, with their operations ADDREL and ADDTUP. The variables and operations are very similar to those of FLAT_NDB but they have been generalised to cope with a type system governed by tpc and known as well as with n-ary relations. The original mapping $rm : Rkey \xrightarrow{m} Rinf$ is split into two functions, viz. tp_rm and r_rm. As in FLAT_NDB, this gives rise to an invariant tp_rm(rk)! <=> r_rm(rk)!. Recall that a procedure (keyword PROC) is like a VDM operation; the keyword MOD is like the 'wr' of VDM. This concludes our explanation of TYPED_RELATION.

The next part of the specification should be viewed as preparatory work for finally arriving at an instantiation of TYPED_RELATION. Therefore we develop the particular version of the X and Y parameters of TYPED_RELATION to fit Norman's approach. The module NORMANS_FSEL below introduces the enumerated sort Fsel whose elements are Fs and Ts and serve as field selectors. Later it will serve as the actual X parameter. The module NORMANS_SORTS introduces the primitive sort Eid whose elements serve as entity identifiers. It also introduces Esetnm whose elements serve as entity-set names, and Rnm whose elements serve as relation names.

LET NORMANS_FSEL := RENAME
    SORT Item TO Fsel,
    FUNC a: -> Item TO Fs,
    FUNC b: -> Item TO Ts
  IN ENUM2;

LET NORMANS_SORTS :=
CLASS
    SORT Eid
    SORT Esetnm
    SORT Rnm
END;

Next we deal with the 'type system' of Norman's database, which is a grouping of entities into sets, where these sets have entity-set names associated with them. This is covered by NORMANS_TYPESYSTEM below. After a local module instantiation which is part of an import list, we have the main module of NORMANS_TYPESYSTEM with state component esm, with procedures ADDES and ADDENT, and with the predicates tpc and known.


LET NORMANS_TYPESYSTEM :=
IMPORT APPLY RENAME
    SORT Item TO Eid,
    SORT Set TO Eid_set
  IN SET_SPEC TO NORMANS_SORTS INTO
IMPORT NORMANS_SORTS INTO
CLASS

FUNC esm: Esetnm -> Eid_set VAR

AXIOM INIT => FORALL s:Esetnm ( NOT esm(s)! )

PROC ADDES: Esetnm ->
MOD esm

PRED pre_ADDES: Esetnm
PAR es:Esetnm
DEF NOT esm(es)!

AXIOM FORALL es:Esetnm
    (pre_ADDES(es) =>
      [ ADDES(es) ]
      (esm(es) = empty AND
       FORALL other:Esetnm,e:Eid_set
       ( NOT other = es =>
         (esm(other) = e <=> PREV esm(other) = e ))))

PROC ADDENT: Esetnm # Eid ->
MOD esm

PRED pre_ADDENT: Esetnm # Eid
PAR es:Esetnm, eid:Eid
DEF esm(es)!

AXIOM FORALL es:Esetnm, eid:Eid
    (pre_ADDENT(es,eid) =>
      [ ADDENT(es,eid) ]
      (esm(es) = ins(eid,PREV esm(es)) AND
       FORALL other:Esetnm,e:Eid_set
       ( NOT other = es =>
         (esm(other) = e <=> PREV esm(other) = e ))))

PRED tpc: Eid # Esetnm
PAR eid:Eid, es:Esetnm
DEF is_in(eid,esm(es))

PRED known: Esetnm
PAR es:Esetnm
DEF esm(es)!

END {NORMANS_TYPESYSTEM};


Let us explain the above COLD-K text now. The unnamed module instantiation serves to introduce a specific instantiation of a standard data type. It introduces Eid_set whose elements are sets of entity identifiers. The function esm: Esetnm -> Eid_set is an important variable function, i.e. a state component. It serves for recording the 'current' association of names to sets of entity identifiers. The variable and its operations are very similar to those of FLAT_NDB but they have been separated here from tp_rm and r_rm. After that, tpc and known can be defined in terms of esm. Note that ADDES and ADDENT are specific to this peculiar set-oriented type system, which explains why they are present here; they are too specific to be a part of the parameter restriction TYPESYSTEM governing the $\lambda$-abstraction of TYPED_RELATION. This concludes our explanation of NORMANS_TYPESYSTEM.
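The separation between a concrete type system and what TYPED_RELATION needs from it can be imitated as follows. In this Python sketch the two predicates tpc and known are derived from an esm dictionary, and a generalised tuple is then checked against a generalised product type using tpc only. The function names make_typesystem, well_typed and tpm_known are hypothetical, and the dictionary encoding is an assumption made for the illustration.

```python
def make_typesystem(esm):
    """Derive the two predicates required of a type system from an esm mapping."""
    def tpc(eid, es):
        # eid passes the type check for es iff it is in the set recorded for es
        return eid in esm.get(es, frozenset())
    def known(es):
        # es is known iff esm is defined for it
        return es in esm
    return tpc, known

def well_typed(t, tpm, tpc):
    """A generalised tuple t fits tpm: same selectors, each value of the right type."""
    return t.keys() == tpm.keys() and all(tpc(t[fs], tpm[fs]) for fs in t)

def tpm_known(tpm, known):
    """All entity types mentioned in a generalised product type are known."""
    return all(known(etp) for etp in tpm.values())

esm = {"Person": frozenset({"p1"}), "City": frozenset({"c1"})}
tpc, known = make_typesystem(esm)
tpm = {"Fs": "Person", "Ts": "City"}
```

Since tpc and known consult esm at the moment they are called, later additions to esm are visible to them, which loosely corresponds to their being marked VAR in the parameter restriction. The functions well_typed and tpm_known never look at esm directly, so a different type system could be substituted without changing them.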

Below, the instantiation of TYPED_RELATION is actually done. The result is named NORMANS_RELATION. After that the main module NDB is completed. This completion is a top layer (or shell) where NORMANS_RELATION is used in a specific way.

LET NORMANS_RELATION :=

APPLY APPLY RENAME SORT Etp TO Esetnm

IN TYPED_RELATION TO NORMANS_FSEL TO NORMANS_TYPESYSTEM;

LET MAPTP := RENAME SORT Item TO Maptp, FUNC a: -> Item TO I_I,

FUNC b: -> Item TO I_M,

FUNC c: -> Item TO M_I, FUNC d: -> Item TO M_M

IN ENUM4;

LET RKEY := APPLY APPLY APPLY RENAME SORT Item1 TO Rnm,

SORT Item2 TO Esetnm,

SORT Item3 TO Esetnm,

SORT Tup TO Rkey,

FUNC proj1: Tup -> Item1 TO nm,

FUNC proj2: Tup -> Item2 TO fs,

FUNC proj3: Tup -> Item3 TO ts,

FUNC tup: Item1 # Item2 # Item3 -> Tup TO mk_Rkey

IN TUP3_SPEC TO NORMANS_SORTS TO NORMANS_SORTS TO NORMANS_SORTS;

LET NDB :=

EXPORT

SORT Eid,

SORT Rkey,

SORT Maptp,

SORT Esetnm,

PROC ADDES: Esetnm ->,

PROC ADDENT: Esetnm # Eid ->,

PROC ADDREL: Rkey # Maptp ->,

PROC ADDTUP: Eid # Eid # Rkey ->


FROM

IMPORT NORMANS_FSEL INTO IMPORT NORMANS_TYPESYSTEM INTO

IMPORT NORMANS_RELATION INTO IMPORT MAPTP INTO

IMPORT RKEY INTO

CLASS

FUNC mk_Tuple: Eid # Eid -> Tuple

PAR e1:Eid, e2:Eid

DEF SOME t:Tuple

( FORALL f:Fsel, e:Eid

(value(t,f) = e <=>

( f = Fs AND e = e1 OR f = Ts AND e = e2 ) ) )

FUNC mk_Tpm: Esetnm # Esetnm -> Tpm

PAR e1:Esetnm, e2:Esetnm

DEF SOME t:Tpm ( FORALL f:Fsel, e:Esetnm

(type(t,f) = e <=>

( f = Fs AND e = e1 OR

f = Ts AND e = e2 ) ) )

FUNC g: Rkey -> GRkey

PAR rk:Rkey

DEF mk_GRkey(nm(rk),mk_Tpm(fs(rk),ts(rk)))

FUNC conv: Maptp -> Norm

PAR ty:Maptp

DEF ( ty = I_I ?; ins(mk_Tuple(singleton(Ts),Fs),

singleton(mk_Tuple(singleton(Fs),Ts)))

| ty = M_I ?; singleton(mk_Tuple(singleton(Fs),Ts))

| ty = I_M ?; singleton(mk_Tuple(singleton(Ts),Fs))

| ty = M_M ?; empty )

PROC ADDREL: Rkey # Maptp ->

PAR rk:Rkey, mtp:Maptp

DEF ADDREL(g(rk),conv(mtp))

PRED pre_ADDREL: Rkey # Maptp

PAR rk:Rkey, mtp:Maptp

DEF pre_ADDREL(g(rk),conv(mtp))

PRED pre_ADDTUP: Eid # Eid # Rkey

PAR fval:Eid, tval:Eid, rk:Rkey

DEF pre_ADDTUP(mk_Tuple(fval,tval),g(rk))


PROC ADDTUP: Eid # Eid # Rkey ->

PAR fval:Eid, tval:Eid, rk:Rkey

DEF ADDTUP(mk_Tuple(fval,tval),g(rk))

END {NDB}

We add some explanation to the above formal text. MAPTP introduces the mapping types 1:1, 1:M etc., just as before. RKEY introduces the sort Rkey, whose elements are triples. The function mk_Tuple puts two Eid values together to form a (generalised) tuple. The function mk_Tpm puts two entity set names together to form a generalised product type. The function g transforms an Rkey object (which is a triple) into a GRkey object (which is a pair). The function conv converts a mapping type to its corresponding normalisation principle. We added procedures ADDREL and ADDTUP which work on Eid and Rkey arguments (instead of Tuple and GRkey arguments). We use overloading in the sense that, e.g., the name ADDREL is used twice; the two versions can be distinguished by their argument types. Note that NDB provides ADDES and ADDENT as well, since these are available via the import list.

This concludes the second version of Norman's database in COLD-K.
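To make the role of the conversion functions concrete, here is a small Python sketch of mk_Tuple, mk_Tpm and g. It is an illustration under assumptions, not the COLD-K definition: generalised tuples and product types are modelled as dictionaries from field selectors to values, the selector names "Fs" and "Ts" are plain strings, and the Python field names nm, fs and ts stand for the three projections of Rkey.

```python
from typing import NamedTuple

FS, TS = "Fs", "Ts"  # the two field selectors used for 2-tuples


class Rkey(NamedTuple):
    """A relation key: a triple of relation name and two entity set names."""
    nm: str
    fs: str
    ts: str


def mk_tuple(e1, e2):
    # The tuple t with value(t, Fs) = e1 and value(t, Ts) = e2.
    return {FS: e1, TS: e2}


def mk_tpm(es1, es2):
    # The product type t with type(t, Fs) = es1 and type(t, Ts) = es2.
    return {FS: es1, TS: es2}


def g(rk):
    # Turn the triple (nm, fs, ts) into the pair (nm, product type).
    return (rk.nm, mk_tpm(rk.fs, rk.ts))
```

With these helpers, the restricted ADDTUP of NDB amounts to calling the general ADDTUP with mk_tuple(fval, tval) and g(rk).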

6 Discussion

The decomposition is actually a generalisation-specialisation approach. First a generic module TYPED_RELATION was specified, abstracting from tuple lengths and from type-checking issues. Furthermore this generic module supports a very powerful concept of normalisation principles. Then TYPED_RELATION was instantiated with the details of 2-tuples and a particular state-dependent typing mechanism. The choice for the normalisation principles 1:1, 1:M, M:1 and M:M was made by adding a layer with more restricted versions of ADDREL and ADDTUP. In this way the database was obtained. So the main structure of the NDB specification can be sketched as follows:

LET TYPED_RELATION :=

LAMBDA X : FSEL OF LAMBDA Y : TYPESYSTEM OF ... ;

LET NORMANS_FSEL := ... ;

LET NORMANS_TYPESYSTEM := ...;

LET NORMANS_RELATION :=

APPLY APPLY ... TYPED_RELATION

TO NORMANS_FSEL TO NORMANS_TYPESYSTEM;

LET NDB :=

EXPORT ... FROM

IMPORT NORMANS_FSEL INTO

IMPORT NORMANS_TYPESYSTEM INTO

IMPORT NORMANS_RELATION INTO

IMPORT ... INTO

CLASS . . . END {NDB}


Let us analyse in some detail what causes the problem in Appendix C of [3]. The generic module TYPED_RELATION has a formal parameter

$tpc : Eid \times Etp \to \mathbb{B}$,

modelling the general form of a type-check mechanism, where Eid is the sort of entities and Etp the sort of entity types. There is a potential conflict between things which are static and things which are dynamic: the parameter function tpc is just a static function, and to instantiate TYPED_RELATION means giving an actual tpc once and for all. However, in Norman's approach there is a dynamic aspect to the typing, since entities belong to named sets, and these sets can grow by means of state-modifying operations. In COLD-K this is no problem, since we can put tpc in a parameter restriction and mark it as variable.

PRED tpc: Eid # Etp VAR

The second potential problem area arises when spanning the state space of the database. C.B. Jones and J.S. Fitzgerald put the two state components esm (the entity set map) and rm (the 'relation-info' map) in one and the same module, which provides at the same time for the type system and the actual data in the relations. The approach worked out in the current paper is to split the description of the database into two parts: a part dealing with the dynamic type system and a part dealing with the actual tuples which are the contents of the database. It is not unusual in database design to keep these parts conceptually separated.

As it turns out, it is also possible in COLD-K to construct a TYPED_RELATION which only presents a kind of algebraic theory of typed relations, rather than a state-based system (as an alternative to the current approach). This would be similar to the Appendices B.1 and C.1 of [3]. However, we tried to have the generic part as large as possible and hence to factor out the state based part of the database as well - as was shown above.

Note that NDB is not parameterised over the base types, unlike the NDB module in Appendix C of [3]. This could be added without problem, but it was left out to keep things simple. In VDM one has to take the base types as parameters, since there is no option to postulate a primitive sort - which, however, is usual in algebraic specifications.

7 Acknowledgements

The author acknowledges the contributions of C.B. Jones and J.S. Fitzgerald to the work presented in this paper. Special thanks go to J.A. Droppert of Philips Research who carefully read earlier versions of the specification and suggested several improvements.


References

[1] J.S. Fitzgerald, C.B. Jones. Modularizing the Formal Description of a Database System. In: D. Bjørner, C.A.R. Hoare, H. Langmaack (eds), VDM '90: VDM and Z - Formal Methods in Software Development, Springer-Verlag, LNCS 428, pp. 189-210.

[2] C.B. Jones. Systematic software development using VDM. Prentice-Hall International, ISBN 0-13-880725-6 (1986).

[3] J.S. Fitzgerald, C.B. Jones. Modularizing the Formal Description of a Database System. University of Manchester, Technical Report UMCS-90-1-1.

[4] N. Winterbottom and G.C.H. Sharman. NDB: Non-programmer database facility. Technical Report IBM TR.12.179, IBM Hursley Laboratory, England, September 1979.

[5] IBM. Data Mapping Program: User's Guide. SB11-5340.

[6] H.B.M. Jonkers. An Introduction to COLD-K. In: M. Wirsing, J.A. Bergstra (eds), Algebraic Methods: Theory, Tools and Applications, Springer-Verlag, LNCS 394, pp. 139-205 (1989).

[7] J.A. Bergstra, J. Heering, P. Klint. Module algebra. JACM Vol. 37 (1990) 335-372.

[8] L.M.G. Feijs, H.B.M. Jonkers, C.P.J. Koymans, G.R. Renardel de Lavalette. Formal definition of the design language COLD-K. Preliminary Edition, April 1987, ESPRIT document METEOR/t7/PRLE/7.

[9] L.M.G. Feijs. The calculus λπ. In: M. Wirsing, J.A. Bergstra (eds), Algebraic Methods: Theory, Tools and Applications, Springer-Verlag, LNCS 394, pp. 307-328 (1989).


A COLD-K standard library of data types

There is a COLD-K standard library of data types containing standard data types such as items, natural numbers, 2-tuples and 3-tuples, finite sets, finite bags, finite sequences and finite maps, and enumerated sorts with one, two, three, four etc. inhabitants. These are given by class descriptions called ITEM, ITEM1, ITEM2, ITEM3, NAT_SPEC, TUP2_SPEC, TUP3_SPEC, SET_SPEC, BAG_SPEC, SEQ_SPEC, MAP_SPEC, ENUM1, ENUM2, ENUM3, ENUM4 etc.

Of course these are formally specified using the language COLD-K itself, using Peano's axioms for the natural numbers and so on. It is not necessary to give all details here and therefore we just summarise the standard library so as to give more or less sufficient information for using it. The class description ITEM is defined as a one-liner

LET ITEM := CLASS SORT Item FREE END

and very much in the same way we have ITEM1 introducing the sort Item1 etc.

The class description NAT_SPEC introduces the sort Nat of natural numbers with constant zero, unary functions succ and pred and the binary predicates lss, leq, gtr and geq. Furthermore we have the binary functions add, sub, mul, div, mod, exp, log, max and min.

The class description TUP2_SPEC introduces the sort Tup of 2-tuples (pairs) with constructor tup: Item1 # Item2 -> Tup and projection functions proj1 and proj2. It is a parameterised class description, whose beginning is as follows.

LET TUP2_SPEC := LAMBDA X : ITEM1 OF LAMBDA Y : ITEM2 OF ...

The class description TUP3_SPEC introduces the sort Tup of 3-tuples (triples) and is very similar to TUP2_SPEC.

The class description SET_SPEC introduces the sort Set of finite sets of items with constructors empty and binary insert ins: Item # Set -> Set. It is a parameterised class description, whose beginning is as follows.

LET SET_SPEC := LAMBDA X : ITEM OF ...

It provides the membership test is_in: Item # Set and element removal rem: Item # Set -> Set. The binary functions include union, isect and diff. There is a binary predicate subset and a cardinality operator card. Finally, we added a non-standard function for the sake of this case study:

FUNC singleton: Item -> Set

PAR i:Item

DEF ins(i,empty)

The class description SEQ_SPEC introduces the sort Seq of finite sequences of items with constructors empty and cons: Item # Seq -> Seq. It is a parameterised class description, whose beginning is as follows.


LET SEQ_SPEC := LAMBDA X : ITEM OF ...

The functions to take sequences apart are called hd and tl. There is a length operator len and a selection operator sel: Seq # Nat -> Item.

The class description MAP_SPEC introduces the sort Map of finite maps from Item1 values to Item2 values with constructors empty and add: Map # Item1 # Item2 -> Map. It is a parameterised class description, whose beginning is as follows.

LET MAP_SPEC := LAMBDA X : ITEM1 OF LAMBDA Y : ITEM2 OF ...

It provides functions such as empty: -> Map, add: Map # Item1 # Item2 -> Map, app: Map # Item1 -> Item2, and dom: Map -> Set1, where Set1 is the sort of finite sets of Item1. Finally, we added:

FUNC restrict: Set1 # Map -> Map

PAR s:Set1, m:Map

DEF SOME p:Map ( FORALL x:Item1,y:Item2

(is_in(x,s) AND app(m,x) = y <=> app(p,x) = y ) )
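The defining property of restrict says that the result p maps x to y exactly when x is in s and m maps x to y. As a hedged illustration, with finite maps modelled as Python dictionaries and finite sets as Python sets (both modelling choices are assumptions of this sketch), the same function can be written as:

```python
def restrict(s, m):
    """Return the map p with p[x] == y iff x in s and m[x] == y."""
    return {x: y for x, y in m.items() if x in s}
```

Elements of s that are not in the domain of m simply do not occur in the result.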

The standard module ENUM1 provides the sort Item and one nullary function a: -> Item. This a is defined, and it is the only object of sort Item. The standard module ENUM2 provides the sort Item and two functions a: -> Item and b: -> Item. These a and b are defined and different and they are the only objects of sort Item. ENUM3 has three objects named a, b and c. And so on.

POLAR A Picture-Oriented Language for Abstract Representations

R.D. van den Bos L.M.G. Feijs Philips Research Laboratories

R.C. van Ommering Philips Centre for Software Technology

Abstract

Pictures have been used in informal specification methods for years, clarifying textual descriptions. This paper deals with the integration of pictorial representations into formal specification techniques. The use of pictures does not necessarily imply giving up formality, and this is illustrated with the pictorial language POLAR. This pictorial language shows the modular structure of possibly complex software systems. In large industrial applications of formal specification methods the manageability of complexity is a key factor, and here the automatic generation of compact, pictorial representations can be employed directly. Although the approach has been carried out for the formal specification language COLD-K, it does not depend on a particular specification language and the ideas behind it can be used for other languages as well. The pictorial language POLAR, as discussed in this paper, was found to be very useful in several large-scale case studies.


The importance and attractiveness of pictorial representations of software design structures have been widely recognized [1]. An important research topic is to link formal specification techniques [2] with pictorial representations. In this paper the pictorial language POLAR will be discussed. POLAR is a pictorial representation of the COLD language [3], which is a formal design language for sequential software systems. The syntax, well-formedness and semantics of COLD-K are formally defined [4].

Hierarchical (de)composition is an important issue related to the manageability of the software development process [5], [6]. The COLD-K approach to module composition is an algebraic one, based on Class Algebra [7], which is closely related to Module Algebra [8]. In this way modules can be constructed by algebraic operations (import, export, renaming) on other modules. The pictorial language described in this paper serves for presenting modular structures. It is comparable with the pictorial presentation of structured imperative programs by means of Nassi-Shneiderman diagrams. The advantage of our approach is that a POLAR presentation embodies a true abstraction step. It helps in mastering system complexity by providing a high-level view of the module structure.

The outline of this paper is as follows. In Section 2 we give an informal presentation of our pictorial language. Section 3 is concerned with the language COLD-K. In Section 4 we introduce the formal definition of our pictorial language - details of which have been put in an appendix. Section 5 presents the tools, our experiences and conclusions. Appendix A gives the BNF grammar of the pictorial language. Appendix B gives the details of the formal definition. Finally Appendix C presents this definition, using the pictorial language itself.


2 Informal presentation of POLAR

In this section we will give an introduction to the pictorial language POLAR, which has been developed to visualise COLD-K modularisation constructs. It can also be used for other languages that have similar modularisation facilities.

2.1 Pictorial representation of modules

Hierarchical modular structures can be represented as trees, where the leaves represent the basic modules. Arbitrarily nested structures like algebraic modularisation constructs can also be represented in a compact way by box diagrams. The idea is that box diagrams can be combined to obtain larger box diagrams in a way similar to algebraic construction. The simplest way to construct new box diagrams is by stacking two equally wide diagrams one on another. Furthermore, operators on box diagrams can be represented by shapes which are the complement of the diagrams they work on. Stacking two diagrams is in fact a special case, where the operator itself does not have a pictorial representation. This stacking of diagrams is analogous to the concatenation ab, which stands for a . b. Some possible operations on box diagrams are given in figure 1. In this figure the operators have a dark gray shade, whereas their operands are light gray. The language POLAR is based on these compact box diagram representations.

Figure 1: box diagrams

2.2 Context-free grammar of POLAR

In this section we will give the context-free grammar of (part of) the pictorial language POLAR. Pictorial grammars can be described using a slightly modified Backus-Naur Form (BNF), where a non-terminal symbol name is represented by a box containing the name, instead of the usual <name>. In this way we supply a two-dimensional shape for each non-terminal. Using this method, we can get a good impression of what the pictorial language looks like, but in some aspects it is not as precise as we would like. For example, (relative) sizes and positions within pictures cannot be given in this way. A formal approach which makes use of COLD-K will be discussed in Section 4. A complete BNF definition of POLAR can be found in Appendix A.

The context-free grammar of a part of POLAR, using this two-dimensional BNF notation, is given below. The context-free grammar of the corresponding part of COLD-K is given as well. The boxed non-terminal scheme and <scheme> are the start symbols of the POLAR grammar and the corresponding fragment of the COLD-K grammar respectively. Identifiers in POLAR have the same textual representation as in COLD-K, which is indicated by the non-terminal <identifier>.


[Pictorial (two-dimensional BNF) grammar of POLAR: productions for the boxed non-terminals scheme, abbrevs, abbrev, name and identifier; the pictures are not reproducible in text.]

<scheme> ::= CLASS <definitionset> END

| RENAME <renaming> IN <scheme>

| IMPORT <scheme>1 INTO <scheme>2

| EXPORT <signature> FROM <scheme>

| LAMBDA <name>:<scheme>1 OF <scheme>2

| APPLY <scheme>1 TO <scheme>2

| <abbrevs>; <scheme>

| <identifier>

<abbrevs> ::= <abbrev>

| <abbrev>;<abbrevs>

<abbrev> ::= LET <name> := <scheme>

<name> ::= <identifier>


3 The formal language COLD-K

POLAR is a pictorial representation of the COLD language, which is a formal design language for sequential software systems. There are several versions, but we will use the kernel language COLD-K. In the latter language one can describe all stages of the design of a software system formally. The syntax, well-formedness and semantics of the language itself are formally defined as well. It offers modularisation, abbreviation and parameterisation mechanisms. The approach to modularisation is an algebraic one, which means that new modules can be constructed by algebraic operations on modules. In COLD-K modules are called schemes. External visibility can be controlled by the export operator. Schemes can be given a name, which can be referred to by other schemes. By means of parameterisation, schemes can be made more generic, and therefore reusable.

Below we will give an overview of COLD-K modularisation constructs, followed by an example.

3.1 COLD-K structuring

In the following overview of COLD-K structuring operations, it is assumed that K and L range over schemes, whereas Σ and ρ range over signatures and renamings, respectively. X ranges over valid COLD-K identifiers. Next to the short descriptions, the corresponding POLAR representations are given.

flat scheme CLASS ... END

The basic COLD-K scheme, not containing other schemes. The contents are not represented in POLAR.

renaming RENAME ρ IN K

Renamings are useful for matching parameter restrictions and actual parameters. Here ρ describes a mapping from names to names.

import operation IMPORT K INTO L

This operation combines two COLD-K schemes.

export operation EXPORT Σ FROM K

Part of a scheme can be hidden by the export operation. The signature Σ is a list of names to be exported. The visible signature of the resulting scheme is the intersection of Σ and the signature of K.


parameterisation LAMBDA X : K OF L

COLD-K parameterisation is based on a special version of lambda calculus. A more usual notation for this construct is λX : K • L. Here X is a formal parameter of scheme L. We call K the parameter restriction of the lambda abstraction.

application APPLY K TO L

A parameterised scheme K can be instantiated by applying it to an actual parameter L. The usual lambda calculus notation is (KL). The parameter L must satisfy the specification as stated by the parameter restriction of K. One can say that the actual parameter must be an implementation of the formal one.

abbreviation LET X := K; L

X is defined as a shorthand for K. Whenever we want to use scheme K in L, it is sufficient to refer to the name X. By using this mechanism COLD-K texts become more readable and schemes can be used more than once.

reference X

Whenever X is introduced as a module name or has been bound by a LAMBDA, it can be referenced.
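The effect of the structuring operations on visible signatures can be illustrated with a deliberately simplified Python model. It is only a sketch: a scheme is reduced to the set of names it makes visible, its axioms and definitions are ignored, and the function names flat, rename, import_ and export as well as the example names ("ins", "helper", "undeclared") are hypothetical.

```python
def flat(*names):
    # CLASS ... END, reduced to the names it introduces.
    return frozenset(names)


def rename(rho, k):
    # RENAME rho IN K: rho maps names to names; other names are unchanged.
    return frozenset(rho.get(name, name) for name in k)


def import_(k, l):
    # IMPORT K INTO L: combine the two schemes.
    return k | l


def export(sigma, k):
    # EXPORT Sigma FROM K: the visible signature is the
    # intersection of Sigma and the signature of K.
    return frozenset(sigma) & k


ITEM = flat("Item")
body = import_(ITEM, flat("Set", "ins", "helper"))
visible = export({"Set", "Item", "ins", "undeclared"}, body)
windows = rename({"Item": "Window", "Set": "Windows"}, visible)
```

In this model the name helper is hidden by the export, and undeclared is dropped because it does not occur in the signature of the scheme being exported from.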


3.2 An example of COLD-K modularisation

We will give an example of the use of COLD-K modularisation constructs. To this end we show some fragments of COLD-K texts, together with their POLAR representations. Let us assume that we want to specify the functionality of a window manager. This window manager uses a set of windows, so we first specify what we understand by a window. This is done in a scheme named WINDOW. Furthermore we have a parameterised scheme SET_SPEC, defining finite sets. The scheme ITEM, containing one sort Item, is the parameter restriction of SET_SPEC. Now we can specify a set of windows by instantiating SET_SPEC with WINDOW. Because the sort Window from WINDOW must play the role of Item in SET_SPEC, we must rename Item to Window first. The scheme WINDOW_MANAGER uses this set of windows by means of the import operation.


LET WINDOW :=

CLASS

SORT Window

{ Operation definitions for Window }

{ Window Creation }

{ Window Modification }

{ Button Manipulation } { etc. }

END;

LET ITEM :=

CLASS

SORT Item

END;

LET SET_SPEC :=

LAMBDA X:ITEM OF

EXPORT

SORT Set,

SORT Item

FROM

IMPORT X INTO IMPORT NAT_SPEC INTO

CLASS

SORT Set

{ List of definitions for Set of Items }

END;

LET WINDOW_MANAGER :=

IMPORT APPLY RENAME

SORT Item TO Window,

SORT Set TO Windows

IN SET_SPEC TO WINDOW INTO

CLASS

{ List of definitions }


END;


4 Formal definition of POLAR

4.1 Introduction

As we saw in 2.2, a pictorial, two-dimensional syntax cannot be described in BNF as precisely as we would like. POLAR is a visual representation of COLD-K, a language with formal syntax and semantics, so it is desirable to define POLAR in a formal way too. We assume that the reader is familiar with COLD-K, as we will use it for our formal specification of POLAR. Note that COLD-K plays a double role: it is the target language as well as the meta-language.

Since POLAR is a pictorial language, the formal definition starts with formalising the notion of pictures. The concept of bounding boxes plays a central role in the formal definition. A simple example language will be introduced, which will be used throughout this section to explain the method of defining the syntax and semantics of POLAR. In addition to the context-free grammar, we will give well-formedness rules, putting constraints on the layout, thus defining a subset containing nice-looking pictures. This is comparable with textual grammars, where syntactical correctness does not depend on textual layout. In the definition of the semantics, we will use techniques from denotational semantics. In fact our approach is relational in the sense that we have a 'meaning predicate' rather than a 'meaning function'.

The outline of the remainder of this section is as follows. In Section 4.2 and Section 4.3 we develop a simple theory concerning pictures and their bounding boxes. The Sections 4.4 to 4.8 present the essential aspects of the formal definition using simplified toy languages and using a combination of COLD-K and mathematical notation - by way of syntactic sugar. The complete, syntax-checked formal definition of POLAR using COLD-K is given in an appendix.

4.2 Pictures

A picture can be represented mathematically by a partial function on (x, y) pairs of real numbers. Let $\mathbb{R}$ be the set of real numbers and $C$ a set of colours. Every picture $p$ is defined by a function $c_p : D_p \to C$, where $D_p \subseteq \mathbb{R} \times \mathbb{R}$. The function $c_p$ is undefined for all (x, y) pairs outside $D_p$. The domain $D_p$ defines the shape of a picture $p$, whereas the function $c_p$ defines its colour at each point. In this view only points in the domain $D_p$ belong to the picture. One could also say that all points (x, y) belong to a picture, points within $D_p$ being 'opaque' and all other points 'transparent'. In the formal COLD-K definition the latter view will be used. We formally introduce the sort of pictures in our COLD-K specification:

SORT Picture

Two pictures $p_1$, $p_2$ are equal iff they have the same shape and the same colour (at all points), which is defined by the following equation:

$p_1 = p_2 \Leftrightarrow D_{p_1} = D_{p_2} \land \forall (x, y) \in D_{p_1}\; (c_{p_1}(x, y) = c_{p_2}(x, y))$

The merge operation puts two pictures together to form a new, usually more complex one:

FUNC merge: Picture # Picture -> Picture

Merging pictures can be compared to concatenating strings, with the main difference that the latter is order-preserving, whereas the former is not. Pictures are considered to be locally opaque, so when we merge two pictures, one picture comes in front of the other and we cannot look through the foremost one. The following equations formalise the merge operation for the situation where $p_3 = merge(p_1, p_2)$:

$D_{p_3} = D_{p_1} \cup D_{p_2}$

$\forall (x, y) \in D_{p_1}\; (c_{p_3}(x, y) = c_{p_1}(x, y))$

$\forall (x, y) \in D_{p_2} \setminus D_{p_1}\; (c_{p_3}(x, y) = c_{p_2}(x, y))$

The function merge is associative:

$merge(p_1, merge(p_2, p_3)) = merge(merge(p_1, p_2), p_3)$

However, merge is commutative only in the case where the overlapping parts of $p_1$ and $p_2$ are equal, that is, where $c_{p_1}$ and $c_{p_2}$ agree on $D_{p_1} \cap D_{p_2}$.

Essentially this is the content of the COLD-K scheme PICTURES, defined in appendix B. This scheme is parameterised over colours, to keep it generic, although for POLAR only black-and-white pictures are used. Therefore in the POLAR definition the scheme PICTURES is instantiated by the scheme BLACK_WHITE, where there are precisely two distinct values of the sort Colour.
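The merge equations can be tried out on a small discrete model. The following Python sketch is an illustration under assumptions: a picture is a dictionary from finitely many (x, y) points to colours (rather than a partial function on the real plane), the keys of the dictionary play the role of the domain of the picture, and the first argument of merge is taken to be the picture in front.

```python
def merge(p1, p2):
    """Merge two pictures; where both are opaque, p1 hides p2."""
    p3 = dict(p2)   # start with the picture at the back
    p3.update(p1)   # points of p1 take the colour of p1
    return p3


a = {(0, 0): "black", (1, 0): "black"}
b = {(1, 0): "white", (2, 0): "white"}
c = {(2, 0): "black"}

# Associativity holds for arbitrary pictures.
left = merge(a, merge(b, c))
right = merge(merge(a, b), c)

# a and b disagree at (1, 0), so the order of merging matters here.
front_a = merge(a, b)
front_b = merge(b, a)
```

Here left equals right, whereas front_a and front_b differ at the point (1, 0) where the two pictures overlap with different colours.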

4.3 Bounding boxes

Pictures as defined in the COLD-K scheme PICTURES can still have all possible shapes. To simplify the notion of positions and sizes of pictures it would be nice to deal with rectangular shapes only. We define therefore a bounding box of a picture as the smallest of the class of enclosing boxes of that picture. An enclosing box of a picture p is a kind of virtual box with the property that all points outside are transparent in p. In our COLD-K definition we introduce the sort of boxes first:

SORT Box

Boxes can be defined as 4-tuples (w, h, x, y), where w and h stand for the width and height of the box respectively. We define a box to be valid iff both w and h are non-negative real numbers. The x and y components are interpreted as the coordinate pair (x, y) of the bottom-left corner of the box in the $\mathbb{R} \times \mathbb{R}$ plane.

Next we introduce a predicate ebox: Box # Picture. This defines the class of enclosing boxes for a given picture p, as discussed above. We assume that outside defines which coordinate pairs are outside a given Box and that opaque defines the opacity of a given Picture.

PRED ebox: Box # Picture

PAR b: Box, p: Picture

DEF valid(b) AND

FORALL x: Real, y: Real ( outside(b, x, y) => NOT opaque(p, x, y) )

Finally the function bbox: Picture -> Box is defined, yielding the smallest box satisfying the predicate ebox. The predicates higher and wider indicate whether the first bounding box is higher or wider, respectively, than the second one.


FUNC bbox: Picture -> Box

PAR p: Picture

DEF SOME b: Box (

ebox(b, p) AND FORALL c: Box (

ebox(c, p) AND NOT b = c => higher(c, b) OR wider(c, b) ) )

The size and position of a picture can thus be defined as the size and position of its bounding box. Now we can say, for example, that two pictures are on top of each other if their bounding boxes are.
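For a picture with finitely many opaque points, the enclosing-box predicate and the bounding box can be computed directly. The Python sketch below is an illustration only and rests on assumptions: a picture is a dictionary whose keys are its opaque (x, y) points, points on the border of a box are not considered outside it, and the empty picture, for which no smallest box exists in this model, is rejected.

```python
from typing import NamedTuple


class Box(NamedTuple):
    w: float  # width
    h: float  # height
    x: float  # x coordinate of the bottom-left corner
    y: float  # y coordinate of the bottom-left corner


def valid(b):
    return b.w >= 0 and b.h >= 0


def outside(b, x, y):
    return x < b.x or x > b.x + b.w or y < b.y or y > b.y + b.h


def ebox(b, p):
    # b encloses p iff b is valid and no opaque point of p lies outside b.
    return valid(b) and all(not outside(b, x, y) for (x, y) in p)


def bbox(p):
    # The smallest enclosing box of a non-empty finite picture.
    if not p:
        raise ValueError("bounding box of the empty picture is not modelled")
    xs = [x for (x, _) in p]
    ys = [y for (_, y) in p]
    return Box(max(xs) - min(xs), max(ys) - min(ys), min(xs), min(ys))
```

Any other enclosing box of the same picture is then at least as wide and at least as high as the box returned by bbox.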

4.4 A simple language

We will explain the method used to define the formal syntax and semantics of POLAR with the help of a simple example language. The grammar of this example language, which we call TROPICAL, has the boxed non-terminal module as its start symbol. The syntax is as follows:

[Pictorial grammar of TROPICAL: productions for the boxed non-terminal module; the two-dimensional right-hand sides are not reproducible in text.]

The language TROPICAL is a pictorial representation of another toy language called WARM, to be defined below. Modularisation in WARM is done in an algebraic way, similar to the COLD-K approach. The abstract syntax of the part of the language, dealing with modularisation is as follows:

<module> ::= <basic> | import(<module>, <module>)

<basic> ::= basic(<definitions>)

In 4.8 we will define the semantics of TROPICAL in terms of WARM. Note that WARM plays the role of COLD-K as target language, whereas TROPICAL is the counterpart of POLAR.

4.5 Terminal symbols

In the definition of the context-free grammar of POLAR we will abstract from the precise representations of the terminal symbols. There are two reasons for this: we want to leave some freedom in the actual representations and besides that, it is almost impossible to define them with the right measure of strictness. Therefore, we do not define the representation of the terminal symbols formally at all, but state which terminals must be present. We will allow some extensions to COLD-K (in its role as a meta-language), using pictorial names for predicates. In this way the names of the predicates give a strong suggestion for the representation of the symbols - which is not formal. Examples of pictorial predicate names are:

PRED [Γ, drawn inside a dashed bounding box] : Picture

PRED □ : Picture

The first predicate is supposed to be true for pictures that represent the Greek letter Γ, surrounded by some white space. The dashed box is not part of the picture, but indicates the bounding box. The second predicate defines pictures representing boxes. In this case the bounding box is not given explicitly, but it is assumed that there is no white space outside the black border.

The scheme describing the terminal symbols of our example language TROPICAL is defined by SYMBOLS, where we assume that the theory concerning pictures and their bounding boxes developed in Section 4.2 and Section 4.3, has been put in BBOX_PICTURES.

LET SYMBOLS := IMPORT (APPLY BBOX_PICTURES TO BLACK_WHITE) INTO CLASS

PRED □ : Picture

END;

4.6 Context-free grammar

In addition to terminal symbols, a context-free grammar contains a set of non-terminal symbols. These non-terminal symbols are inductively defined by predicates. An analogous approach to describing non-terminals by predicates is discussed in [9].

In the formal definition of the context-free grammar of TROPICAL we abstract from the representation of the terminal symbols by means of the COLD-K parameterisation mechanism. We use the scheme SYMBOLS, as discussed in the previous section, as a parameter restriction. The predicate basic: Picture defines the representation of a basic module. An auxiliary predicate import: Picture # Picture defines the preconditions under which two pictures can be merged into an import picture. The predicates over and ontop define the relative positions of two pictures. See appendix B for their definitions. The predicate module: Picture defines inductively the class of module pictures.

LET TROPICAL_CFG := LAMBDA X: SYMBOLS OF IMPORT X INTO CLASS

PRED basic: Picture

PAR p: Picture

DEF □(p)

PRED import: Picture # Picture

PAR p1: Picture, p2: Picture

DEF module(p1) AND module(p2) AND over(p1, p2) AND ontop(p1, p2)

PRED module: Picture

IND FORALL p: Picture (

basic(p) => module(p) );

FORALL p1: Picture, p2: Picture (

import(p1, p2) => module(merge(p1, p2)) )

END;

The definition of the context-free grammar still depends on the actual representation of the terminal symbols. We will give a possible COLD-K definition of that representation, which is in fact an implementation of SYMBOLS. By overloading there are two predicates called inside. The first inside serves to compare two boxes, whereas the second inside checks whether a given box contains a given coordinate pair. Similarly the predicate outside checks whether a given box does not contain a given coordinate pair. Finally we use an obvious on predicate. They are defined in Appendix B.

LET MY_SYMBOLS := IMPORT (APPLY BBOX_PICTURES TO BLACK_WHITE) INTO
                  IMPORT REAL_SPEC INTO
CLASS

  PRED <symbol>: Picture
  PAR p: Picture
  DEF EXISTS b: Box (
        inside(b, bbox(p));
        FORALL x: Real, y: Real (
          inside(bbox(p), x, y) OR on(bbox(p), x, y) =>
            ( outside(b, x, y) <=> colour(p, x, y) = black;
              NOT colour(p, x, y) = black <=> colour(p, x, y) = white ) ) )

END;

By application we get a concrete syntax based on our definitions for the representations of the terminal symbols:


LET MY_CFG := APPLY TROPICAL_CFG TO MY_SYMBOLS;

4.7 Well-formedness

The set of well-formed pictures is a subset of the set of pictures generated by the context-free grammar. The predicates defining well-formed pictures are stronger than those defining syntactically correct pictures. We will require, for instance, two module pictures to be equally wide when stacking them. The COLD-K definition of well-formedness for our example language TROPICAL is as follows, assuming that samewidth is a predicate stating whether two pictures (i.e. their bounding boxes) have the same width:

LET TROPICAL_WF := LAMBDA X: SYMBOLS OF IMPORT (APPLY TROPICAL_CFG TO X) INTO CLASS

  PRED is_wf: Picture
  IND FORALL p: Picture (
        basic(p) => is_wf(p) );
      FORALL p1: Picture, p2: Picture (
        import(p1, p2) AND is_wf(p1) AND is_wf(p2) AND samewidth(p1, p2)
        => is_wf(merge(p1, p2)) )

END;
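The difference between syntactic correctness and well-formedness can be sketched in Python as follows. This is only an illustration under stated assumptions: a picture is represented by its derivation tree, built from the hypothetical constructors Basic and Stack (the latter standing for merge(p1, p2) with import(p1, p2)), so every tree is syntactically correct by construction and is_wf only adds the samewidth requirement.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Basic:
    """A basic module picture, reduced to the size of its bounding box."""
    width: int
    height: int

@dataclass(frozen=True)
class Stack:
    """Stands for merge(top, bottom) where import(top, bottom) holds."""
    top: "Picture"
    bottom: "Picture"

Picture = Union[Basic, Stack]

def width(p):
    """Width of the bounding box of p."""
    if isinstance(p, Basic):
        return p.width
    return max(width(p.top), width(p.bottom))

def is_wf(p):
    """Mirror of the two inductive clauses of is_wf."""
    if isinstance(p, Basic):
        return True
    return (is_wf(p.top) and is_wf(p.bottom)
            and width(p.top) == width(p.bottom))
```

A stack of a box of width 4 on a box of width 3 is accepted by the grammar but rejected by is_wf, and the rejection propagates to any larger stack that contains it.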

4.8 Semantics

The semantics of POLAR with respect to COLD-K are defined in the form of relational semantics by means of predicates. This method has the advantage that loss of information by visual abstraction can be described elegantly, as we shall see. Continuing our example, we shall define the semantics of TROPICAL in terms of its target language WARM. As we are only interested in the fragment of the language dealing with the modular structure and not in the contents of basic modules, we abstract from the non-terminal <definitions>. We model this non-terminal symbol as a primitive sort in the scheme PRIM, which will be used as a parameter restriction for the context-free grammar of WARM.

LET PRIM := CLASS

SORT Definitions

END;

The scheme WARM_CFG describes the abstract syntax of the target language WARM. The sort Module is defined inductively and consists of basic modules and modules which are the result of imports. To keep things simple, we have omitted the 'no confusion' axioms that are needed here (we did the same in Appendix B). It is a routine matter to add them.


LET WARM_CFG := LAMBDA X: PRIM OF IMPORT X INTO
CLASS

  SORT Module

  FUNC basic: Definitions -> Module
  FUNC import: Module # Module -> Module

  AXIOM FORALL d: Definitions ( basic(d)! );
        FORALL m1: Module, m2: Module ( import(m1, m2)! )

  PRED is_gen: Module
  IND FORALL d: Definitions (
        is_gen(basic(d)) );
      FORALL m1: Module, m2: Module (
        is_gen(m1) AND is_gen(m2) => is_gen(import(m1, m2)) )

  AXIOM FORALL m: Module ( is_gen(m) )

END;

The relational semantics of the pictorial language TROPICAL in terms of the target language WARM can be described in COLD-K as follows:

LET SEMANTICS := LAMBDA X: SYMBOLS OF
                 LAMBDA Y: PRIM OF
                 IMPORT (APPLY TROPICAL_CFG TO X) INTO
                 IMPORT (APPLY WARM_CFG TO Y) INTO
CLASS

  PRED sem: Picture # Module
  IND FORALL p: Picture, d: Definitions (
        basic(p) => sem(p, basic(d)) );
      FORALL p1: Picture, p2: Picture, m1: Module, m2: Module (
        import(p1, p2) AND sem(p1, m1) AND sem(p2, m2)
        => sem(merge(p1, p2), import(m1, m2)) )

END;

A basic module picture in TROPICAL has the semantics of any basic module in the target language WARM: there is loss of information in the visual representation of the target language. By means of relational semantics it is also possible to describe certain ambiguities in the visual representation. An example is the stacking of three boxes. Suppose that the following three relations exist:

sem([box a], A)    sem([box b], B)    sem([box c], C)

then the following two relations can be derived:

sem([boxes a, b, c stacked], import(A, import(B, C)))

sem([boxes a, b, c stacked], import(import(A, B), C))

In the pictorial representation an ambiguity is introduced, which is reflected in the semantics relation. Fortunately, in this case the ambiguity of stacking boxes does no harm since the import operation is associative.
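The way the relation sem assigns several target terms to one stacked picture can be reproduced with a short Python sketch. It is an illustration only: a stack is represented, as an assumption, by the tuple of meanings already assigned to its basic boxes (top to bottom), import terms are plain tuples, and the helper flatten is hypothetical.

```python
def sem(stack):
    """All target terms related to a vertical stack of boxes.

    Every way of cutting the stack into an upper and a lower part
    corresponds to one use of the inductive import clause.
    """
    if len(stack) == 1:
        return {stack[0]}
    terms = set()
    for k in range(1, len(stack)):
        for m1 in sem(stack[:k]):
            for m2 in sem(stack[k:]):
                terms.add(("import", m1, m2))
    return terms

def flatten(term):
    """Forget the bracketing of an import term."""
    if isinstance(term, str):
        return (term,)
    _, m1, m2 = term
    return flatten(m1) + flatten(m2)
```

For the stack ("A", "B", "C") the function returns exactly the two terms import(A, import(B, C)) and import(import(A, B), C). Both flatten to the same sequence, which is why the ambiguity is harmless when import is associative.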

5 Tools, experiences and conclusions

An interactive tool generating POLAR pictures from COLD-K texts has been developed in the framework of the Integrated COLD Environment (ICE). It is implemented in POP-11 using the POPLOG environment, running on a SUN 3/50 workstation. See Appendix C for a POLAR version of the formal POLAR specification, generated by this tool. A prototype of an interactive editor combining POLAR pictures and COLD-K texts has been made using the GENESIS environment by L. Helmink and M. van Tien.

The pictorial representation by POLAR has been used to represent the modular structure of COLD-K texts of up to 5000 lines and it has proven to be helpful to users. The syntax and semantics of POLAR have been defined formally, using the formal design language COLD-K as meta-language. We consider this method of defining pictorial syntax formally to be generic, so it should be fairly straightforward to define other pictorial languages in the same way. POLAR is a language for visualising the module structure of descriptions written in the formal design language COLD-K. We expect, however, that with minor adaptations it will also work for algebraic specification languages such as CLEAR [10]. It certainly also applies to Middelburg's VVSL, where the COLD-K modularisation and parameterisation constructs have been put on top of VDM [11].

References

[1] D. Harel. On visual formalisms. Communications of the ACM, Vol. 31, no. 5 (1988).

[2] D. Bjørner, C.B. Jones (eds.). The Vienna Development Method: the meta-language. Lecture Notes in Computer Science 61, Springer-Verlag (1978).

[3] H.B.M. Jonkers. An Introduction to COLD-K, in: M. Wirsing, J.A. Bergstra (eds.), Algebraic Methods: Theory, Tools and Applications, LNCS 394, Springer-Verlag, pp. 139-205 (1989).

[4] L.M.G. Feijs, H.B.M. Jonkers, C.P.J. Koymans, G.R. Renardel de Lavalette. Formal Definition of the Design Language COLD-K, ESPRIT document METEOR/T7/PRLE/7 (rev. edition 1989).


[5] D.L. Parnas. On the Criteria to be used in decomposing systems into modules. CACM 15 (Dec 1972), 840-841.

[6] H. Ehrig, H. Weber. Programming in the Large with Algebraic Module Specifications. Information Processing 86, H.-J. Kugler (ed.), Elsevier Science Publishers B.V. (North-Holland).

[7] L.M.G. Feijs, H.B.M. Jonkers, J.H. Obbink, C.P.J. Koymans, G.R. Renardel de Lavalette, P.H. Rodenburg. A survey of the design language COLD. ESPRIT '86, Status Report of Continuing Work, The Commission of the European Communities (Editors), Elsevier Science Publishers B.V. (North-Holland), 631-644.

[8] J.A. Bergstra, J. Heering, P. Klint. Module algebra. JACM Vol. 37 (1990) 335-372.

[9] E.W. Dijkstra. Formal Techniques and Sizeable Programs. Selected Writings on Computing: A Personal Perspective, Springer-Verlag, ISBN 0 387 906525.

[10] R.M. Burstall, J.A. Goguen. An informal introduction to specifications using CLEAR, in: R. Boyer and J. Moore (eds.) The correctness problem in computer science, Academic Press, ISBN 0-12-122920-3 (1981).

[11] C.A. Middelburg. The VIP VDM specification language, in: R. Bloomfield, L. Marshall, R. Jones (eds.) VDM '88, VDM - the way ahead, pp. 187-201, Springer-Verlag LNCS 328 (1988).


A POLAR in BNF

[Pictorial BNF productions; the pictures are not recoverable from the source. The productions involve the non-terminals component, scheme, named_scheme_seq, named_scheme, name, renaming, signature and item.]

B POLAR in COLD

DESIGN

#include REAL_SPEC
#include CHAR
#include QUADRUPLE
#include SEQ_SPEC

LET COLOUR := CLASS SORT Colour FREE END;

LET PICTURES :=
  LAMBDA X: COLOUR OF
  IMPORT X INTO
  IMPORT REAL_SPEC INTO
  CLASS

  SORT Picture

  PRED opaque: Picture # Real # Real

  FUNC colour: Picture # Real # Real -> Colour

  AXIOM FORALL p: Picture, x: Real, y: Real (
          opaque(p, x, y) <=> colour(p, x, y)! )

  AXIOM FORALL p: Picture ( EXISTS x: Real, y: Real ( opaque(p, x, y) ) )

  AXIOM FORALL p1: Picture, p2: Picture (
          FORALL x: Real, y: Real (
            opaque(p1, x, y) <=> opaque(p2, x, y);
            opaque(p1, x, y) => colour(p1, x, y) = colour(p2, x, y) )
          => p1 = p2 )

  FUNC merge: Picture # Picture -> Picture
  PAR p1: Picture, p2: Picture
  DEF SOME p: Picture (
        FORALL x: Real, y: Real (
          opaque(p1, x, y) OR opaque(p2, x, y) <=> opaque(p, x, y);
          opaque(p1, x, y) => colour(p, x, y) = colour(p1, x, y);
          opaque(p2, x, y) AND NOT opaque(p1, x, y)
            => colour(p, x, y) = colour(p2, x, y) ) )

  FUNC merge: Picture # Picture # Picture -> Picture
  PAR p1: Picture, p2: Picture, p3: Picture
  DEF merge(p1, merge(p2, p3))

  FUNC merge: Picture # Picture # Picture # Picture -> Picture
  PAR p1: Picture, p2: Picture, p3: Picture, p4: Picture
  DEF merge(p1, merge(p2, p3, p4))

  FUNC merge: Picture # Picture # Picture # Picture # Picture -> Picture
  PAR p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture
  DEF merge(p1, merge(p2, p3, p4, p5))

  FUNC merge: Picture # Picture # Picture # Picture # Picture # Picture -> Picture
  PAR p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture, p6: Picture
  DEF merge(p1, merge(p2, p3, p4, p5, p6))

END;
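The behaviour specified for merge (the first picture takes precedence wherever both are opaque, and the n-ary variants nest to the right) can be mimicked by a small Python sketch. It is not part of the specification: as an assumption, a picture is discretised into a dict from opaque grid points to colours, and merge_all is a hypothetical name for the overloaded n-ary merge.

```python
def merge(p1, p2):
    """Binary merge: opaque where p1 or p2 is opaque, p1's colour wins."""
    merged = dict(p2)
    merged.update(p1)  # points opaque in p1 override those of p2
    return merged

def merge_all(*pictures):
    """n-ary merge, defined as merge(p1, merge(p2, ..., pn))."""
    if len(pictures) == 1:
        return dict(pictures[0])
    return merge(pictures[0], merge_all(*pictures[1:]))
```

Note that merge is not commutative in this model: merge(p1, p2) and merge(p2, p1) differ at points where both pictures are opaque with different colours.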

LET BOX :=
  IMPORT APPLY APPLY APPLY APPLY RENAME
      SORT Quadruple TO Box,
      SORT Item1 TO Real,
      SORT Item2 TO Real,
      SORT Item3 TO Real,
      SORT Item4 TO Real,
      FUNC quadruple: Item1 # Item2 # Item3 # Item4 -> Quadruple TO box,
      FUNC proj1: Quadruple -> Item1 TO w,
      FUNC proj2: Quadruple -> Item2 TO h,
      FUNC proj3: Quadruple -> Item3 TO x,
      FUNC proj4: Quadruple -> Item4 TO y
    IN QUADRUPLE TO REAL_SPEC TO REAL_SPEC TO REAL_SPEC TO REAL_SPEC INTO
  IMPORT REAL_SPEC INTO
  CLASS

  % Validity of Boxes

  PRED valid: Box
  PAR b: Box
  DEF geq(w(b), 0) AND geq(h(b), 0)

  % Position of (x, y) pairs in relation to Boxes

  PRED inside: Box # Real # Real
  PAR b: Box, x: Real, y: Real
  DEF valid(b) AND
      gtr(x, x(b)) AND gtr(add(x(b), w(b)), x) AND
      gtr(y, y(b)) AND gtr(add(y(b), h(b)), y)

  PRED outside: Box # Real # Real
  PAR b: Box, x: Real, y: Real
  DEF valid(b) AND
      gtr(x(b), x) OR gtr(x, add(x(b), w(b))) OR
      gtr(y(b), y) OR gtr(y, add(y(b), h(b)))

  PRED on: Box # Real # Real
  PAR b: Box, x: Real, y: Real
  DEF valid(b) AND
      NOT ( inside(b, x, y) OR outside(b, x, y) )

  % Comparison between sizes of two Bounding Boxes

  PRED sameheight: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND h(b1) = h(b2)

  PRED samewidth: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND w(b1) = w(b2)

  PRED wider: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND gtr(w(b1), w(b2))

  PRED higher: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND gtr(h(b1), h(b2))

  % Comparison between positions of two Bounding Boxes

  PRED above: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      gtr(y(b1), add(y(b2), h(b2)))

  PRED ontop: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND y(b1) = add(y(b2), h(b2))

  PRED v_partover: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      gtr(y(b1), y(b2)) AND gtr(add(y(b2), h(b2)), y(b1)) AND
      gtr(add(y(b1), h(b1)), add(y(b2), h(b2)))

  PRED v_overlap: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      geq(y(b2), y(b1)) AND geq(add(y(b1), h(b1)), add(y(b2), h(b2)))

  PRED rightof: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND gtr(x(b1), add(x(b2), w(b2)))

  PRED against: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND x(b1) = add(x(b2), w(b2))

  PRED h_partover: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      gtr(x(b1), x(b2)) AND gtr(add(x(b2), w(b2)), x(b1)) AND
      gtr(add(x(b1), w(b1)), add(x(b2), w(b2)))

  PRED h_overlap: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      geq(x(b2), x(b1)) AND geq(add(x(b1), w(b1)), add(x(b2), w(b2)))

  PRED inside: Box # Box
  PAR b1: Box, b2: Box
  DEF valid(b1) AND valid(b2) AND
      gtr(x(b1), x(b2)) AND gtr(y(b1), y(b2)) AND
      gtr(add(x(b2), w(b2)), add(x(b1), w(b1))) AND
      gtr(add(y(b2), h(b2)), add(y(b1), h(b1)))

END;
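The three point predicates of BOX partition the plane for every valid box, which a short Python sketch can make concrete. It is an illustration under stated assumptions: the field order (w, h, x, y) follows the renaming of proj1 to proj4 above, and the disjunction in outside is read as being guarded as a whole by valid(b), which the listing does not state explicitly.

```python
from typing import NamedTuple

class Box(NamedTuple):
    w: float  # width
    h: float  # height
    x: float  # x coordinate of the lower-left corner
    y: float  # y coordinate of the lower-left corner

def valid(b):
    return b.w >= 0 and b.h >= 0

def inside(b, x, y):
    """(x, y) lies strictly within the box."""
    return valid(b) and b.x < x < b.x + b.w and b.y < y < b.y + b.h

def outside(b, x, y):
    """(x, y) lies strictly beyond one of the four edges (assumed grouping)."""
    return valid(b) and (x < b.x or x > b.x + b.w or y < b.y or y > b.y + b.h)

def on(b, x, y):
    """(x, y) is neither inside nor outside, i.e. on the border."""
    return valid(b) and not (inside(b, x, y) or outside(b, x, y))
```

For a valid box exactly one of inside, on and outside holds at any point; for an invalid box (negative width or height) none of them holds in this reading.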

LET BBOX_PICTURES :=
  LAMBDA X: COLOUR OF
  IMPORT APPLY PICTURES TO X INTO
  IMPORT BOX INTO
  IMPORT REAL_SPEC INTO
  CLASS

  PRED ebox: Box # Picture
  PAR b: Box, p: Picture
  DEF valid(b) AND
      FORALL x: Real, y: Real ( outside(b, x, y) => NOT opaque(p, x, y) )

  FUNC bbox: Picture -> Box
  PAR p: Picture
  DEF SOME b: Box (
        ebox(b, p) AND
        FORALL c: Box (
          ebox(c, p) AND NOT b = c => higher(c, b) OR wider(c, b) ) )

  PRED sameheight: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF sameheight(bbox(p1), bbox(p2))

  PRED samewidth: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF samewidth(bbox(p1), bbox(p2))

  PRED higher: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF higher(bbox(p1), bbox(p2))

  PRED wider: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF wider(bbox(p1), bbox(p2))

  PRED above: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF above(bbox(p1), bbox(p2))

  PRED ontop: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF ontop(bbox(p1), bbox(p2))

  PRED v_partover: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF v_partover(bbox(p1), bbox(p2))

  PRED v_overlap: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF v_overlap(bbox(p1), bbox(p2))

  PRED rightof: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF rightof(bbox(p1), bbox(p2))

  PRED against: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF against(bbox(p1), bbox(p2))

  PRED h_partover: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF h_partover(bbox(p1), bbox(p2))

  PRED h_overlap: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF h_overlap(bbox(p1), bbox(p2))

  PRED inside: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF inside(bbox(p1), bbox(p2))

  PRED over: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF ( above(p1, p2) OR ontop(p1, p2) )
      AND ( h_overlap(p1, p2) OR h_overlap(p2, p1) )

  PRED nextto: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF ( rightof(p1, p2) OR against(p1, p2) )
      AND ( v_overlap(p1, p2) OR v_overlap(p2, p1) )

  PRED covers: Picture # Picture
  PAR p1: Picture, p2: Picture
  DEF bbox(p1) = bbox(p2)

END;

LET BLACK_WHITE :=
  CLASS

  SORT Colour

  FUNC black: -> Colour
  FUNC white: -> Colour

  AXIOM black!; white!

  AXIOM NOT ( black = white )

  PRED is_gen: Colour
  IND is_gen(black); is_gen(white)

  AXIOM FORALL c: Colour ( is_gen(c) )

END;

LET STRING :=
  APPLY RENAME SORT Item TO Char, SORT Seq TO String
        IN SEQ_SPEC TO CHAR;

LET IDENTIFIER :=
  IMPORT STRING INTO
  CLASS

  PRED identifier: String

END;

LET BW_BBOX_PICTURES :=
  APPLY BBOX_PICTURES TO BLACK_WHITE;

LET SYMBOLS :=
  IMPORT STRING INTO
  IMPORT BW_BBOX_PICTURES INTO
  CLASS

  % [One predicate of the form PRED <symbol>: Picture is declared here for each
  %  pictorial terminal symbol of POLAR; the symbol pictures are not recoverable
  %  from the source.]

  PRED string: Picture # String

END;

LET POLAR_CFG :=
  LAMBDA X: SYMBOLS OF
  LAMBDA Y: IDENTIFIER OF
  IMPORT X INTO
  IMPORT Y INTO
  IMPORT APPLY RENAME SORT Item TO Picture, SORT Seq TO PictureList
         IN SEQ_SPEC TO BW_BBOX_PICTURES INTO
  CLASS

  % Rectangles

  PRED rectangle: Picture
  PAR p: Picture
  DEF <symbol>(p)
      OR <symbol>(p)
      OR <symbol>(p)

  % Names

  PRED name: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture, s: String (
        string(p1, s) AND identifier(s)
        AND <symbol>(p2) AND <symbol>(p3)
        AND inside(p1, p2) AND inside(p2, p3)
        => name(p1, p2, p3) )

  PRED name: Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        name(p1, p2, p3) => name(merge(p1, p2, p3)) )

  % Scheme Language

  % - Schemes

  PRED class: Picture
  PAR p: Picture
  DEF <symbol>(p)

  % - Renaming

  PRED renaming: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        renaming(p1) AND scheme(p2) AND <symbol>(p3)
        AND inside(p1, p3) AND nextto(p3, p2) AND against(p3, p2)
        => renaming(p1, p2, p3) )

  % - Import

  PRED import: Picture # Picture
  IND FORALL p1: Picture, p2: Picture (
        scheme(p1) AND scheme(p2) AND over(p1, p2) AND ontop(p1, p2)
        => import(p1, p2) )

  % - Export

  PRED export: Picture # Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        signature(p1) AND scheme(p2)
        AND <symbol>(p3) AND rectangle(p4)
        AND inside(p1, p3) AND over(p3, p2) AND above(p3, p2)
        AND inside(merge(p3, p2), p4)
        => export(p1, p2, p3, p4) )

  % - Abstraction

  PRED abstraction: Picture # Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        name(p1) AND scheme(p2) AND scheme(p3) AND rectangle(p4)
        AND nextto(p1, p2) AND against(p1, p2)
        AND h_partover(p3, merge(p1, p2)) AND above(merge(p1, p2), p3)
        AND covers(p4, merge(p1, p2, p3))
        => abstraction(p1, p2, p3, p4) )

  % - Application

  PRED application: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        scheme(p1) AND scheme(p2) AND rectangle(p3)
        AND h_partover(p1, p2) AND above(p1, p2)
        AND covers(p3, merge(p1, p2))
        => application(p1, p2, p3) )

  % - Abbreviation

  PRED abbreviation: Picture # Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        named_scheme_seq(p1) AND scheme(p2)
        AND <symbol>(p3) AND <symbol>(p4)
        AND inside(p1, p3) AND inside(p2, p4)
        AND over(p3, p4) AND ontop(p3, p4)
        => abbreviation(p1, p2, p3, p4) )

  % - Reference

  PRED reference: Picture # Picture
  IND FORALL p1: Picture, p2: Picture, s: String (
        string(p1, s) AND identifier(s) AND <symbol>(p2)
        AND inside(p1, p2)
        => reference(p1, p2) )

  % - Schemes

  PRED scheme: Picture
  IND FORALL p: Picture (
        class(p) => scheme(p) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        renaming(p1, p2, p3) => scheme(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture (
        import(p1, p2) => scheme(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        export(p1, p2, p3, p4) => scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        abstraction(p1, p2, p3, p4) => scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        application(p1, p2, p3) => scheme(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        abbreviation(p1, p2, p3, p4) => scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture (
        reference(p1, p2) => scheme(merge(p1, p2)) )

  % Named Schemes

  PRED named_scheme: Picture # Picture
  IND FORALL p1: Picture, p2: Picture (
        name(p1) AND scheme(p2) AND over(p1, p2) AND ontop(p1, p2)
        => named_scheme(p1, p2) )

  PRED named_scheme: Picture
  IND FORALL p1: Picture, p2: Picture (
        named_scheme(p1, p2) => named_scheme(merge(p1, p2)) )

  PRED named_scheme_seq: Picture # Picture
  IND FORALL p1: Picture, p2: Picture (
        named_scheme(p1) AND named_scheme_seq(p2) AND nextto(p2, p1)
        => named_scheme_seq(p1, p2) )

  PRED named_scheme_seq: Picture
  IND FORALL p: Picture (
        named_scheme(p) => named_scheme_seq(p) );
      FORALL p1: Picture, p2: Picture (
        named_scheme_seq(p1, p2) => named_scheme_seq(merge(p1, p2)) )

  % - Renamings

  % - Constant

  PRED rho: Picture
  PAR p: Picture
  DEF <symbol>(p)

  % - Composition

  PRED composition: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        renaming(p1) AND renaming(p2) AND <symbol>(p3)
        AND nextto(p3, p1) AND nextto(p2, p3)
        => composition(p1, p2, p3) )

  % - Renamings

  PRED renaming: Picture
  IND FORALL p: Picture (
        rho(p) => renaming(p) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        composition(p1, p2, p3) => renaming(merge(p1, p2, p3)) )

  % - Items

  PRED item: Picture
  PAR p: Picture
  DEF <symbol>(p)

  % - Signatures

  % - Constant Signature

  PRED sigma: Picture
  PAR p: Picture
  DEF <symbol>(p)

  % - Renaming

  PRED renaming: Picture # Picture
  IND FORALL p1: Picture, p2: Picture (
        renaming(p1) AND signature(p2) AND nextto(p1, p2)
        => renaming(p1, p2) )

  % - Union

  PRED union: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        signature(p1) AND signature(p2) AND <symbol>(p3)
        AND nextto(p3, p1) AND nextto(p2, p3)
        => union(p1, p2, p3) )

  % - Intersection

  PRED intersection: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        signature(p1) AND signature(p2) AND <symbol>(p3)
        AND nextto(p3, p1) AND nextto(p2, p3)
        => intersection(p1, p2, p3) )

  % - Deletion

  PRED deletion: Picture # Picture # Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture,
             p4: Picture, p5: Picture (
        item(p1) AND signature(p2)
        AND <symbol>(p3) AND <symbol>(p4) AND <symbol>(p5)
        AND nextto(p5, p2) AND nextto(p3, p5)
        AND nextto(p1, p3) AND nextto(p4, p1)
        => deletion(p1, p2, p3, p4, p5) )

  % - Scheme

  PRED scheme: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        scheme(p1) AND sigma(p2) AND rectangle(p3)
        AND nextto(p1, p2) AND inside(merge(p1, p2), p3)
        => scheme(p1, p2, p3) )

  % - Signatures

  PRED signature: Picture
  IND FORALL p: Picture (
        sigma(p) => signature(p) );
      FORALL p1: Picture, p2: Picture (
        renaming(p1, p2) => signature(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        union(p1, p2, p3) => signature(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        intersection(p1, p2, p3) => signature(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture (
        deletion(p1, p2, p3, p4, p5)
        => signature(merge(p1, p2, p3, p4, p5)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        scheme(p1, p2, p3) => signature(merge(p1, p2, p3)) )

  % Design Language

  % - Components

  % - Abbreviated

  PRED abbreviated: Picture # Picture
  IND FORALL p1: Picture, p2: Picture (
        name(p1) AND scheme(p2) AND over(p1, p2) AND ontop(p1, p2)
        => abbreviated(p1, p2) )

  % - Specified

  PRED specified: Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        name(p1) AND scheme(p2) AND <symbol>(p3)
        AND inside(p2, p3) AND over(p1, p3) AND ontop(p1, p3)
        => specified(p1, p2, p3) )

  % - Implemented

  PRED implemented: Picture # Picture # Picture # Picture # Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture,
             p4: Picture, p5: Picture (
        name(p1) AND scheme(p2) AND scheme(p3)
        AND <symbol>(p4) AND <symbol>(p5)
        AND inside(p2, p4) AND inside(p3, p5)
        AND over(p1, p4) AND ontop(p1, p4)
        AND ontop(p4, p5) AND over(p4, p5)
        => implemented(p1, p2, p3, p4, p5) )

  % - Components

  PRED component: Picture
  IND FORALL p1: Picture, p2: Picture (
        abbreviated(p1, p2) => component(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        specified(p1, p2, p3) => component(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture (
        implemented(p1, p2, p3, p4, p5)
        => component(merge(p1, p2, p3, p4, p5)) )

  % - Component List

  PRED componentlist: PictureList
  IND componentlist(empty);
      FORALL p1: Picture, p2: PictureList (
        component(p1) AND componentlist(p2)
        => componentlist(cons(p1, p2)) )

  % - Scheme List (System)

  PRED schemelist: PictureList
  IND schemelist(empty);
      FORALL p1: Picture, p2: PictureList (
        scheme(p1) AND schemelist(p2)
        => schemelist(cons(p1, p2)) )

END;

LET POLAR_WF :=
  LAMBDA X: SYMBOLS OF
  LAMBDA Y: IDENTIFIER OF
  IMPORT APPLY APPLY POLAR_CFG TO X TO Y INTO
  CLASS

  % Names

  PRED wf_name: Picture
  IND FORALL p1: Picture, p2: Picture, p3: Picture (
        name(p1, p2, p3) => wf_name(merge(p1, p2, p3)) )

  % Scheme Language

  % - Schemes

  PRED wf_scheme: Picture
  IND FORALL p: Picture (
        class(p) => wf_scheme(p) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        renaming(p1, p2, p3) AND wf_renaming(p1) AND wf_scheme(p2)
        AND sameheight(p2, p3)
        => wf_scheme(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture (
        import(p1, p2) AND wf_scheme(p1) AND wf_scheme(p2)
        AND samewidth(p1, p2)
        => wf_scheme(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        export(p1, p2, p3, p4) AND wf_signature(p1) AND wf_scheme(p2)
        AND <symbol>(p4)
        => wf_scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        abstraction(p1, p2, p3, p4) AND wf_name(p1)
        AND wf_scheme(p2) AND wf_scheme(p3)
        AND sameheight(p1, p2) AND samewidth(merge(p1, p2), p3)
        AND <symbol>(p4)
        => wf_scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        application(p1, p2, p3) AND wf_scheme(p1) AND wf_scheme(p2)
        AND <symbol>(p3) AND samewidth(p1, p2)
        => wf_scheme(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture (
        abbreviation(p1, p2, p3, p4) AND wf_named_scheme_seq(p1)
        AND wf_scheme(p2) AND samewidth(p3, p4)
        => wf_scheme(merge(p1, p2, p3, p4)) );
      FORALL p1: Picture, p2: Picture (
        reference(p1, p2) => wf_scheme(merge(p1, p2)) )

  % - Named Schemes

  PRED wf_named_scheme: Picture
  IND FORALL p1: Picture, p2: Picture (
        named_scheme(p1, p2) AND wf_name(p1) AND wf_scheme(p2)
        AND samewidth(p1, p2)
        => wf_named_scheme(merge(p1, p2)) )

  PRED wf_named_scheme_seq: Picture
  IND FORALL p: Picture (
        wf_named_scheme(p) => wf_named_scheme_seq(p) );
      FORALL p1: Picture, p2: Picture (
        named_scheme_seq(p1, p2) AND wf_named_scheme(p1)
        AND wf_named_scheme_seq(p2)
        => wf_named_scheme_seq(merge(p1, p2)) )

  % - Renamings

  PRED wf_renaming: Picture
  IND FORALL p: Picture (
        rho(p) => wf_renaming(p) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        composition(p1, p2, p3) AND wf_renaming(p1) AND wf_renaming(p2)
        AND against(p3, p1) AND against(p2, p3)
        => wf_renaming(merge(p1, p2, p3)) )

  % - Items

  PRED wf_item: Picture
  IND FORALL p: Picture (
        item(p) => wf_item(p) )

  % - Signatures

  PRED wf_signature: Picture
  IND FORALL p: Picture (
        sigma(p) => wf_signature(p) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        renaming(p1, p2) AND wf_renaming(p1) AND wf_signature(p2)
        AND against(p1, p2)
        => wf_signature(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        union(p1, p2, p3) AND wf_signature(p1) AND wf_signature(p2)
        AND against(p3, p1) AND against(p2, p3)
        => wf_signature(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        intersection(p1, p2, p3) AND wf_signature(p1) AND wf_signature(p2)
        AND against(p3, p1) AND against(p2, p3)
        => wf_signature(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture,
             p4: Picture, p5: Picture (
        deletion(p1, p2, p3, p4, p5) AND wf_item(p1) AND wf_signature(p2)
        AND samewidth(p3, p4) AND sameheight(p3, p4)
        AND against(p5, p2) AND against(p3, p5)
        AND against(p1, p3) AND against(p4, p1)
        => wf_signature(merge(p1, p2, p3, p4, p5)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        scheme(p1, p2, p3) AND sameheight(p1, p2)
        => wf_signature(merge(p1, p2, p3)) )

  % Design Language

  % - Components

  PRED wf_component: Picture
  IND FORALL p1: Picture, p2: Picture (
        abbreviated(p1, p2) AND wf_name(p1) AND wf_scheme(p2)
        AND samewidth(p1, p2)
        => wf_component(merge(p1, p2)) );
      FORALL p1: Picture, p2: Picture, p3: Picture (
        specified(p1, p2, p3) AND wf_name(p1) AND wf_scheme(p2)
        AND samewidth(p1, p3)
        => wf_component(merge(p1, p2, p3)) );
      FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture (
        implemented(p1, p2, p3, p4, p5) AND wf_name(p1)
        AND wf_scheme(p2) AND wf_scheme(p3)
        AND samewidth(p1, p4) AND samewidth(p4, p5)
        => wf_component(merge(p1, p2, p3, p4, p5)) )

  % - Component List

  PRED wf_componentlist: PictureList
  IND wf_componentlist(empty);
      FORALL p1: Picture, p2: PictureList (
        wf_component(p1) AND wf_componentlist(p2)
        => wf_componentlist(cons(p1, p2)) )

  % - Scheme List (System)

  PRED wf_schemelist: PictureList
  IND wf_schemelist(empty);
      FORALL p1: Picture, p2: PictureList (
        wf_scheme(p1) AND wf_schemelist(p2)
        => wf_schemelist(cons(p1, p2)) )

END;

LET PRIMITIVES :=

CLASS

SORT DefinitionSet SORT PairSet SORT ItemSet SORT Item

END;

LET SCHEME :=
  LAMBDA X: IDENTIFIER OF
  LAMBDA Y: PRIMITIVES OF
  IMPORT X INTO
  IMPORT Y INTO
  CLASS

  SORT SchemeName

  FUNC scheme: String -> SchemeName

  AXIOM FORALL s: String ( identifier(s) => scheme(s)! )

  PRED is_gen: SchemeName
  IND FORALL s: String ( identifier(s) => is_gen(scheme(s)) )

  AXIOM FORALL n: SchemeName ( is_gen(n) )

  SORT Scheme
  SORT Renaming
  SORT Signature

  % - Schemes

  FUNC class: DefinitionSet -> Scheme
  FUNC renaming: Renaming # Scheme -> Scheme
  FUNC import: Scheme # Scheme -> Scheme
  FUNC export: Signature # Scheme -> Scheme
  FUNC abstraction: SchemeName # Scheme # Scheme -> Scheme
  FUNC application: Scheme # Scheme -> Scheme
  FUNC abbreviation: SchemeName # Scheme # Scheme -> Scheme
  FUNC reference: SchemeName -> Scheme

  AXIOM FORALL d: DefinitionSet ( class(d)! );
        FORALL r: Renaming, s: Scheme ( renaming(r, s)! );
        FORALL s1: Scheme, s2: Scheme ( import(s1, s2)! );
        FORALL S: Signature, s: Scheme ( export(S, s)! );
        FORALL n: SchemeName, s1: Scheme, s2: Scheme ( abstraction(n, s1, s2)! );
        FORALL s1: Scheme, s2: Scheme ( application(s1, s2)! );
        FORALL n: SchemeName, s1: Scheme, s2: Scheme ( abbreviation(n, s1, s2)! );
        FORALL n: SchemeName ( reference(n)! )

  PRED is_gen: Scheme
  IND FORALL d: DefinitionSet (
        is_gen(class(d)) );
      FORALL r: Renaming, s: Scheme (
        is_gen(s) => is_gen(renaming(r, s)) );
      FORALL s1: Scheme, s2: Scheme (
        is_gen(s1) AND is_gen(s2) => is_gen(import(s1, s2)) );
      FORALL S: Signature, s: Scheme (
        is_gen(s) => is_gen(export(S, s)) );
      FORALL n: SchemeName, s1: Scheme, s2: Scheme (
        is_gen(s1) AND is_gen(s2) => is_gen(abstraction(n, s1, s2)) );
      FORALL s1: Scheme, s2: Scheme (
        is_gen(s1) AND is_gen(s2) => is_gen(application(s1, s2)) );
      FORALL n: SchemeName, s1: Scheme, s2: Scheme (
        is_gen(s1) AND is_gen(s2) => is_gen(abbreviation(n, s1, s2)) );
      FORALL n: SchemeName (
        is_gen(reference(n)) )

  AXIOM FORALL s: Scheme ( is_gen(s) )

  % - Renamings

  FUNC constant: PairSet -> Renaming
  FUNC composition: Renaming # Renaming -> Renaming

  AXIOM FORALL p: PairSet ( constant(p)! );
        FORALL r1: Renaming, r2: Renaming ( composition(r1, r2)! )

  PRED is_gen: Renaming
  IND FORALL p: PairSet (
        is_gen(constant(p)) );
      FORALL r1: Renaming, r2: Renaming (
        is_gen(r1) AND is_gen(r2) => is_gen(composition(r1, r2)) )

  AXIOM FORALL r: Renaming ( is_gen(r) )

  % - Signatures

  FUNC constant: ItemSet -> Signature
  FUNC renaming: Renaming # Signature -> Signature
  FUNC union: Signature # Signature -> Signature
  FUNC intersection: Signature # Signature -> Signature
  FUNC deletion: Item # Signature -> Signature
  FUNC scheme: Scheme -> Signature

  AXIOM FORALL i: ItemSet ( constant(i)! );
        FORALL r: Renaming, S: Signature ( renaming(r, S)! );
        FORALL S1: Signature, S2: Signature ( union(S1, S2)! );
        FORALL S1: Signature, S2: Signature ( intersection(S1, S2)! );
        FORALL i: Item, S: Signature ( deletion(i, S)! );
        FORALL s: Scheme ( scheme(s)! )

  PRED is_gen: Signature
  IND FORALL i: ItemSet (
        is_gen(constant(i)) );
      FORALL r: Renaming, S: Signature (
        is_gen(S) => is_gen(renaming(r, S)) );
      FORALL S1: Signature, S2: Signature (
        is_gen(S1) AND is_gen(S2) => is_gen(union(S1, S2)) );
      FORALL S1: Signature, S2: Signature (
        is_gen(S1) AND is_gen(S2) => is_gen(intersection(S1, S2)) );
      FORALL i: Item, S: Signature (
        is_gen(S) => is_gen(deletion(i, S)) );
      FORALL s: Scheme (
        is_gen(scheme(s)) )

  AXIOM FORALL S: Signature ( is_gen(S) )

END;

LET COMPONENT :=
  LAMBDA X: IDENTIFIER OF
  LAMBDA Y: PRIMITIVES OF
  IMPORT APPLY APPLY SCHEME TO X TO Y INTO
  CLASS

  SORT Component

  FUNC abbreviated: SchemeName # Scheme -> Component
  FUNC specified: SchemeName # Scheme -> Component
  FUNC implemented: SchemeName # Scheme # Scheme -> Component

  AXIOM FORALL n: SchemeName, s: Scheme ( abbreviated(n, s)! );
        FORALL n: SchemeName, s: Scheme ( specified(n, s)! );
        FORALL n: SchemeName, s1: Scheme, s2: Scheme ( implemented(n, s1, s2)! )

  PRED is_gen: Component
  IND FORALL n: SchemeName, s: Scheme (
        is_gen(abbreviated(n, s)) );
      FORALL n: SchemeName, s: Scheme (
        is_gen(specified(n, s)) );
      FORALL n: SchemeName, s1: Scheme, s2: Scheme (
        is_gen(implemented(n, s1, s2)) )

  AXIOM FORALL c: Component ( is_gen(c) )

END;

271

LET COLDK_CFG :=

LAMBDA X: IDENTIFIER OF
LAMBDA Y: PRIMITIVES OF
IMPORT APPLY RENAME
  SORT Item TO Scheme,
  SORT Seq TO SchemeList
IN SEQ_SPEC TO (APPLY APPLY SCHEME TO X TO Y) INTO
IMPORT APPLY RENAME
  SORT Item TO Component,
  SORT Seq TO ComponentList
IN SEQ_SPEC TO (APPLY APPLY COMPONENT TO X TO Y) INTO
IMPORT APPLY APPLY COMPONENT TO X TO Y INTO
CLASS

SORT Design

FUNC design: ComponentList # SchemeList -> Design

AXIOM FORALL c: ComponentList, s: SchemeList (
design(c, s)! )

PRED is_gen: Design
IND FORALL c: ComponentList, s: SchemeList (
is_gen(design(c, s)) )

AXIOM FORALL d: Design ( is_gen(d) )

END;

LET SEMANTICS :=

LAMBDA X: SYMBOLS OF
LAMBDA Y: IDENTIFIER OF
LAMBDA Z: PRIMITIVES OF
IMPORT APPLY APPLY POLAR_CFG TO X TO Y INTO
IMPORT APPLY APPLY COLDK_CFG TO Y TO Z INTO
CLASS

% Identifiers

PRED sem: Picture # String
IND FORALL p: Picture, s: String (
string(p, s) AND identifier(s) => sem(p, s) )

% Names

PRED sem: Picture # SchemeName
IND FORALL p1: Picture, p2: Picture, p3: Picture, s: String (
name(p1, p2, p3) AND sem(p1, s) => sem(merge(p1, p2, p3), scheme(s)) )

% Scheme Language

% - Schemes

PRED sem: Picture # Scheme
IND FORALL p: Picture, d: DefinitionSet (
class(p) => sem(p, class(d)) );

FORALL p1: Picture, p2: Picture, p3: Picture, r: Renaming, s: Scheme (
renaming(p1, p2, p3) AND sem(p1, r) AND sem(p2, s) =>
sem(merge(p1, p2, p3), renaming(r, s)) );

FORALL p1: Picture, p2: Picture, s1: Scheme, s2: Scheme (
import(p1, p2) AND sem(p1, s1) AND sem(p2, s2) =>
sem(merge(p1, p2), import(s1, s2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, S: Signature, s: Scheme (
export(p1, p2, p3, p4) AND sem(p1, S) AND sem(p2, s) =>
sem(merge(p1, p2, p3, p4), export(S, s)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, n: SchemeName, s1: Scheme, s2: Scheme (
abstraction(p1, p2, p3, p4) AND sem(p1, n) AND sem(p2, s1) AND sem(p3, s2) =>
sem(merge(p1, p2, p3, p4), abstraction(n, s1, s2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, s1: Scheme, s2: Scheme (
application(p1, p2, p3) AND sem(p1, s1) AND sem(p2, s2) =>
sem(merge(p1, p2, p3), application(s1, s2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture, n: SchemeName, s1: Scheme, s2: Scheme (
abbreviation(merge(p1, p2), p3, p4, p5) AND named_scheme(p1, p2) AND sem(p1, n) AND sem(p2, s1) AND sem(p3, s2) =>
sem(merge(p1, p2, p3, p4, p5), abbreviation(n, s1, s2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture, p6: Picture, n: SchemeName, s1: Scheme, s2: Scheme (
named_scheme_seq(merge(p1, p2), p3) AND named_scheme(p1, p2) AND abbreviation(merge(p1, p2, p3), p4, p5, p6) AND sem(p1, n) AND sem(p2, s1) AND sem(merge(p3, p4, p5, p6), s2) =>
sem(merge(p1, p2, p3, p4, p5, p6), abbreviation(n, s1, s2)) );

FORALL p1: Picture, p2: Picture, s: String (
reference(p1, p2) AND sem(p1, s) => sem(merge(p1, p2), reference(scheme(s))) )

% - Renamings

PRED sem: Picture # Renaming
IND FORALL p1: Picture, p2: PairSet (
rho(p1) => sem(p1, constant(p2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, r1: Renaming, r2: Renaming (
composition(p1, p2, p3) AND sem(p1, r1) AND sem(p2, r2) =>
sem(merge(p1, p2, p3), composition(r1, r2)) )

% - Items

PRED sem: Picture # Item
IND FORALL p: Picture, i: Item (
item(p) => sem(p, i) )

% - Signatures

PRED sem: Picture # Signature
IND FORALL p: Picture, i: ItemSet (
sigma(p) => sem(p, constant(i)) );

FORALL p1: Picture, p2: Picture, r: Renaming, S: Signature (
renaming(p1, p2) AND sem(p1, r) AND sem(p2, S) =>
sem(merge(p1, p2), renaming(r, S)) );

FORALL p1: Picture, p2: Picture, p3: Picture, S1: Signature, S2: Signature (
union(p1, p2, p3) AND sem(p1, S1) AND sem(p2, S2) =>
sem(merge(p1, p2, p3), union(S1, S2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, S1: Signature, S2: Signature (
intersection(p1, p2, p3) AND sem(p1, S1) AND sem(p2, S2) =>
sem(merge(p1, p2, p3), intersection(S1, S2)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture, i: Item, S: Signature (
deletion(p1, p2, p3, p4, p5) AND sem(p1, i) AND sem(p2, S) =>
sem(merge(p1, p2, p3, p4, p5), deletion(i, S)) );

FORALL p1: Picture, p2: Picture, p3: Picture, s: Scheme (
scheme(p1, p2, p3) AND sem(p1, s) => sem(merge(p1, p2, p3), scheme(s)) )

% Design Language

% - Components

PRED sem: Picture # Component
IND FORALL p1: Picture, p2: Picture, n: SchemeName, s: Scheme (
abbreviated(p1, p2) AND sem(p1, n) AND sem(p2, s) =>
sem(merge(p1, p2), abbreviated(n, s)) );

FORALL p1: Picture, p2: Picture, p3: Picture, n: SchemeName, s: Scheme (
specified(p1, p2, p3) AND sem(p1, n) AND sem(p2, s) =>
sem(merge(p1, p2, p3), specified(n, s)) );

FORALL p1: Picture, p2: Picture, p3: Picture, p4: Picture, p5: Picture, n: SchemeName, s1: Scheme, s2: Scheme (
implemented(p1, p2, p3, p4, p5) AND sem(p1, n) AND sem(p2, s1) AND sem(p3, s2) =>
sem(merge(p1, p2, p3, p4, p5), implemented(n, s1, s2)) )

% - Component list

PRED sem: PictureList # ComponentList
IND sem(empty: PictureList, empty: ComponentList);

FORALL p1: Picture, p2: PictureList, c1: Component, c2: ComponentList (
component(p1) AND componentlist(p2) AND sem(p1, c1) AND sem(p2, c2) =>
sem(cons(p1, p2), cons(c1, c2)) )

% - Scheme list (System)

PRED sem: PictureList # SchemeList
IND sem(empty: PictureList, empty: SchemeList);

FORALL p1: Picture, p2: PictureList, s1: Scheme, s2: SchemeList (
scheme(p1) AND schemelist(p2) AND sem(p1, s1) AND sem(p2, s2) =>
sem(cons(p1, p2), cons(s1, s2)) )

END

SYSTEM POLAR_CFG, POLAR_WF, SEMANTICS

C POLAR, in POLAR

[Diagrams not recoverable from the source. Legible labels: PICTURES, BBOX_PICTURES, SEQ_SPEC, IDENTIFIER, STRING, POLAR_CFG, SYMBOLS, COLDK_CFG, with parameters X and Y.]

Inheritance in COLD

H.B.M. Jonkers*

Philips Research Laboratories P.O. Box 80000, 5600 JA Eindhoven, The Netherlands

Abstract

In this paper we indicate how a general inheritance mechanism can be defined as a form of syntactic sugar on top of the design kernel language COLD-K. The inheritance mechanism goes beyond that of traditional object-oriented languages in that it applies to single-sorted, dynamic classes as well as to many-sorted and static classes. It will be incorporated in a user-oriented language version of COLD defined on top of COLD-K, thus providing full support for the methodology of object-oriented design. The mechanism is believed to be applicable to other languages as well.

1 Introduction


COLD (Common Object-oriented Language for Design) is a formal design language that has been developed at Philips Research Laboratories Eindhoven in the framework of ESPRIT project 432 (METEOR). It is a system design language, which means that it is intended for the description of systems in intermediate stages of their design. As such it can be viewed as an integrated combination of a specification and programming language, supporting advanced techniques for design in the large.

In COLD we distinguish between two hierarchical language levels, identified as the kernel language and the user language. The kernel language COLD-K [5,7] is a relatively small language that provides the basic linguistic primitives necessary to obtain the required expressivity of a design language. Due to the minimal nature of its language constructs, the kernel language is not particularly user-friendly. The user language is intended to remedy this by providing a substantial amount of syntactic sugar that will make life easy for the user, without introducing any new semantic features. Such a user language is currently being developed in the framework of ESPRIT project 2565 (ATMOSPHERE).

Though the meaning of the acronym COLD contains the word 'object-oriented', it may be argued that COLD is not an object-oriented language since the kernel language lacks one of the basic trademarks of object-orientedness: inheritance [12]. In this paper we show how a general inheritance mechanism can be defined in terms of the constructs of COLD-K without requiring any semantic extensions of the language. This mechanism will be incorporated as syntactic sugar in the user-oriented version of COLD currently being developed.

*This work has been performed in the framework of ESPRIT project 432 (METEOR).

The inheritance mechanism presented in this paper goes beyond that of traditional object-oriented languages in that it applies to single-sorted, dynamic classes as well as to many-sorted and static classes. Among other things, this makes it possible to treat 'objects' and 'values' completely uniformly (as is already the case in COLD-K). The mechanism is also simple, as indicated by the fact that it can be characterised very directly in algebraic terms. Though we define the mechanism in terms of the constructs of COLD-K, we believe that the applicability of the mechanism is not restricted to COLD and could be used in other languages as well.

Section 2 is concerned with a discussion of the class concept of COLD and the general notion of inheritance. The basic inheritance mechanism is introduced in Section 3, starting with a definition of the specific requirements that we impose on it. In Section 4 various refinements of the mechanism are discussed, such as multiple inheritance, redefinition and dynamic binding. Though a number of the COLD-K language constructs used in the paper will be explained on the fly, a certain familiarity with COLD-K may be helpful for reading the paper. For an introduction to COLD-K, the reader is referred to [7].

2 Concepts

2.1 The notion of class

For a good understanding of the rest of this paper it is necessary to start with a discussion of the class concept of COLD, which is somewhat more general than that of traditional object-oriented languages. In COLD a clear distinction is made between a class and its description, called a class description, which can be viewed as a (possibly algorithmic) specification of a class. A class is a semantic object that can be viewed as an abstract machine with a collection of states and one initial state. The special thing about classes is that their states are many-sorted algebras. This implies that each class (description) has a fixed set of names associated with it, called its state signature. The state signature of a class consists of:

1. Sort names: In each state a sort name denotes a sort, which is a collection of objects. The objects are said to exist in the given state.

2. Predicate names: In each state a predicate name denotes a predicate, which is an ordinary mathematical predicate defined on the sorts of the given state.

3. Function names: In each state a function name denotes a function, which is a partial function from sorts to sorts of the given state.

The state of a class can be changed by means of procedures, which are identified by a fixed set of names, called procedure names, associated with the class. Each procedure name denotes a (possibly nondeterministic) state transformer, which may take objects as its parameters and return objects as its result. The procedure names together with the state signature constitute the class signature, or signature, of the class. Predicates, functions and procedures take and return objects of fixed sorts, indicating that COLD is a strongly typed language. Overloading of operations is allowed.

Procedures transform states by modifying the sorts, predicates and functions that constitute the state. Modification of a sort amounts to the creation of new objects of that sort, thus extending the collection of existing objects of that sort. Once created an object can never be deleted, so sorts can only grow. Modification of a predicate or function amounts to modifying its result value for certain (and possibly all) arguments. Combinations of modifications of sorts, predicates and functions are possible. So a procedure may create a new object of a certain sort and at the same time modify a function.

As a very simple example consider the class of 'counters', described by the COLD-K class description COUNTER below. In this class description we have omitted everything but the signature information:

LET COUNTER :=
EXPORT
  SORT Int,
  SORT Counter,
  PROC new  : -> Counter,
  PROC incr : Counter -> Int,
  PROC decr : Counter -> Int
FROM
IMPORT INT INTO
CLASS
  SORT Counter ...
  FUNC value : Counter -> Int ...
  PROC new  : -> Counter ...
  PROC incr : Counter -> Int ...
  PROC decr : Counter -> Int ...
END

There is a sort Counter of counter objects that can be created dynamically by means of the procedure new. Each counter has an internal 'counter value' associated with it, represented by the integer-valued function value. This value can be incremented and decremented by means of the procedures incr and decr, which return the new counter value as their result. The value of a counter is considered internal to the counter, hence the function value is not exported by COUNTER. The imported class description INT is supposed to contain the definition of the integers (sort Int with associated operations).

Ignoring inheritance for the moment, the definition of the above class in a TOL (Traditional Object-oriented Language, e.g. Smalltalk, C++, Eiffel) would read something like this:

CLASS Counter
EXPORT
  incr, decr
IMPORT
  Int
FEATURE
  VAR value : Int
  ROUTINE new ...
  METHOD incr : -> Int ...
  METHOD decr : -> Int ...
END

The following differences of the COLD and TOL notions of class can now be noted:

1. In a TOL the sort name Counter is identified with the class name, making 'class' synonymous with 'sort' (or 'type'). In COLD these are two different things, allowing more than one sort to be defined in a class. In other words, COLD classes are 'many-sorted'.

2. In a TOL instance variables are viewed as attributes of objects. In COLD these are represented by (variable) functions, which are more general than attributes. For example, both instance and class variables can be modelled by functions.

3. In a TOL a distinction is made between two different kinds of procedures, denoted as 'routines' and 'methods' above. Methods have an implicit first argument, the type of which is omitted in their definition (since it is equal to the name of the class). In COLD such a distinction is not necessary, since all arguments of procedures are explicit.

4. In a TOL, the new operation is associated with and implicitly exported by every class, implying that all sorts are dynamic. In COLD, new is an ordinary procedure that may or may not be introduced and/or exported, thus allowing the definition of static sorts as well.

Apart from this, there are some differences of notation with respect to the use of instance variables and methods. For example, a call of the method incr of counter c would in a TOL typically be written as c.incr, while the definition of the method incr could read something like this:

METHOD incr : -> Int

BODY value <- value + 1

RETURN value

In COLD-K the call c.incr would be written as incr(c) and the definition of the procedure incr would be written (with some syntactic sugar) as:

PROC incr : Counter -> Int
PAR self : Counter
DEF value(self) <- value(self) + 1;
    value(self)

The above shows that the TOL notion of class is in fact a special case of the COLD notion of class. Additional constraints connected with TOLs, such as the fact that objects of the same sort may not access each other's internal variables, can also be enforced by simple syntactic restrictions. The specialisation of TOLs makes it possible to use a more compact notation than in COLD-K (in particular, by avoiding the explicit self parameters). From the COLD point of view, however, these are considerations pertaining to the user language level.
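The COLD view of COUNTER, with a state made of a sort and a variable function and with procedures that take every argument explicitly, can be illustrated by a small Python sketch. This is only an informal model, not COLD-K semantics: the name CounterState, the use of integers as object identities and the initial counter value 0 are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CounterState:
    """One state of the class: a sort plus a variable function on it."""
    counters: list = field(default_factory=list)  # sort Counter (can only grow)
    value: dict = field(default_factory=dict)     # FUNC value : Counter -> Int


def new(state):
    """PROC new : -> Counter. Creates an object, so the sort grows."""
    c = len(state.counters)   # assumed object identity
    state.counters.append(c)
    state.value[c] = 0        # assumed initial value, not given in the text
    return c


def incr(state, c):
    """PROC incr : Counter -> Int. Modifies value and returns the new value."""
    state.value[c] += 1
    return state.value[c]


def decr(state, c):
    """PROC decr : Counter -> Int. Modifies value and returns the new value."""
    state.value[c] -= 1
    return state.value[c]


state = CounterState()
c1, c2 = new(state), new(state)
incr(state, c1)
incr(state, c1)
decr(state, c2)
```

Note that the counter is an ordinary explicit argument of incr and decr, mirroring incr(c) rather than c.incr.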

The notion of class underlying COLD has a number of advantages over the classical notion of class. Apart from being more general, it allows a uniform treatment of values and objects in a natural way, and makes it possible to integrate algebraic techniques in an object-oriented framework. Furthermore, the simplicity of the approach to inheritance presented in this paper critically depends on the underlying notion of class. This does not imply that the approach cannot be used with a TOL, but there a number of restrictions have to be imposed that directly derive from the restrictions imposed on classes (as seen from the COLD point of view).

2.2 The notion of inheritance

Let us assume that in some TOL, we have a class defined as follows:

CLASS V
FEATURE
  VAR f : W
  METHOD g : V -> V ...
  ...
END

As discussed in the previous section, this corresponds to the following class description in COLD-K:

CLASS
  SORT V ...
  FUNC f : V -> W ...
  PROC g : V # V -> V ...
END

The definition of a new class T 'by inheritance' of the class V in a TOL could have the following form:

CLASS T INHERIT V
  ...
END

where new methods are defined at the dots, possibly in terms of the inherited operations. The exact syntax and meaning of the above may differ from language to language, but the general idea is that it defines a class with objects of sort T that have the 'same' operations as objects of sort V associated with them, plus a number of additional operations (introduced at the dots). This is the extension aspect of inheritance. So, put in terms of COLD-K, we can view the above as the definition of a class of the following form:

CLASS

SORT T ...

FUNC f : T -> W ...

PROC g : T # V -> V ...

END

where the operations f and g should be defined in such a way that they are the 'same' as in the class V. Furthermore, the sort T defined this way is usually considered a subset of V: all objects of sort T are also objects of sort V. This is the specialisation aspect of inheritance: objects of sort T may be more specific than those of sort V, implying two things. Firstly, specialised operations on objects of sort T may be defined that are not meaningful for objects of sort V in general. Secondly, some of the operations on objects of sort V can be redefined in a more efficient way when they are restricted to objects of sort T. We shall come back to this in Sections 3.1 and 4.4.

Inheritance, as exemplified above, is usually considered an operation on classes (or class descriptions, in COLD terms). Inheriting a class implies inheriting a 'package' consisting of the sort and operations defined in the class. In order to show how this form of inheritance can be defined in terms of the constructs of COLD-K, we shall first consider inheritance at a finer level of granularity. That is, we consider inheritance of individual sorts and operations. For that purpose, we shall introduce a new kind of definition of sorts and operations, called an inherited definition, that can be expanded directly to COLD-K. In Section 4.3 we discuss how the 'class inheritance' mechanism can be defined in terms of the more basic inherited definitions.

3 The basic mechanism

3.1 Requirements for an inheritance mechanism

In the previous section we informally discussed the notion of inheritance. Unfortunately there is no generally agreed precise definition of inheritance, let alone one that covers inheritance in the context of strongly typed languages. In order to clarify our view with respect to inheritance, we shall formulate and explain two strong requirements to be satisfied by an inheritance mechanism. As we shall show, the inheritance mechanism described in this paper satisfies these requirements. To a certain extent it might even be said that the inheritance mechanism follows from them. In Section 4.4 some other notions of inheritance are discussed (most of them not satisfying the requirements).

We shall use the notation u ≤ v to indicate that the sort or operation u is defined either directly or indirectly by inheritance from another sort or operation v. In other words, ≤ is the reflexive and transitive closure of the direct inheritance relation. The two requirements we shall impose are the following:

1. Sort inheritance requirement: The fact that a sort T is defined by inheritance of a sort V (i.e. T ≤ V) should imply that T can be seen as a subset of V.

2. Operation inheritance requirement: The fact that an operation $f^1 : T^1_1 \# \dots \# T^1_k \to V^1_1 \# \dots \# V^1_l$ is defined by inheritance of an operation $f^2 : T^2_1 \# \dots \# T^2_m \to V^2_1 \# \dots \# V^2_n$ (i.e. $f^1 \le f^2$) should imply that:

(a) $k = m$, $l = n$;

(b) $T^1_1 \le T^2_1, \dots, T^1_k \le T^2_k$, $V^1_1 \le V^2_1, \dots, V^1_l \le V^2_l$;

(c) $f^1$ is semantically equivalent to the operation $f^2$ restricted to the domain $T^1_1 \# \dots \# T^1_k$.

Here semantic equivalence of two functions, predicates or procedures implies that they yield the same values when given the same arguments in the same state. In addition, semantic equivalence of two procedures implies that they also have the same side effects.

The first requirement may seem obvious, since the fact that T ≤ V should imply that we can use any object x:T in a context where an object of sort V is expected. Yet it has some implications, such as the fact that the inheritance relation on sorts should be a partial order. This is something that can be enforced in a simple way by forbidding circular inherited definitions. (Another implication of this requirement will be seen in Section 4.1.)
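As an illustration of how circular inherited definitions could be rejected, the following Python sketch computes the reflexive and transitive closure of a direct inheritance relation and fails on a cycle. The input format (a dictionary from sort names to the sorts they directly inherit) and the function name are assumptions made for this example; the paper does not prescribe how an implementation performs the check.

```python
def inherits_closure(direct):
    """Reflexive and transitive closure of a direct inheritance relation.

    `direct` maps each sort name to the set of sorts it directly inherits.
    Raises ValueError if the inherited definitions are circular.
    """
    names = set(direct) | {v for vs in direct.values() for v in vs}
    closure = {}

    def visit(u, path):
        if u in path:
            raise ValueError(f"circular inherited definition involving {u}")
        if u not in closure:
            reach = {u}  # reflexivity: u <= u
            for v in direct.get(u, ()):
                reach |= visit(v, path | {u})  # transitivity
            closure[u] = reach
        return closure[u]

    for name in names:
        visit(name, frozenset())
    return closure


# Hypothetical sorts: Pos inherits Nat, Nat inherits Int.
le = inherits_closure({"Nat": {"Int"}, "Pos": {"Nat"}})
```

Here le["Pos"] contains "Pos", "Nat" and "Int". Since no cycle exists, the resulting relation is antisymmetric and hence a partial order.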

The second requirement is less obvious and is explained by the fact that we want to view inheritance as a semantic concept, while still supporting all the classical object-oriented features connected with inheritance such as code sharing, redefinition and dynamic binding. In contrast, inheritance in most existing object-oriented languages is an implementation-oriented concept (aimed at code sharing) with a usually rather unclear semantics. Part (c) of the operation inheritance requirement reflects this semantic view of inheritance and will be referred to in the sequel as the requirement of semantics preservation. It makes more precise what we meant in the previous section by the fact that an operation defined by inheritance should be the 'same' as the original operation.

That the classical features of object-oriented programming can be supported is indicated by the following two facts which follow from the requirements:

1. Any call of $f^1$ can be replaced by a call of $f^2$. This implies that for the code of $f^1$ we can simply use the code of $f^2$ ('code sharing').

2. Any call of $f^2$ with an argument in $T^1_1 \# \dots \# T^1_k$ can be replaced by a call of $f^1$. This implies that by providing a mechanism to 'redefine' the operation $f^1$, the language implementation can dynamically select the most efficient implementation of a call of $f^2$ ('dynamic binding').
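The two facts can be made concrete with a small Python sketch. It is an informal model under stated assumptions: V is taken to be Python's int, T the natural numbers wrapped in a hypothetical class Nat, the original operation is absolute value, and the names i_inverse, f1_redefined and call_f2 are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nat:
    """An object of the inherited sort T (assumed: natural numbers)."""
    n: int


def i(x):
    """Embedding of T into V, with V taken to be Python's int."""
    return x.n


def i_inverse(v):
    """The unique x:T whose image is v, or None if v lies outside T."""
    return Nat(v) if v >= 0 else None


def f2(v):
    """The original operation on V (here: absolute value)."""
    return -v if v < 0 else v


def f1(x):
    """Inherited operation: simply reuses the code of f2 (code sharing)."""
    return f2(i(x))


def f1_redefined(x):
    """A redefinition of f1 that skips the sign test; equivalent on T."""
    return x.n


def call_f2(v):
    """Dynamic binding: use the redefined f1 whenever the argument is in T."""
    x = i_inverse(v)
    return f1_redefined(x) if x is not None else f2(v)
```

Because f1_redefined agrees with f2 on the image of T, call_f2 returns the same results as f2 for every integer, which is exactly what semantics preservation demands of a redefinition.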

Figure 1: Inheritance of V by T

3.2 Inheritance of sorts

As can be inferred from the sort inheritance requirement, defining a sort T by inheritance of a sort V should amount to defining T as a subset, or subsort, of V. Since there is no notion of subsorting in COLD-K, this cannot be expressed directly in COLD-K. Therefore, an 'inherited sort definition' is introduced, which has the following form:

SORT T IS V

This definition defines sort T by inheritance of sort V and is an abbreviation for the following COLD-K definitions:

SORT T DEP V

FUNC i : T -> V

AXIOM FORALL x:T, y:T ( i(x) = i(y) <=> x = y )

Here i is a unique name that is generated automatically. These definitions define the sort T as being dependent on the sort V, implying that T will not change unless V changes. If V is static (= constant), this means that T is also static. If V is dynamic (= variable), it means that the creation of objects of sort T is always accompanied by the creation of objects of sort V, but not necessarily the other way around.

The fact that T is a 'subsort' of V is expressed by the definition of the function i from T to V and the associated axiom expressing that i is an injection. This function will be called the embedding from T to V. It indicates how objects of sort T should be identified with those of V (see Figure 1). An essential point here is that i is defined as a constant function (in the COLD sense): it implies that once an object x of sort T has been 'identified' with the object i(x), this relation will never change. Of course, if V is dynamic, the creation of objects of sort V may lead to the creation of objects of sort T, thus extending the domain of the function i, but that does not affect the value of i for the 'old' objects of sort T (due to the way in which modification of functions is defined in COLD).
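The two properties of the embedding, injectivity and constancy across states, can be sketched in Python for finite sorts. The dictionary representation of i, the object names and the helper names is_injection and extend are assumptions for this illustration only.

```python
def is_injection(i):
    """Checks i(x) = i(y) <=> x = y for a finite embedding given as a dict."""
    return len(set(i.values())) == len(i)


def extend(i, new_pairs):
    """Embedding in a later state: new arguments may appear, old values stay."""
    for x, v in new_pairs.items():
        if x in i and i[x] != v:
            raise ValueError("i is constant: old values may not change")
    return {**i, **new_pairs}


# Hypothetical state: T = {t0, t1} embedded into V = {v0, v1, v2}.
i = {"t0": "v0", "t1": "v2"}

# A later state in which V gained v3 and T gained t2.
i_next = extend(i, {"t2": "v3"})
```

The domain of i grows from one state to the next, but every 'old' object of sort T keeps the image it had before.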


We note that the above inherited definition states only that sort T can be interpreted as a subsort of sort V, nothing more and nothing less. It does not say anything about the actual set of objects of sort V that 'belong' to sort T. This is something that can be controlled by other means (such as by inheriting operations or adding axioms).

From the user's point of view it may seem a nuisance that he has to write explicit applications of the embedding i, such as in i(x), to indicate that he means the object x:T interpreted as an object of sort V. In Section 4.2 we shall discuss how this can be remedied. For the moment, the advantage of the use of explicit embeddings is that there can be no confusion about the 'type' of an expression (= sorts of objects yielded by the expression).

It should be noted that the introduction of the embedding could be interpreted as a kind of second order existential quantification, but in fact it is somewhat more subtle than this. The function i should be the same (in the COLD sense) in all states, and it is not sufficient for a (possibly different) i to exist in every state.

3.3 Inheritance of functions and predicates

As the next step, we introduce inherited definitions for functions. We assume we have the sorts V and W, and the sort T as defined in Section 3.2. An example of an inherited function definition is given by:

FUNC f : T -> W IS f : V -> W

It defines the function f : T -> W by inheritance from the function f : V -> W. The operation inheritance requirement implies that the operation on the right-hand side of the IS symbol is a function with the same functionality as that of the function on the left-hand side, except that a sort B occurring at the right-hand side may occur as a sort A, inherited from B, at the left-hand side. In this case it implies that T ≤ V should hold, which is indeed the case. In general, this requirement can be checked automatically by the language implementation. The above definition is an abbreviation for the following definition in COLD-K:

FUNC f : T -> W

PAR x: T

DEF f(i(x))

It is easy to see that this complies with the requirement of semantics preservation. Another example is the following inherited function definition:

FUNC g : T # V -> V IS g : V # V -> V

which is an abbreviation for:

FUNC g : T # V -> V
PAR x:T, y:V
DEF g(i(x),y)

In the two examples above we assumed that the function f : V -> W was inherited as the function f : T -> W and that the function g : V # V -> V was inherited as the function g : T # V -> V. These examples were inspired by the inheritance mechanism of traditional object-oriented languages, which treat the first argument of operations in a special way. Apart from the fact that in COLD we do not assign such a special role to the first argument of an operation, this need not be the desired form of inheritance at all. For the operation f : V -> W there is indeed not much choice, but the operation g : V # V -> V could be inherited in any of the following forms:

g : V # V -> V    g : T # V -> V
g : V # T -> V    g : T # T -> V
g : V # V -> T    g : T # V -> T
g : V # T -> T    g : T # T -> T
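The eight forms arise because each of the three occurrences of V can independently be kept or replaced by T. A short Python sketch that enumerates them follows; the function name inherited_forms and the dictionary describing which sorts inherit from which are assumptions made for the example.

```python
from itertools import product


def inherited_forms(name, domain, result, subsorts):
    """List every functionality under which an operation could be inherited.

    `subsorts` maps a sort to the sorts defined by inheritance from it
    (assumed input format); each occurrence of a sort may be kept or
    replaced by one of its subsorts.
    """
    positions = list(domain) + [result]
    choices = [[s] + list(subsorts.get(s, [])) for s in positions]
    forms = []
    for combo in product(*choices):
        *dom, res = combo
        forms.append(f"{name} : {' # '.join(dom)} -> {res}")
    return forms


forms = inherited_forms("g", ["V", "V"], "V", {"V": ["T"]})
```

For g this yields 2 * 2 * 2 = 8 functionalities, while for f : V -> W only the sort V can be replaced, leaving two.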

For example, if V is the sort of integers, T is the sort of natural numbers and the function g : V # V -> V is the addition operation on integers, we shall generally want to inherit the addition operation on natural numbers as an operation with functionality g : T # T -> T. The flexibility to make this choice is provided by the mechanism introduced above, since we can define g by:

FUNC g : T # T -> T IS g : V # V -> V

The COLD-K text for which this stands is similar to what we have already seen, except for one addition. Because we now have an inherited sort in the range of the function g, we must require that the subset of V corresponding to T is closed with respect to the operation g : V # V -> V. Otherwise, it would be impossible to satisfy the requirement of semantics preservation and make g : T # T -> T semantically equivalent to the restriction of g : V # V -> V to T # T. The above inherited definition of g is therefore viewed as an abbreviation for:

FUNC g : T # T -> T

PAR x:T, y:T
DEF SOME z:T ( i(z) = g(i(x),i(y)) )

AXIOM FORALL x:T, y:T ( g(i(x),i(y))! => g(x,y)! )

Here the postfix operator ! is the definedness predicate, stating that the value of an expression is unique and defined. Together with the definition of g, the axiom implies that T is indeed closed with respect to g or, in other words, that g behaves homomorphically with respect to i:

FORALL x:T, y:T ( g(x,y)! <=> g(i(x),i(y))! ; g(x,y)! => i(g(x,y)) = g(i(x),i(y)) )

Note that this property is essentially the same as the requirement of semantics preservation, which is thereby immediately satisfied. If the axiom is not satisfied, the above form of inheritance leads to a (justified) inconsistency in the class description. So, when using this form of inheritance, we have to verify that the subsorts defined by inheritance are closed with respect to the inherited functions. In the special case that all occurrences of the sorts being inherited are replaced by subsorts in the domain and range types of the inherited operations, this is the same as saying that we have to verify that the subsorts together with the inherited functions constitute a subalgebra of the algebra from which they are inherited.
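The closure obligation can be tried out on the integers and natural numbers example. The Python sketch below is an informal model under stated assumptions: naturals are represented directly as non-negative Python ints, so the embedding is the identity, None stands for an undefined result, and the names i_inv, inherit and closed are invented for the illustration. Checking a finite sample is of course only a test, not a verification of the axiom.

```python
import operator


def i(x):
    """Embedding of T (naturals) into V (integers)."""
    return x


def i_inv(v):
    """The unique z:T with i(z) = v, or None if there is none."""
    return v if v >= 0 else None


def inherit(g):
    """Models FUNC g : T # T -> T IS g : V # V -> V as a partial function."""
    return lambda x, y: i_inv(g(i(x), i(y)))


def closed(g, sample):
    """Tests the axiom g(i(x),i(y))! => g(x,y)! on a finite sample of T."""
    g_t = inherit(g)
    return all(g_t(x, y) is not None for x in sample for y in sample)


add_t = inherit(operator.add)
sub_t = inherit(operator.sub)
sample = range(10)
```

Addition passes the test and behaves homomorphically with respect to i, whereas subtraction fails it (for instance, 2 - 5 has no counterpart in T), so inheriting subtraction with functionality T # T -> T would make the class description inconsistent.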

We note that there are several strongly typed object-oriented programming languages that also have facilities for changing the domain and range type of an inherited operation in other than the standard ways, usually by providing some form of redefinition of operations (cf. 'definition by association' in Eiffel [9]). However, these mechanisms are often rather ad hoc and are complicated by the asymmetric treatment of the arguments of operations.

Having defined inheritance for functions, the extension to predicates is obvious. Suppose, for example, that we have the following predicate:

PRED r : V # V

An inherited version of this predicate might be defined by:

PRED r : T # T IS r : V # V

which is an abbreviation for:

PRED r : T # T

PAR x:T,y:T

DEF r(i(x),i(y))

The problem with inherited sorts in the range of the operation does not occur here, of course. Note that the operation at the left-hand side of the IS symbol need not necessarily have the same identifier as the operation at the right-hand side (even though that is the case in all the examples above).

3.4 Inheritance of procedures

The extension of the inheritance mechanism to procedures introduces an interesting new aspect: that of side effects in general, and dynamic object creation in particular. In traditional object-oriented languages operations for creating new objects are often treated differently from all other operations. For example, in Eiffel the 'Create' operation cannot be inherited. The reason is that a create operation does not have the functionality of a regular 'method' since the standard first argument (of the object executing the method) is lacking. This is sometimes expressed by saying that object creation is an operation of the associated class rather than of some object. As we shall see, we can treat creation operations in COLD as normal operations, making the treatment of inheritance uniform over all kinds of procedures.

Suppose that we have the following two procedures (with everything else as before):

PROC p : V -> W

PROC q : V # V -> V

Let us consider p first, which is the 'simple' case. It will not come as a surprise that the inherited procedure definition:

PROC p : T -> W IS p : V -> W

is an abbreviation for:

PROC p : T -> W

PAR x:T

DEF p(i(x))

It is again easy to see that the requirement of semantics preservation is satisfied.

As is the case with functions, the occurrence of an inherited sort in the range of a procedure leads to the requirement that the sort is closed with respect to the procedure. This requirement can be satisfied in the same way as for functions, except that we have to cope with the fact that a procedure may have side effects. The following inherited procedure definition:

PROC q : T # T -> T IS q : V # V -> V

is therefore equivalent to the following definitions in COLD-K:

PROC q : T # T -> T

PAR x:T,y:T

DEF LET w:V; w := q(i(x),i(y));

SOME z:T (i(z) = w )

AXIOM FORALL x:T,y:T ((<q(i(x),i(y))> TRUE) => <q(x,y)> TRUE )

Here the assertion <p(x)> TRUE for procedures p is the counterpart of the definedness assertion f(x)! for functions. It states that the call p(x) will terminate and that the result of the procedure is defined. The argument that the requirement of semantics preservation is satisfied now consists of two parts. Firstly, q : T # T -> T will yield the 'same' values as q : V # V -> V, which can be inferred in the same way as with functions. Secondly, q : T # T -> T will have the 'same' side effects as q : V # V -> V, as can be inferred from the defining body of q : T # T -> T.

The above inherited definition implies that if the call q(i(x),i(y)) of q : V # V -> V delivers a new object of sort V as its result, then the call q(x,y) of q : T # T -> T delivers a new object of sort T as its result. This can be seen by considering what happens when executing the call q(x,y). According to the definition of q : T # T -> T, the call q(i(x),i(y)) is executed, yielding a new object w:V. The above axiom and the fact that the call q(i(x),i(y)) terminates imply that the call q(x,y) will also terminate. Hence q(x,y) will yield some z:T with i(z) = w. Since sort T is dependent on V, the call q(i(x),i(y)) may modify sort T, implying that z may be 'old' or 'new'. If z is old, then due to the fact that i is a constant function and i(z) = w, w would also be an old object. From this contradiction we infer that z is new. So the inheritance of procedures yielding 'new' objects constitutes no problem at all and does exactly what we expect from it.

As a final remark we note that the analogy of inheritance for functions and procedures goes so far that the syntactic sugar used for inherited procedure definitions could be used in exactly the same form for inherited function definitions (but not the other way around). For example, if we replace q by g in the COLD-K equivalent of the inherited definition of q and also replace the keyword PROC by FUNC, we get a definition that is equivalent to that of the function g given in Section 3.3:

FUNC g : T # T -> T

PAR x:T,y:T

DEF LET w:V; w := g(i(x),i(y));

SOME z:T (i(z) = w )

AXIOM FORALL x:T,y:T ((<g(i(x),i(y))> TRUE) => <g(x,y)> TRUE )

It is only due to the absence of side effects that we can simplify this in the case of functions.

4 Refinement of the mechanism

4.1 Multiple inheritance

The generalisation of the above scheme so as to allow multiple inheritance of sorts seems quite straightforward. For example, if we want to define the sort T by inheritance of the sorts V1 and V2, we can describe this by:

SORT T IS V1, V2

which stands for:

SORT T DEP V1, V2

FUNC i1 : T -> V1

FUNC i2 : T -> V2

AXIOM FORALL x:T,y:T

( i1(x) = i1(y) <=> x = y ; i2(x) = i2(y) <=> x = y )

The corresponding picture is given by Figure 2.

There is a catch, though, in the case that the sorts V1 and V2 have been defined themselves by inheritance and have a common 'ancestor'. For example, let V1 and V2 be defined by:

Figure 2: Multiple inheritance of V1 and V2 by T

SORT V1 IS W

SORT V2 IS W

with the following associated embeddings:

FUNC j1 : V1 -> W

FUNC j2 : V2 -> W

then there is nothing that prevents the object j1(i1(x)) from being different from the object j2(i2(x)) for some x:T. In other words, an object x of sort T could in principle be identified with two different objects of sort W, which is of course not what we want (see Figure 3 for the desired situation). It would be in conflict with the sort inheritance requirement stating that the inheritance relation on sorts can be viewed as a subset relation. We can prevent this by imposing an additional requirement on the embeddings associated with sorts defined by multiple inheritance of sorts with a common ancestor. The requirement is rather obvious and, in this case, looks as follows:

FORALL x:T ( j1(i1(x)) = j2(i2(x)) )
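On a finite model this requirement can be checked mechanically. The sketch below is an illustration only: the embeddings are represented as Python dictionaries, and the object names (`t0`, `a0`, `b0`, ...) as well as the helper names `is_injective` and `rhombic` are hypothetical.

```python
def is_injective(embedding):
    """An embedding (dict) is injective if no two keys share a value."""
    values = list(embedding.values())
    return len(set(values)) == len(values)


def rhombic(sort_t, i1, i2, j1, j2):
    """Check FORALL x:T ( j1(i1(x)) = j2(i2(x)) ) on a finite sort T."""
    return all(j1[i1[x]] == j2[i2[x]] for x in sort_t)


# Hypothetical finite model of the sorts T, V1, V2 and W.
sort_t = ["t0", "t1"]
i1 = {"t0": "a0", "t1": "a1"}          # i1 : T -> V1
i2 = {"t0": "b0", "t1": "b1"}          # i2 : T -> V2
j1 = {"a0": 0, "a1": 1, "a2": 2}       # j1 : V1 -> W
j2_good = {"b0": 0, "b1": 1}           # j2 : V2 -> W, agrees with j1 on T
j2_bad = {"b0": 1, "b1": 0}            # injective, but identifies t0 with two W objects
```

With `j2_bad` every embedding is still injective, yet `t0` would be identified with both `0` and `1` in W; this is exactly the situation that the additional requirement rules out.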

The next point to be considered is the multiple inheritance of operations. Suppose that sort T is defined as above by multiple inheritance of the sorts V1 and V2 and that we have the following two functions:

FUNC g : V1 -> V1

FUNC h : V2 -> V2

then we could be tempted to define a function f : T -> T by multiple inheritance of g and h:

FUNC f : T -> T IS g : V1 -> V1, h : V2 -> V2

Figure 3: Multiple inheritance of sorts with a common ancestor

The basic idea here is that f is the same as the restrictions of g and h to the subsets of V1 and V2 corresponding to T. As before, the requirement is that g and h are closed on the subsets of V1 and V2 corresponding to T. An additional requirement is that the restrictions of g and h to T are 'compatible', otherwise we cannot define f as being semantically equivalent to the restrictions of both g and h. This requirement amounts to the fact that the following property should hold:

FORALL x:T,y:T,z:T ( i1(z) = g(i1(x),i1(y)) <=> i2(z) = h(i2(x),i2(y)) )

If g and h are unrelated operations this may be a requirement that is hard to satisfy (by exploiting the freedom available in the selection of the embeddings i1 and i2), but in that case one may wonder why anyone would want to define f by multiple inheritance of g and h. Though this road to introducing multiple inheritance of operations could certainly be followed, the full generality provided by it is hardly ever necessary in practice. (There are few, if any, object-oriented programming languages that provide such a general scheme.) Instead, we shall propose a somewhat more restricted approach to the multiple inheritance of operations that will make it possible for the language implementation to enforce the second requirement automatically.

The above problem lies in the fact that the operations g and h may be completely unrelated. Now consider the following example. Suppose that we have defined the natural numbers Nat with a number of operations such as 0 : -> Nat, 1 : -> Nat and add : Nat # Nat -> Nat (addition). Suppose also that we have defined the sorts Nat2 and Nat3 of the natural numbers divisible by 2 and 3, respectively, by inheritance of Nat:

SORT Nat2 IS Nat

FUNC 0 : -> Nat2 IS 0 : -> Nat

FUNC 2 : -> Nat2 IS 2 : -> Nat

FUNC 4 : -> Nat2 IS 4 : -> Nat

FUNC 6 : -> Nat2 IS 6 : -> Nat

FUNC add : Nat2 # Nat2 -> Nat2 IS add : Nat # Nat -> Nat

AXIOM FORALL m:Nat2 ( mod(i2(m),2) = 0 )

...

SORT Nat3 IS Nat

FUNC 0 : -> Nat3 IS 0 : -> Nat

FUNC 3 : -> Nat3 IS 3 : -> Nat

FUNC 6 : -> Nat3 IS 6 : -> Nat

FUNC add : Nat3 # Nat3 -> Nat3 IS add : Nat # Nat -> Nat

AXIOM FORALL m:Nat3 ( mod(i3(m),3) = 0 )

...

Here i2 and i3 are the embeddings from Nat2 to Nat and Nat3 to Nat, respectively, and mod : Nat # Nat -> Nat is the modulus function. We could now define the sort Nat6 of the natural numbers divisible by 6 by multiple inheritance of Nat2 and Nat3. It then makes sense to also define 0, 6 and addition on Nat6 by multiple inheritance of the corresponding operations on Nat2 and Nat3:


SORT Nat6 IS Nat2, Nat3

FUNC 0 : -> Nat6 IS 0 : -> Nat2, 0 : -> Nat3

FUNC 6 : -> Nat6 IS 6 : -> Nat2, 6 : -> Nat3

FUNC add : Nat6 # Nat6 -> Nat6 IS add : Nat2 # Nat2 -> Nat2, add : Nat3 # Nat3 -> Nat3

Here we see, at least intuitively, that the compatibility requirement for the operations inherited in a multiple operation definition causes no problems. The add operations on Nat2 and Nat3, for example, come from the same ancestor: both have been defined by inheritance of add on Nat, so they are the 'same' when restricted to Nat6.

Of course, we could also have defined add on Nat6 by the single inheritance of either add on Nat2 or add on Nat3 (or even the single inheritance of add on Nat), but that would not have been in the spirit of the object-oriented approach. As defined above, the information is passed to the language implementation that the call add(x,y) with x:Nat6 and y:Nat6 may be implemented by using an implementation (if available) of any one of the following operations:

add : Nat # Nat -> Nat




(In an implementation, the embeddings typically reduce to dummy operations and objects of a subsort are represented directly by objects of the supersort. So operations on objects of sort Nat can be applied directly to objects of sort Nat2, etc.)
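The Nat2/Nat3/Nat6 example can be played through in Python under the representation suggested in the remark above: objects of a subsort are represented directly by objects of the supersort, so every embedding becomes the identity. The helper names (`make_subsort`, `embed`, `sample`) are hypothetical, and the check only covers a finite sample of naturals rather than proving anything.

```python
def make_subsort(divisor):
    """Membership test for the naturals divisible by `divisor`."""
    return lambda n: n >= 0 and n % divisor == 0


in_nat2 = make_subsort(2)
in_nat3 = make_subsort(3)
in_nat6 = make_subsort(6)


def add(m, n):
    """add : Nat # Nat -> Nat, the common ancestor of all inherited versions."""
    return m + n


def embed(n):
    """Every embedding (Nat6 -> Nat2, Nat2 -> Nat, ...) is a dummy operation here."""
    return n


sample = range(0, 60)                      # finite sample, an assumption of this sketch
nat6 = [n for n in sample if in_nat6(n)]

# A Nat6 object is precisely an object that is both a Nat2 and a Nat3 object.
nat6_is_intersection = all(in_nat6(n) == (in_nat2(n) and in_nat3(n)) for n in sample)

# Closure: adding two embedded Nat6 objects yields an object that is again in Nat6.
closed = all(in_nat6(add(embed(x), embed(y))) for x in nat6 for y in nat6)
```

Since both inherited versions of `add` are implemented by the same ancestor operation, the compatibility requirement holds trivially in this model; only closure has to be checked.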

The solution we propose here is therefore that multiple inheritance of operations is allowed only if the operations at the right-hand side of the IS symbol have a common ancestor (possibly going back several generations), which is something that can be checked statically by the language implementation. Returning to the definition by multiple inheritance of the function f, and assuming that g and h indeed have a common ancestor, we can view the definition:

FUNC f : T -> T IS g : V1 -> V1, h : V2 -> V2

as an abbreviation for:

FUNC f : T -> T

PAR x:T

DEF ( SOME z:T (i1(z) = g(i1(x)) )

| SOME z:T (i2(z) = h(i2(x)) ) )

AXIOM FORALL x:T

( g(i1(x))! => f(x)!

; h(i2(x))! => f(x)! )

Here we have used the choice operator to indicate that it does not matter whether the result of f is obtained by means of g or by means of h: both alternatives yield the same result, due to the closure property and the common ancestor requirement. The fact that the operation inheritance requirement, and in particular the requirement of semantics preservation, is satisfied can now be checked in a simple way. The generalisation of single inheritance to multiple inheritance for predicates and procedures follows exactly the same lines as for functions, and will therefore be omitted.

As a final remark we note that the notion of inheritance introduced in this paper can be characterised in algebraic terms as follows. It amounts to the existence of a monomorphism from the sorts and operations being defined by inheritance to the sorts and operations being inherited (where, in the case of multiple inheritance, there are several such monomorphisms). The monomorphism is represented by the embedding associated with the inherited definition. Referring to the embedding problem for objects of inherited sorts with common ancestors (see Figure 3), we note that the following property holds: if monomorphisms exist from T to V1, T to V2, V1 to W and V2 to W, then also corresponding monomorphisms i1, i2, j1, j2 exist, such that j1(i1(x)) = j2(i2(x)) for all x.

4.2 Omitting the embeddings

For the practical applicability of the approach described here it is essential that the embeddings remain hidden from the user as far as possible. The simplest way to achieve this is to allow the user to omit applications of the embeddings in expressions completely. It should be possible to use an object x of sort T in any context where an object of sort V is expected, provided that T has been defined (directly or indirectly) by inheritance of V. The language implementation should insert the necessary applications of the embedding(s) leading from T to V. The consequences of this approach are discussed below.

First we note that the omission of the embeddings is somewhat similar to the omission of type information from operations in COLD-K. In COLD-K the name of the function f : A -> B consists not only of the identifier f but also of the sort names A (the domain type of f) and B (the range type of f). So the functions f : A -> C and f : C -> B are different from f : A -> B. In the concrete syntax of COLD-K the user may omit the domain and range types and simply write f, even if f is overloaded. The language implementation will reconstruct the domain and range types of f from the context, if possible. Ambiguities are reported by the language implementation and can be repaired by the user by inserting explicit type information. Thus, in spite of the presence of overloading, the language can be kept strongly typed.

The important question is whether the strong typing requirement can be satisfied in this case as well, particularly in combination with overloading. In order to show that this is indeed the case, let us assume that we are in a context where we have a number of sorts and operations together with an inheritance relation ≤ defined on them. We note that we have already required that the inheritance relation on sorts constitutes a partial order, implying that the inheritance relation on operations is also a partial order. The strong typing requirement implies that a unique type can be associated with every expression and that a unique operation name (including domain and range types) can be associated with every identifier denoting an operation. Once this association has been accomplished, the insertion of the applications of the embeddings is simple. For example, suppose we have the sorts T, V1, V2 and W with the inheritance relation and embeddings shown in Figure 3, and the following expression:

f(x)

If f has domain type V1 and range type W, and x has type T, then clearly there is only one way to insert the embeddings:

f(i1(x))

If f has domain and range type W, and x has type T, then there are two ways to insert the embedding:

f(j1(i1(x)))

f(j2(i2(x)))

Due to the 'rhombic property' (see Figure 3) of the embeddings, it does not matter which one is chosen. A more elegant solution than choosing just one is to introduce the embedding from T to W explicitly by the following definition:

FUNC ji : T -> W

PAR x:T

DEF ( j1(i1(x)) | j2(i2(x)) )

and insert an application of ji in f(x). The above indicates that, due to the fact that the inheritance relation is a partial order, insertion of the applications of the embeddings is unambiguous.
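How an implementation might insert the omitted embeddings can be sketched as a search through the inheritance relation. The following Python fragment is an illustration under assumptions, not the COLD-K type checker: the table `DIRECT` encodes the sorts and embeddings of Figure 3, and `embedding_chain` and `coerce` are hypothetical helper names. When several chains exist, the sketch simply takes the first one found, which is harmless only because of the rhombic property.

```python
# Direct inheritance steps: sort -> list of (inherited sort, embedding name).
DIRECT = {
    "T": [("V1", "i1"), ("V2", "i2")],
    "V1": [("W", "j1")],
    "V2": [("W", "j2")],
    "W": [],
}


def embedding_chain(src, dst):
    """Return the embedding names leading from src to dst (innermost first).

    Returns None if src does not inherit from dst. The recursion terminates
    because the inheritance relation is a partial order (no cycles).
    """
    if src == dst:
        return []
    for parent, name in DIRECT.get(src, []):
        rest = embedding_chain(parent, dst)
        if rest is not None:
            return [name] + rest
    return None


def coerce(expr, src, dst):
    """Wrap the expression text `expr` of sort src so that it has sort dst."""
    chain = embedding_chain(src, dst)
    if chain is None:
        raise TypeError(f"{src} does not inherit from {dst}")
    for name in chain:
        expr = f"{name}({expr})"
    return expr
```

For an argument `x` of type T, `coerce("x", "T", "V1")` produces the text `i1(x)`, and `coerce("x", "T", "W")` produces `j1(i1(x))`, matching the first of the two alternatives listed above.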

The problem now boils down to the association of a unique type with every expression and a unique operation name with every identifier denoting an operation. In order to accomplish this association only a slight modification of the overloading resolution scheme used in the COLD-K type checker is necessary. This scheme is essentially the same as that described in [1], though somewhat more general due to the more liberal form of overloading in COLD-K. The main modifications are:

1. The sets of 'possible types' of expressions that play a role in the overloading resolution scheme should be closed in the following sense. If T is contained in the set of possible types of an expression, then so is any type V with T ≤ V.

2. Instead of requiring that each expression has a unique type, it should be required that each expression has a unique minimal type, which is the type to be associated with the expression.

Similar modifications apply to the manipulation of the sets of 'possible operation names'. It may seem a problem that the sets of possible types of expressions may become very large, in particular because types may be cartesian products of sorts. This problem can be solved by representing the sets of possible types by the sets of their minimal elements, which can be done because all sets of types are closed (in the sense defined above).
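The two modifications and the compact representation can be illustrated on the sorts of Figure 3. This is a sketch under assumptions: types are restricted to single sorts (no cartesian products), the table `PARENTS` and all function names are hypothetical, and the actual overloading resolution scheme of [1] is not reproduced.

```python
# Direct inheritance relation of Figure 3: sort -> set of directly inherited sorts.
PARENTS = {"T": {"V1", "V2"}, "V1": {"W"}, "V2": {"W"}, "W": set()}


def ancestors(sort):
    """All sorts V with sort <= V (reflexive and transitive)."""
    result = {sort}
    for parent in PARENTS[sort]:
        result |= ancestors(parent)
    return result


def leq(a, b):
    """The inheritance relation a <= b."""
    return b in ancestors(a)


def upward_closure(sorts):
    """Modification 1: close a set of possible types under 'T in set, T <= V'."""
    closed = set()
    for s in sorts:
        closed |= ancestors(s)
    return closed


def minimal_elements(sorts):
    """Compact representation of a closed set: its minimal elements."""
    return {s for s in sorts if not any(t != s and leq(t, s) for t in sorts)}


def unique_minimal_type(possible):
    """Modification 2: the type of an expression is its unique minimal type."""
    mins = minimal_elements(possible)
    if len(mins) != 1:
        raise TypeError(f"ambiguous expression, minimal types: {sorted(mins)}")
    return next(iter(mins))
```

A closed set can always be recovered from its minimal elements by taking the upward closure again, which is why storing only the minimal elements loses no information.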


We conclude that the complete omission of the embeddings from expressions causes no problems with respect to strong typing, even in the presence of overloading. Expressions involving inherited sorts and operations can be translated directly to COLD-K by first performing the modified type checking scheme and then inserting the embeddings. In other words: inheritance, overloading and strong typing fit well together.

4.3 Inheritance of classes

The approach to inheritance described so far would still be rather clumsy from the user's point of view, because an explicit 'inherited definition' has to be introduced for every sort and operation defined by inheritance. In the traditional approach to inheritance a complete class is inherited, thus defining a new sort and a set of associated operations at one stroke. By means of a simple example we shall discuss a shorthand that makes it possible to do the same thing. Suppose we have defined the notion of a 'circle' as a graphical object by a class description CIRCLE, defining the following sorts and operations:

SORT Circle

FUNC centre : Circle -> Point

FUNC radius : Circle -> Real

PROC new : Real -> Circle

PROC move : Circle # Point ->

The intended meaning of the operations is as follows. By means of the operation new(r) a new circle with radius r and undefined centre can be created. The centre and radius of a circle c are given by the instance variables centre(c) and radius(c). The centre of the circle c can be set to m by the procedure move(c,m), while the radius cannot be changed (so radius(c) is in fact an 'instance constant'). We want to define the notion of a 'coloured circle', referred to as a 'blob' in the sequel, by inheriting from the class CIRCLE and adding an additional 'colour' attribute to circles. In the approach indicated so far the inheritance from CIRCLE can be done by means of the following block of inherited definitions:

SORT Blob IS Circle

FUNC centre : Blob -> Point IS centre : Circle -> Point

FUNC radius : Blob -> Real IS radius : Circle -> Real

PROC new : Real -> Blob IS new : Real -> Circle

PROC move : Blob # Point -> IS move : Circle # Point ->

This block can be obtained mechanically from the class description CIRCLE: the names at the right-hand side of the IS symbols are all sort and operation names exported by CIRCLE and containing the sort name Circle, and the names at the left-hand side are obtained from the names at the right-hand side by replacing Circle everywhere by Blob. We could support this by the following notation:

INHERIT CIRCLE[Circle->Blob] IN

CLASS ... END

which is a shorthand for:


IMPORT CIRCLE INTO

CLASS

inherited definitions as above

END

The general form of this notation, and the extension to multiple inheritance, can be derived in a straightforward way from this example.
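The mechanical derivation of the block of inherited definitions from the exported names of CIRCLE can be sketched as a textual renaming. The Python below is only an illustration: the exported names are kept as plain strings, and `CIRCLE_EXPORTS` and `inherited_definitions` are hypothetical names, not part of any COLD tool.

```python
import re

# Names exported by the class description CIRCLE, as given in the text.
CIRCLE_EXPORTS = [
    "SORT Circle",
    "FUNC centre : Circle -> Point",
    "FUNC radius : Circle -> Real",
    "PROC new : Real -> Circle",
    "PROC move : Circle # Point ->",
]


def inherited_definitions(exports, old, new):
    """Expand INHERIT ...[old->new] into a list of 'inherited definitions'."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    result = []
    for declaration in exports:
        if not pattern.search(declaration):
            continue  # only names containing the inherited sort are inherited
        keyword, name = declaration.split(" ", 1)
        renamed = pattern.sub(new, name)  # replace the sort name everywhere
        result.append(f"{keyword} {renamed} IS {name}")
    return result


blob_definitions = inherited_definitions(CIRCLE_EXPORTS, "Circle", "Blob")
```

Applied to the exports of CIRCLE with the renaming Circle->Blob, this reproduces the five inherited definitions listed earlier in this section.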

As already remarked in Section 3.3, there are many different ways in which operations can be inherited and the notation outlined above picks out one. Indeed the chosen form of inheritance is a frequently used one, but it makes it impossible, for example, to inherit the operation next : Circle -> Circle as next : Blob -> Circle or as link : Blob -> Blob, as would be possible with the more basic 'inherited definitions'. Hence some refinements must be made in the above notation, but we shall refrain from doing this here.

4.4 Redefinition and dynamic binding

Redefinition and dynamic binding are two other characteristic features of object-oriented programming, which will be discussed below. Redefinition refers to the possibility to 'redefine' an inherited operation. The reason for redefining an inherited operation is usually that the operation defined by inheritance can be implemented in a more efficient way than the (more general) operation being inherited. A typical example is provided by the 'distance' operation that computes the minimal distance between two graphical objects, called 'figures' in the sequel. In the class FIGURE defining the sort Figure, the distance operation could have been defined as:

FUNC distance : Figure # Figure -> Real

PAR f:Figure,g:Figure

DEF computation of the minimal distance of f and g from the contours of f and g

Due to the general nature of figures, the computation of the minimal distance of two arbitrary figures may be quite complicated. Now suppose that we define the notion of Circle by inheritance of Figure. Then the distance operation could be redefined more efficiently as follows:

FUNC distance : Circle # Circle -> Real

PAR c:Circle,d:Circle

DEF max(0,distance(centre(c),centre(d))-radius(c)-radius(d))

where the distance operation on points is supposed to be defined somewhere else. The redefined operation distance : Circle # Circle -> Real should of course satisfy the same requirements as when it was defined directly by inheritance of distance : Figure # Figure -> Real. In particular, it should satisfy the requirement of semantics preservation. In contrast with the basic form of inheritance, the verification of this requirement will generally have to be done by hand, since it cannot be done automatically.

The call distance(c,d), where c and d have type Circle, can now be implemented directly in terms of the efficient code defined above, instead of the less efficient code associated with the general distance function. Yet the redefined operation provides more opportunities for improving the efficiency than this. When executing a program, the language implementation could tag each newly created object with its initial sort. Since an object of sort T may be used in any context where an object of sort V with T ≤ V is expected, the sort of the object may change dynamically to a sort 'higher' in the inheritance hierarchy. Now suppose that f and g are object names of type Figure. Though syntactically speaking f and g have type Figure, it could be the case that dynamically they both denote objects with an initial sort ≤ Circle. In that case the call distance(f,g) could be implemented as a call of the function distance : Circle # Circle -> Real, though the decision to do so cannot be made at compile time. (As remarked earlier, in the language implementation the embeddings typically reduce to dummy operations.) The fact that the language implementation can choose the piece of code associated with the call of an operation dynamically, depending on the initial sorts of the actual parameters, is what we mean by dynamic binding.
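Dynamic binding on the initial sorts of all actual parameters can be imitated in Python. The sketch below rests on several assumptions that are not in the text: a figure is represented as a union of discs, a circle as a figure with exactly one disc, and the Python class of an object plays the role of its initial sort. The names `distance_figure`, `distance_circle`, `make_circle` and `calls` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Figure:
    discs: tuple  # ((x, y, radius), ...): hypothetical representation


@dataclass(frozen=True)
class Circle(Figure):
    pass  # assumed invariant: exactly one disc


def make_circle(x, y, r):
    return Circle(discs=((x, y, r),))


def distance_figure(f, g):
    """General definition: minimise over all pairs of discs."""
    gaps = (math.hypot(a[0] - b[0], a[1] - b[1]) - a[2] - b[2]
            for a in f.discs for b in g.discs)
    return max(0.0, min(gaps))


def distance_circle(c, d):
    """Redefinition for circles: max(0, distance of centres - radii)."""
    (cx, cy, cr), = c.discs
    (dx, dy, dr), = d.discs
    return max(0.0, math.hypot(cx - dx, cy - dy) - cr - dr)


calls = []  # records which piece of code was selected, for illustration


def distance(f, g):
    """Select the code at run time from the sorts of both arguments."""
    if isinstance(f, Circle) and isinstance(g, Circle):
        calls.append("circle")
        return distance_circle(f, g)
    calls.append("figure")
    return distance_figure(f, g)
```

Note that the choice depends on both arguments, not only on the first one, and that the redefined code is acceptable only because it returns the same values as the general code on circles.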

Redefinition and dynamic binding as defined here are closely related to the notions with the same name in Eiffel [9]. There is a difference, though, in that in Eiffel the 'precondition' of a redefined operation f may be weaker than that of the original operation g while the 'postcondition' of f may be stronger than that of g. This implies that f is no longer semantically the same as g. For example, the call f(x) could terminate while in the same situation the call g(x) would lead to abortion (because the stronger precondition of g is violated). Of course, when calling f the precondition of g could be checked instead of that of f, but then it would not make sense to weaken the precondition of f (since it is not used anyway). The fact that the requirement of semantics preservation is violated by the Eiffel approach to redefinition is considered undesirable. Semantics preservation is essential to the approach to inheritance described in this paper, and any relaxation of this principle would complicate matters unnecessarily.

Looking more closely at the Eiffel approach to redefinition, we see that it is actually a mixture of specialisation and generalisation aspects. On the one hand, redefinition is used to exploit the fact that the operation being redefined operates on more specialised objects than the original operation. On the other hand, redefinition is used to make the redefined operation more generally applicable than the (inherited version of the) old operation. The first aspect is covered by the notion of redefinition as defined here. The second aspect is covered in COLD by the notion of implementation associated with components. A component in COLD is a unit of design that associates a specification and an (optional) implementation with a class name. Both specification and implementation are complete class descriptions in COLD, each specifying a class. The typical (but not necessary) situation is that the specification is axiomatic (e.g. using pre/postcondition style) while the implementation is algorithmic. The requirement is that the implementation of a component 'satisfies' its specification, which in the typical case amounts to verifying that the implementation satisfies the axioms of the specification. Suppose, for example, that p is a procedure associated with the component C and described in the specification of C by the single axiom:


A(x) => [ p(x) ] B(x)

This axiom states that if the precondition A(x) holds, then after the call p(x) the postcondition B(x) will hold. We are now free to describe p in the implementation of C as a procedure with the following property (to be derived from the code of p):

A(x) OR A'(x) => [ p(x) ] B(x) AND B'(x)

Here we see how the weakening of preconditions and the strengthening of postconditions can be done in COLD, thus making the implementation of p more generally applicable than can be derived from its specification. The difference with Eiffel is that this 'redefinition' of an operation ('re-implementation' is perhaps a better word) is always done with respect to the specification of the component to which it belongs. Strictly speaking, a redefined operation with a weakened precondition in Eiffel is not even an implementation of the original operation since it behaves observably differently from the original operation when applied in a situation where the precondition of the original operation does not hold. A precondition in Eiffel is interpreted as a condition that should hold when the operation is called, in contrast with the notion used above that is only a sufficient and not a necessary condition (thus providing implementation freedom).

A related mixture of specialisation and generalisation aspects can be found in the notion of redefinition associated with the 'subtyping' approach to inheritance described in e.g. [10,3]. In this approach an operation f^T : T # A_1^T # ... # A_n^T -> B^T associated with objects of sort T and obtained by redefining the operation f^V : V # A_1^V # ... # A_n^V -> B^V associated with objects of sort V, should satisfy the following conditions:

A_1^T ≥ A_1^V, ..., A_n^T ≥ A_n^V, B^T ≤ B^V

As far as the A and B parameters are concerned this is essentially the same contravariance (characteristic for an implementation relation) that we have seen before, though now at the type level rather than at the pre/postcondition level. The A and B parameters are concerned with the generalisation and the T parameter with the specialisation aspects of the operation f^T. Though the special treatment of the first parameter of f^T may seem natural from the traditional object-oriented point of view, this notion of redefinition leads to some strange consequences. For example, it is not possible to redefine the distance operation as:

FUNC distance : Circle # Circle -> Real

but only as:

FUNC distance : Circle # Figure -> Real

This implies that only the special properties of the first argument can be exploited.
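The effect of the subtyping conditions on the distance example can be checked with a small Python sketch. It assumes a single-inheritance table `PARENT` containing just the sorts of the example, and represents an operation name as a triple (first argument sort, remaining argument sorts, result sort); both the table and the function names are hypothetical.

```python
# Hypothetical inheritance table: sort -> directly inherited sort (or None).
PARENT = {"Circle": "Figure", "Figure": None, "Real": None}


def leq(a, b):
    """True if a == b or a inherits, directly or indirectly, from b."""
    while a is not None:
        if a == b:
            return True
        a = PARENT[a]
    return False


def allowed_by_subtyping(new, old):
    """Check the redefinition conditions of the 'subtyping' approach.

    The first argument may be specialised, the remaining arguments may only
    be generalised (contravariance) and the result may only be specialised.
    """
    new_first, new_args, new_result = new
    old_first, old_args, old_result = old
    return (len(new_args) == len(old_args)
            and leq(new_first, old_first)
            and all(leq(o, n) for n, o in zip(new_args, old_args))
            and leq(new_result, old_result))


original = ("Figure", ["Figure"], "Real")
circle_circle = ("Circle", ["Circle"], "Real")
circle_figure = ("Circle", ["Figure"], "Real")
```

Under these conditions the redefinition to Circle # Circle is rejected because its second argument is more specialised than in the original, whereas Circle # Figure is accepted.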

In our view the specialisation and generalisation aspects of redefinition should be strictly separated. Redefinition in COLD terms covers the first aspect, thus ensuring that inheritance is semantics-preserving. The second aspect is covered by the notion of implementation already included in COLD. The implementation relation is simply not semantics-preserving: the preservation of properties works in only one direction.


5 Conclusion

Inheritance is defined in this paper as a semantic notion based on the principle of semantics preservation. This contrasts with the notion of inheritance in most object-oriented programming languages such as Smalltalk [6] and C++ [11], where inheritance is very much implementation-oriented and primarily acts as a code sharing mechanism. As we have indicated, the approach described here also supports code sharing, as well as features such as multiple inheritance, redefinition and dynamic binding. Furthermore, we believe that it is considerably simpler than most existing inheritance mechanisms, as demonstrated by the simple way in which it can be added as a form of syntactic sugar to COLD-K.

We have presented inheritance in the context of a strongly typed language with overloading, showing that our notion of (multiple) inheritance can safely be combined with strong typing and overloading. Other examples of object-oriented languages featuring multiple inheritance and strong typing are Eiffel [9], Trellis/Owl [10] and Galileo [2,4]. The Eiffel approach to inheritance comes closest to ours, though there are a number of fundamental differences of which we shall mention just a few. First of all, the approach described here applies to dynamic as well as static sorts, thus permitting a uniform treatment of all sorts of objects. In Eiffel a distinction is made between 'simple types' and 'class types', which are treated in completely different ways. Secondly, our inheritance mechanism applies uniformly to all externally visible 'features' (speaking in Eiffel terms) of a class. In Eiffel 'class routines' (such as 'Create') are treated differently from all other features. Thirdly, in the approach described here we clearly distinguish between inheritance as a two-way semantics-preserving operation and implementation as a one-way semantics-preserving operation (supported by the notion of component in COLD). In Eiffel these two things are mixed up, giving rise to problems with pre/postconditions of operations. Complications of this kind are virtually absent in the approach described here.

A point that has not been discussed so far is the classical problem of name clashes in combination with multiple inheritance. The reason is that there is no such problem in the case of COLD. The structuring mechanisms of COLD (modularisation, parameterisation and component construction) include a fundamental solution to the name clash problem through the use of origins [8] and these mechanisms are orthogonal to the inheritance mechanism described in this paper. This implies, for example, that undesired name clashes can be avoided by the existing renaming mechanism and that repeated inheritance is unproblematic.

The addition of inheritance to COLD in the manner described in this paper turns COLD into a fully-fledged object-oriented language. What one obtains is not just an object-oriented programming language, but an integrated object-oriented design language, i.e. an integrated specification and programming language supporting the methodology of object-oriented design. Furthermore, the language unifies the object-oriented approach with the classical abstract data type approach based on algebraic specifications and to a certain extent even with logic programming (as supported by the 'inductive definitions' in COLD). This brings closer one of the main goals of the METEOR project: providing an integrated formal approach to industrial software development. The work on defining and implementing a user-oriented version of COLD supporting the notion of inheritance described (and several more user-oriented features) is currently in progress.

References

[1] A.V. AHO, R. SETHI, J.D. ULLMAN, Compilers, Principles, Techniques, and Tools, Addison-Wesley (1986).

[2] A. ALBANO, L. CARDELLI, R. ORSINI, Galileo: A Strongly-Typed, Interactive Conceptual Language, ACM Transactions on Database Systems, Volume 10, Number 2 (1985), 230-260.

[3] P. AMERICA, Inheritance and Subtyping in a Parallel Object-Oriented Language, in: J. BÉZIVIN, J.-M. HULLOT, P. COINTE, H. LIEBERMAN (Eds.), ECOOP '87, European Conference on Object-Oriented Programming, LNCS 276, Springer-Verlag (1987), 234-242.

[4] L. CARDELLI, A Semantics of Multiple Inheritance, Information and Computation 76 (1988), 138-164.

[5] L.M.G. FEIJS, H.B.M. JONKERS, C.P.J. KOYMANS, G.R. RENARDEL DE LAVALETTE, Formal Definition of the Design Language COLD-K, Preliminary Edition, Technical Report, ESPRIT project 432, Doc.No. METEOR/t7/PRLE/7 (1987).

[6] A. GOLDBERG, D. ROBSON, Smalltalk-80, The Language and its Implementa- tion, Addison-Wesley (1983).

[7] H.B.M. 3ONKERS, An Introduction to COLD-K, in: M. WIRSING, J.A. BERGSTRA (Eds.), Algebraic Methods: Theory, Tools and Applications, LNCS 394, Springer-Verlag (1989), 139-205.

[8] H.B.M. 3ONKERS, Description Algebra, in: M. WIRSING, J.A. BERGSTRA (Eds.), Algebraic Methods: Theory, Tools and Applications, LNCS 394, Springer- Verlag (1989), 283-305.

[9] B. MEYER, Object-oriented Software Construction, Prentice Hall (1988). [10] C. SCHAFFERT, T. COOPER, B. BULLIS, M. KILIAN, C. WILPOLT, An In-

troductlon to Trellis/Owl, Proceedings of the ACM Conference on Object-oriented Programming Systems, Languages and Applications '86, SIGPLAN Notices, Vol- ume 21, Number 11 (1986), 9-16.

[11] B. STROUSTRUP, The C++ Programming Language, Addison-Wesley (1986). [12] B. STROUSTRUP, What is =Object-Oriented Programming"?, in: J. BEZIVIN,

J.-M. HULLOT, P. COINTE, H. LIEBERMAN (Eds.), EGOOP '87, European Conference on Object-Oriented Programming, LNCS 276, Springer-Verlag (1987), 51-70.

A Process Specification Formalism based on static COLD

J.C.M. Baeten 1 J.A. Bergstra 2, 3 S. Mauw 2 G.J. Veltink 2

1: Department of Software Technology, Centre for Mathematics and Computer Science. P.O. Box 4079, 1009 AB Amsterdam

2: Programming Research Group, University of Amsterdam P.O. Box 41882, 1009 DB Amsterdam

3: Department of Philosophy, State University of Utrecht Heidelberglaan 2, 3584 CS Utrecht

Abstract:

PSF/C is a formal specification language, based on COLD, a wide spectrum specification language developed at Philips Research, Eindhoven. In PSF/C, we can specify concurrent communicating processes. The process syntax and semantics is based on the algebraic concurrency language ACP.

Note: The first three authors are sponsored by ESPRIT contract 432, A Formal Integrated Approach to Industrial Software Development (METEOR). The first and second author are also sponsored by RACE contract 1046, Specification and Programming Environment for Communication Software (SPECS).

ACKNOWLEDGEMENT

The writers would like to thank Jan Rekers for assistance concerning the use of the SDF system.

1 INTRODUCTION

PSF/C is an experiment in language design. It is not meant as a finished language that would justify the substantial efforts of writing its necessary tools. PSF/C is a language in which we can specify concurrent communicating processes. Moreover, we have ample facilities to specify data types. These data types can occur as parameters of actions and processes. Also, we have a modular structure: data types and processes are defined in modules. Parts of the signature of these modules can be exported or hidden. The starting point for construction of PSF/C has been the wide spectrum language COLD, developed at Philips Research, Eindhoven. From COLD, we get data type specifications and the modular structure with imports and exports. On top of that, we specify processes and their interaction in the spirit of the concurrency theory ACP of [BK84]. The design objectives have been:

• to combine ACP and the static part of COLD in one language where the concrete syntax is borrowed from COLD;

• to combine processes and data in a similar fashion as is done in PSF/ASF of [MV88], where data are used as parameters of actions and process names;

• to obtain a semantic description of the language by means of a translation to COLD;

• to generate a parser for the syntax by means of the SDF system of ESPRIT project 2177 (GIPE) (see [BHK89]).


2 THE COLD-S LANGUAGE

In this section we will present COLD-S, which is obtained by dropping all dynamic features from the language COLD-K and leaving out renamings and parameterization. In fact this language is a subset of COLD-K 2 in RENARDEL DE LAVALETTE [RdL89]. The language COLD-K has been developed in the framework of ESPRIT project 432, METEOR (see FEIJS, JONKERS, KOYMANS & RENARDEL DE LAVALETTE [FJKR87] or [WB89]). COLD-K has been designed to be a so-called wide spectrum language in which it should be possible to capture the whole spectrum of software development. The language supports transformational design, in which implementations are constructed from specifications by replacing, step by step, all parts of the specification by equivalents that show more and more aspects of an executable language. Like COLD-K, COLD-S is defined by means of a translation of its grammatical constructs to the constructs of a layered formal language. Because COLD-S does not have the parameterization concept of COLD-K, the top layer of this kernel, λπ, is left out and so COLD-S only uses two instead of three layers. Expressions in COLD-S start off with terms from a special many-sorted algebra, called CA for Class Algebra, which is used for modeling modularization constructs. This algebra constitutes the middle layer. The constants used in the terms of this algebra are presentations of logical theories. The logical language used at the bottom level is based on a special infinitary logic, called MPLω. Every construct in a COLD specification corresponds with an expression in the kernel of formal languages with a well-defined semantics. COLD specifications are translated by means of attribute grammars to the kernel. In some cases, we want to restrict COLD-K in another way, by taking the algebraic subset COLD-K 2 as described in [RdL89]. We obtain COLD-K 2 by restricting all axioms in the language to the format of conditional equations, and restricting all functions to total functions.
Obviously, COLD-SA will be the static algebraic part of COLD-K.

2.1 Some Remarks on the Language

Like COLD-K, the language COLD-S consists of a number of hierarchically ordered sublanguages. This hierarchy is illustrated by the following picture:

Design Language

Scheme Language

Class Language

Definition Language

Assertion Language

In the following sections we will explain each language in some more detail.

2.1.1 The Assertion Language

In the assertion language we can write terms and assertions. The assertions in COLD-K or COLD-S are exactly the formulae of MPL, the underlying many-sorted predicate logic. In the case of COLD-K 2 we only allow (universally quantified) conditional equations.

2.1.2 The Definition Language

In the definition language we come across the items that are defined in the COLD-S language, viz.: sorts, predicates and functions. A definition can be seen in two ways: a declarative and a definitional way. The declarative part introduces the name of an item and possibly its type, while the definitional part defines the meaning of the item introduced. Not all definitions show both aspects. Sort definitions only have a declarative aspect, while axioms are purely definitional. Predicates and functions are both declarative and definitional: their meaning is defined directly, by a defining term or an assertion, or indirectly, by an inductive definition or an axiom. Inductively defined predicates and functions are defined as the smallest predicate or function satisfying the inductive definition.
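The phrase "smallest predicate satisfying the inductive definition" can be illustrated outside COLD-S by a least-fixed-point computation. The following Python sketch is only an analogy under stated assumptions: the names `least_predicate` and `even_clauses` are hypothetical, the universe is cut off to a finite set so that the iteration terminates, and nothing here reflects how COLD-S itself is processed.

```python
def least_predicate(universe, step):
    """Return the smallest subset of `universe` closed under `step`.

    `step` maps a set of elements already known to satisfy the predicate
    to the set of elements that the inductive clauses then force to hold.
    It is assumed to be monotone, so iterating from the empty set climbs
    to the least fixed point.
    """
    current = set()
    while True:
        following = step(current) & universe
        if following == current:
            return current
        current = following


def even_clauses(known):
    # Clause 1: even(0).  Clause 2: even(n) implies even(n + 2).
    return {0} | {n + 2 for n in known}


if __name__ == "__main__":
    print(sorted(least_predicate(set(range(10)), even_clauses)))
```

Any larger set that is also closed under the two clauses (for example all of 0..9) satisfies them too; the inductive reading picks the smallest one.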

2.1.3 The Class Language

The class language is used to group a list of definitions into a modular structure which is called class in COLD-S. The signature of a class is the collection of sorts, functions and predicates that are defined in that particular class.

2.1.4 The Scheme Language

All operations that have to do with the modularization of specifications are dealt with in the scheme language.

These operations are:

• import of classes

• export of objects from a class

• introduction of abbreviations

2.1.5 The Design Language

The design language is used to handle specifications at the highest level. At this level the so-called components, which will finally be used to specify the complete system, are specified. A component can be either a specification, in which case it is called a specified component, or a specification together with an implementation written in COLD-S, in which case it is called an implemented component. Specified components are used when the implementation of a component cannot be described in COLD-S, because it is a piece of hardware or an existing program in some kind of programming language.

2.2 The Grammar

The definition of the context-free grammar of COLD-S is given using a certain BNF grammar augmented with the following extra rules:

{X} denotes zero or more occurrences of X (a list of X's)
[X] denotes zero or one occurrences of X (an optional X)
{X '@'} denotes zero or more occurrences of X. The symbol @ acts as delimiter.

Then, the grammar of COLD-S is defined as follows:

<design> ::= DESIGN {<component> ';'} SYSTEM {<scheme> ','}

<component> ::= COMP <scheme-var> : <scheme> [:= <scheme>]
              | LET <scheme-var> := <scheme>

<scheme> ::= <class>
           | IMPORT <scheme> INTO <scheme>
           | EXPORT <signature> FROM <scheme>
           | LET <scheme-var> := <scheme> ; <scheme>
           | <scheme-var>

<signature> ::= {<item> ','}
              | <signature> + <signature>
              | <item> ^ <signature>
              | SIG <scheme>

<item> ::= SORT <sort-name>
         | PRED <predicate-name> : <domain>
         | FUNC <function-name> : <domain> -> <sort-name>

<class> ::= CLASS {<definition>} END

<definition> ::= SORT <sort-name>
               | PRED <predicate-name> : <domain> <predicate body>
               | FUNC <function-name> : <domain> -> <sort-name> <function body>
               | AXIOM <assertion>

<predicate body> ::= [IND <assertion>]
                   | [PAR <varsort list>] DEF <assertion>

<function body> ::= [IND <assertion>]
                  | [PAR <varsort list>] DEF <term>

<assertion> ::= TRUE
              | FALSE
              | <term> !
              | <term> = <term>
              | <predicate-name> <term list>
              | NOT <assertion>
              | <assertion> ; <assertion>
              | <assertion> AND <assertion>
              | <assertion> OR <assertion>
              | <assertion> => <assertion>
              | <assertion> <=> <assertion>
              | FORALL <varsort list> <assertion>
              | EXISTS <varsort list> <assertion>
              | LET {<assignment> ','} ; <assertion>
              | ( <assertion> )

<term> ::= <object-var>
         | <function-name> <term list>
         | THAT <varsort> <assertion>
         | LET {<assignment> ','} ; <term>
         | ( <term> )

<term list> ::= {<term> ','}
              | ( <term list> )

<domain> ::= {<sort-name> '#'}

<varsort list> ::= {<varsort> ','}

<varsort> ::= <object-var> : <sort-name>

<assignment> ::= <object-var> := <term>

<scheme-var> ::= <identifier>

<sort-name> ::= <identifier>

<predicate-name> ::= <identifier>

<function-name> ::= <identifier>

<object-var> ::= <identifier>

3 PSF/C

The concrete syntax of PSF/C is almost identical to the concrete syntax of COLD, with the exception of the additional language constructs we need to represent atomic actions, processes etc. To indicate we restrict ourselves to the static part of COLD, COLD-S, we write PSF/C. Similarly, for PSF/CA we use the static algebraic part of COLD, COLD-SA.


3.1 Character Set

A PSF/C specification uses the same ASCII character set as COLD, viz.:

! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 :

; < = > ? @ A B C D E F G H I J K L M N O P Q R S T

U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n

o p q r s t u v w x y z { | } ~

3.2 Tokens

In parsing a PSF/C specification a series of tokens is recognized. Each token is a sequence of ASCII characters and tokens are separated by spaces, tabs and new lines. In cases of ambiguity the longest token that can be recognized is preferred. There are three kinds of tokens, viz. identifiers, keywords and comments. We will discuss these in turn in the following sections.

3.2.1 Identifiers

Identifiers in PSF/C are arbitrary non-empty strings consisting of letters, digits and the following four characters:

excluding those strings which are keywords. Two characters that can be part of a COLD identifier are excluded, namely the dot '.' and the backslash '\'. The dot has become a keyword, representing sequential composition, and the backslash is reserved to be used as a special character that a program translating PSF/C into COLD-K can use to distinguish user defined identifiers from identifiers generated by the translator.

3.2.2 Keywords

The following strings are PSF/C keywords:

!     ^        FORALL   PRETAU
#     ACTION   FROM     PROCESS
&     AND      FUNC     SET
(     AXIOM    GCMD     SIG
)     CLASS    HIDE     SORT
+     COMM     IMPORT   SPEC
,     COMP     IND      SUM
->    DEF      INTO     SYSTEM
.     DELTA    LET      THAT
:     DESIGN   MERGE    TRUE
:=    ENCAPS   NOT      WITH
;     END      OF       |
<=>   EXISTS   OR       ||
=     EXPORT   PAR
=>    FALSE    PRED
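The longest-token rule of section 3.2 can be made concrete with a small scanner. The following Python sketch is an illustration only, not the scanner generated from the SDF definition: the function name `tokenize` is hypothetical, identifiers are simplified to letters and digits (the four additional identifier characters are left out), and comments are not handled.

```python
import re

# Symbolic and alphabetic keywords of section 3.2.2.
SYMBOLS = ["!", "#", "&", "(", ")", "+", ",", "->", ".", ":", ":=", ";",
           "<=>", "=", "=>", "^", "|", "||"]
WORDS = {"ACTION", "AND", "AXIOM", "CLASS", "COMM", "COMP", "DEF", "DELTA",
         "DESIGN", "ENCAPS", "END", "EXISTS", "EXPORT", "FALSE", "FORALL",
         "FROM", "FUNC", "GCMD", "HIDE", "IMPORT", "IND", "INTO", "LET",
         "MERGE", "NOT", "OF", "OR", "PAR", "PRED", "PRETAU", "PROCESS",
         "SET", "SIG", "SORT", "SPEC", "SUM", "SYSTEM", "THAT", "TRUE",
         "WITH"}

# Trying longer symbols first makes ':=' win over ':' and '<=>' over '='.
_SYMBOL_PATTERN = "|".join(
    re.escape(s) for s in sorted(SYMBOLS, key=len, reverse=True))
_TOKEN = re.compile(
    r"(?P<word>[A-Za-z0-9]+)|(?P<symbol>" + _SYMBOL_PATTERN + ")")


def tokenize(text):
    """Split text into (kind, lexeme) pairs, preferring the longest token."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():          # spaces, tabs, new lines separate tokens
            pos += 1
            continue
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        lexeme = match.group()
        if match.lastgroup == "symbol" or lexeme in WORDS:
            tokens.append(("keyword", lexeme))
        else:
            tokens.append(("identifier", lexeme))
        pos = match.end()
    return tokens
```

For instance, `tokenize("x:=y")` yields the single keyword `:=` between two identifiers rather than `:` followed by `=`.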

3.2.3 Comments

There are two possible ways to create a comment. The first is to use the comment brackets: '{' and '}', which turn the enclosed text into a comment. Comment brackets cannot be nested and the enclosed text may not contain a '}'. Example:

{ This is a comment }


The second way to create a comment is by using the sign '%', which turns the rest of the line into a comment. Example:

% This is comment

Comments may be inserted between any two tokens and have no meaning in terms of the abstract syntax.
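The two comment forms can be removed in a single left-to-right pass before tokens are recognized. The Python sketch below is a hypothetical helper (`strip_comments` is not part of any PSF/C tool); it assumes, as stated above, that bracket comments do not nest and contain no '}', and it replaces each comment by a space so that neighbouring tokens stay separated.

```python
import re

# Either a bracket comment without a '}' inside, or '%' up to the end of the line.
_COMMENT = re.compile(r"\{[^}]*\}|%[^\n]*")


def strip_comments(text):
    """Replace every PSF/C comment in text by a single space."""
    return _COMMENT.sub(" ", text)


if __name__ == "__main__":
    source = "SORT Nat { the naturals } % rest of line\nEND"
    print(strip_comments(source).split())
```

Because the pattern is applied from left to right, a '%' inside a bracket comment does not start a line comment, and a '{' inside a line comment does not open a bracket comment.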

3.3 Grammar

The PSF/C grammar is given in the following section. In fact it is an extension of the COLD-S grammar presented in section 2.

<design> ::= DESIGN {<component> ';'} SYSTEM {<scheme> ','}

<component> ::= COMP <scheme-var> : <scheme> [:= <scheme>]
              | LET <scheme-var> := <scheme>

<scheme> ::= <class>
           | IMPORT <scheme> INTO <scheme>
           | EXPORT <signature> FROM <scheme>
           | LET <scheme-var> := <scheme> ; <scheme>
           | <scheme-var>

<signature> ::= {<item> ','}
              | <signature> + <signature>
              | <item> ^ <signature>
              | SIG <scheme>

<item> ::= SORT <sort-name>
         | PRED <predicate-name> : <domain>
         | FUNC <function-name> : <domain> -> <sort-name>
         | ACTION <action-name> : <domain>
         | PROCESS <process-name> : <domain>
         | SET <set-name>

<class> ::= CLASS {<definition>} END

<definition> ::= SORT <sort-name>
               | PRED <predicate-name> : <domain> <predicate body>
               | FUNC <function-name> : <domain> -> <sort-name> <function body>
               | AXIOM <assertion>
               | ACTION <action-name> : <domain>
               | PROCESS <process-name> : <domain> <process body>
               | SET <set-name> <set body>
               | COMM <comm assertion>
               | SPEC <spec assertion>

<predicate body> ::= [IND <assertion>]
                   | [PAR <varsort list>] DEF <assertion>

<function body> ::= [IND <assertion>]
                  | [PAR <varsort list>] DEF <term>

<process body> ::= [[PAR <varsort list>] DEF <process expr>]

<set body> ::= [IND <assertion>]

<assertion> ::= TRUE
              | FALSE
              | <term> !
              | <term> = <term>
              | <predicate-name> <term list>
              | <set-name> <action term list>
              | NOT <assertion>
              | <assertion> ; <assertion>
              | <assertion> AND <assertion>
              | <assertion> OR <assertion>
              | <assertion> => <assertion>
              | <assertion> <=> <assertion>
              | FORALL <varsort list> <assertion>
              | EXISTS <varsort list> <assertion>
              | LET {<assignment> ','} ; <assertion>
              | ( <assertion> )

<comm assertion> ::= <action term> '|' <action term> = <action term>
                   | <comm assertion> ; <comm assertion>
                   | FORALL <varsort list> <comm assertion>
                   | ( <comm assertion> )

<spec assertion> ::= <process-name> <term list> = <process expr>
                   | <spec assertion> ; <spec assertion>
                   | FORALL <varsort list> <spec assertion>
                   | ( <spec assertion> )

<term> ::= <object-var>
         | <function-name> <term list>
         | THAT <varsort> <assertion>
         | LET {<assignment> ','} ; <term>
         | ( <term> )

<action term list> ::= {<action term> ','}
                     | ( <action term list> )

<action term> ::= <action-name> <term list>
                | ( <action term> )

<term list> ::= {<term> ','}
              | ( <term list> )

<process expr> ::= PRETAU
                 | DELTA
                 | <process-name> <term list>
                 | <process expr> . <process expr>
                 | <process expr> + <process expr>
                 | <process expr> '||' <process expr>
                 | GCMD <ass-process expr>
                 | SUM <varsort list> <process expr>
                 | MERGE <varsort list> <process expr>
                 | ENCAPS <set-process expr>
                 | HIDE <set-process expr>
                 | ( <process expr> )

<set-process expr> ::= <set expr> , <process expr>
                     | ( <set-process expr> )

<ass-process expr> ::= <assertion> , <process expr>
                     | ( <ass-process expr> )

<set expr> ::= <set-name>
             | <set expr> + <set expr>
             | <set expr> & <set expr>
             | <set expr> ^ <set expr>
             | ( <set expr> )

<domain> ::= {<sort-name> '#'}

<varsort list> ::= {<varsort> ','}

<varsort> ::= <object-var> : <sort-name>

<assignment> ::= <object-var> := <term>

<scheme-var> ::= <identifier>

<sort-name> ::= <identifier>

<predicate-name> ::= <identifier>

<function-name> ::= <identifier>

<action-name> ::= <identifier>

<process-name> ::= <identifier>

<set-name> ::= <identifier>

<object-var> ::= <identifier>

3.4 SDF Definition

Next, we give a definition of PSF/C in the Syntax Definition Formalism of HEERING & KLINT [HK89]. SDF stands for "Syntax Definition Formalism". It is a language to specify the lexical syntax, context-free syntax and abstract syntax of programming languages in a formal way and can be seen as an alternative to LEX [Joh79] and YACC [LS79]. It is possible to generate a lexical scanner and some parse tables from such an SDF definition [Rek87]. These parse tables together with a universal parser form a parser for the specified language. It is also possible to generate a so-called syntax directed editor from a description of the layout and the parse tables. This whole system is being implemented in LISP as part of ESPRIT Project 2177: GIPE (Generation of Interactive Programming Environments).

3.4.1 SDF Syntax

An SDF definition consists of two parts: a lexical syntax and a context-free syntax. In both parts we deal with the notions sort and function that correspond, respectively, to non-terminals and to production rules as used in BNF grammars [AU77]. This is an adaptation of an example of an SDF definition taken from [HK86].

module example

begin

lexical syntax

sorts
    digit, letter, int, id, id-tail, comment-char

layout
    white-space, comment

functions
    [a-z]                    -> letter
    [0-9]                    -> digit
    digit+                   -> int
    [a-z0-9]                 -> id-tail
    letter id-tail*          -> id
    [ \n\t\f\r]              -> white-space
    ~[{}]                    -> comment-char
    "{" comment-char* "}"    -> comment

context-free syntax

sorts
    expr

priorities
    -+- < -*-

functions
    expr "+" expr            -> expr {par, left-assoc}
    expr "*" expr            -> expr {par, left-assoc}
    id                       -> expr

end example

We will point out some of the SDF constructions that appear in this example. The sorts and layout declarations, in the lexical syntax section, introduce the lexical sorts while their functions declarations specify what kind of strings can be constructed over these sorts. Elements of the context-free syntax may be interspersed with strings belonging to the layout sorts. The latter will be skipped by the lexical analyzer generated from the SDF definition. The function declaration may be composed of other lexical sorts, (negated) character classes, terminals and list expressions. In the lexical syntax section two kinds of list expressions are allowed:

S* zero or more occurrences of sort S

S+ one or more occurrences of sort S

In the function declaration of the context-free syntax section lexical sorts may be used as terminals of the grammar, though terminals may also be introduced directly, like "+" and "*" in the example. Moreover two more list expressions are allowed:

{S t}* zero or more occurrences of sort S, separated by the terminal t.

{S t}+ one or more occurrences of sort S, separated by the terminal t.

The priorities declaration is used to define the relative priority between functions. When unambiguous, the function may be abbreviated by its keyword skeleton. The associativity of functions may be declared by means of the attributes: assoc, left-assoc and right-assoc while the attribute par can be added to the function declaration to state that the function may be surrounded by parentheses in order to change its priority.
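The effect of the priorities and left-assoc declarations of the example can be mimicked by a small hand-written parser. The Python sketch below uses precedence climbing, which is a technique swapped in for illustration and is not how the SDF system builds its parse tables; the names `PRIORITY` and `parse` are hypothetical, and the priority table assumes that "*" binds more strongly than "+".

```python
import re

# Assumed reading of the priorities declaration: "+" has lower priority than "*".
PRIORITY = {"+": 1, "*": 2}


def parse(text):
    """Parse an expression over ids, '+', '*' and parentheses into nested tuples."""
    tokens = re.findall(r"[a-z][a-z0-9]*|[+*()]", text)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":                     # the 'par' attribute: parentheses
            node = expression(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
            return node
        if token in PRIORITY or token == ")":
            raise ValueError(f"unexpected token {token!r}")
        return token                         # an id

    def expression(min_priority):
        nonlocal pos
        left = primary()
        while (pos < len(tokens) and tokens[pos] in PRIORITY
               and PRIORITY[tokens[pos]] >= min_priority):
            operator = tokens[pos]
            pos += 1
            # Requiring a strictly higher priority on the right gives left-assoc.
            right = expression(PRIORITY[operator] + 1)
            left = (operator, left, right)
        return left

    tree = expression(1)
    if pos != len(tokens):
        raise ValueError("trailing input")
    return tree
```

With this table, `parse("a+b*c")` groups the product first, and `parse("a+b+c")` nests to the left.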

3.4.2 PSF/C in SDF

In this section we give the definition of the syntax of PSF/C using SDF.

module PSF/C

begin

lexical syntax

sorts
    id-char, identifier, comment-1-char, comment-2-char

layout
    white-space, comment

functions
    [0-9a-zA-Z"'/]                                  -> id-char
    id-char+                                        -> identifier
    [ \n\t\r]                                       -> white-space
    ~[\n]                                           -> comment-1-char
    ~[}]                                            -> comment-2-char
    "%" comment-1-char* "\n"                        -> comment
    "{" comment-2-char* "}"                         -> comment

context-free syntax

sorts
    design, component, scheme, signature, item, class, definition,
    predicate-body, function-body, process-body, set-body, assertion,
    comm-assertion, spec-assertion, term, action-term, term-list,
    process-expr, set-process-expr, ass-process-expr, set-expr, domain,
    varsort-list, varsort, assignment, scheme-var, sort-name,
    predicate-name, function-name, action-name, process-name, set-name,
    object-var

functions
    "DESIGN" {component ";"}* "SYSTEM" {scheme ","}*    -> design

    "COMP" scheme-var ":" scheme ":=" scheme        -> component
    "COMP" scheme-var ":" scheme                    -> component
    "LET" scheme-var ":=" scheme                    -> component

    class                                           -> scheme
    "IMPORT" scheme "INTO" scheme                   -> scheme
    "EXPORT" signature "FROM" scheme                -> scheme
    "LET" scheme-var ":=" scheme ";" scheme         -> scheme
    scheme-var                                      -> scheme

    {item ","}*                                     -> signature
    signature "+" signature                         -> signature {left-assoc}
    item "^" signature                              -> signature
    "SIG" scheme                                    -> signature

    "SORT" sort-name                                -> item
    "PRED" predicate-name ":" domain                -> item
    "FUNC" function-name ":" domain "->" sort-name  -> item
    "ACTION" action-name ":" domain                 -> item
    "PROCESS" process-name ":" domain               -> item
    "SET" set-name                                  -> item

    "CLASS" definition* "END"                       -> class

    "SORT" sort-name                                -> definition
    "PRED" predicate-name ":" domain predicate-body -> definition
    "FUNC" function-name ":" domain "->" sort-name function-body
                                                    -> definition
    "AXIOM" assertion                               -> definition
    "ACTION" action-name ":" domain                 -> definition
    "PROCESS" process-name ":" domain process-body  -> definition
    "SET" set-name set-body                         -> definition
    "COMM" comm-assertion                           -> definition
    "SPEC" spec-assertion                           -> definition

    "IND" assertion                                 -> predicate-body
    "PAR" varsort-list "DEF" assertion              -> predicate-body
    "DEF" assertion                                 -> predicate-body
                                                    -> predicate-body

    "IND" assertion                                 -> function-body
    "PAR" varsort-list "DEF" term                   -> function-body
    "DEF" term                                      -> function-body
                                                    -> function-body

    "PAR" varsort-list "DEF" process-expr           -> process-body
    "DEF" process-expr                              -> process-body
                                                    -> process-body

    "IND" assertion                                 -> set-body
                                                    -> set-body

    "TRUE"                                          -> assertion
    "FALSE"                                         -> assertion
    term "!"                                        -> assertion
    term "=" term                                   -> assertion
    predicate-name term-list                        -> assertion
    set-name "[" action-term "]"                    -> assertion
    "NOT" assertion                                 -> assertion
    assertion ";" assertion                         -> assertion {left-assoc}
    assertion "AND" assertion                       -> assertion {left-assoc}
    assertion "OR" assertion                        -> assertion {left-assoc}
    assertion "=>" assertion                        -> assertion {left-assoc}
    assertion "<=>" assertion                       -> assertion {left-assoc}
    "FORALL" varsort-list assertion                 -> assertion
    "EXISTS" varsort-list assertion                 -> assertion
    "LET" {assignment ","}* ";" assertion           -> assertion
    "(" assertion ")"                               -> assertion {bracket}

    action-term "|" action-term "=" action-term     -> comm-assertion
    comm-assertion ";" comm-assertion               -> comm-assertion {left-assoc}
    "FORALL" varsort-list comm-assertion            -> comm-assertion
    "(" comm-assertion ")"                          -> comm-assertion {bracket}

    process-name term-list "=" process-expr         -> spec-assertion
    spec-assertion ";" spec-assertion               -> spec-assertion {left-assoc}
    "FORALL" varsort-list spec-assertion            -> spec-assertion
    "(" spec-assertion ")"                          -> spec-assertion {bracket}

    object-var                                      -> term
    function-name term-list                         -> term
    "THAT" varsort assertion                        -> term
    "LET" {assignment ","}* ";" term                -> term

    action-name term-list                           -> action-term

    "(" {term ","}+ ")"                             -> term-list {bracket}
                                                    -> term-list

    action-term                                     -> process-expr
    "PRETAU"                                        -> process-expr
    "DELTA"                                         -> process-expr
    process-name term-list                          -> process-expr
    process-expr "." process-expr                   -> process-expr
    process-expr "+" process-expr                   -> process-expr
    process-expr "||" process-expr                  -> process-expr
    "GCMD" ass-process-expr                         -> process-expr
    "SUM" sum-merge-arg                             -> process-expr
    "MERGE" sum-merge-arg                           -> process-expr
    "ENCAPS" set-process-expr                       -> process-expr
    "HIDE" set-process-expr                         -> process-expr
    "(" process-expr ")"                            -> process-expr

    varsort-list "(" process-expr ")"               -> sum-merge-arg

    "(" assertion "," process-expr ")"              -> ass-process-expr

    "(" set-expr "," process-expr ")"               -> set-process-expr

    set-name                                        -> set-expr
    set-expr "+" set-expr                           -> set-expr
    set-expr "&" set-expr                           -> set-expr
    set-expr "^" set-expr                           -> set-expr
    "(" set-expr ")"                                -> set-expr

    {sort-name "#"}*                                -> domain

    {varsort ","}*                                  -> varsort-list

    object-var ":" sort-name                        -> varsort

    object-var ":=" term                            -> assignment

    identifier                                      -> scheme-var
    identifier                                      -> sort-name
    identifier                                      -> predicate-name
    identifier                                      -> function-name
    identifier                                      -> action-name
    identifier                                      -> process-name
    identifier                                      -> set-name
    identifier                                      -> object-var

end PSF/C

4 SEMANTICS

4.1 Introduction

The semantics of the COLD-K language can be found in [FJKR87]. That semantical model will be used as a base to define the semantics of PSF/C. All constructs in PSF/C that are already part of COLD-K have the same meaning as their counterparts in COLD-K. New constructs, i.e. all constructs dealing with process behaviour, are indirectly defined using the COLD-K semantics. This is done by giving a translation from PSF/C into COLD-K. The intention is to give a semantics to the process definition part that resembles the algebraic semantics normally attached to process algebra (see e.g. BERGSTRA & KLOP [BK84, BK86b]). In order to be able to understand the formal translation, we will give an overview of the usual algebraic semantics for process algebra expressions.

4.2 ACP

We start from a given set A of atomic actions. Atomic actions are the simplest kind of processes, indivisible, and usually considered as having no duration. Complex processes can be constructed from simpler ones by applying several predefined functions and operators. Each atomic action is a constant in the set Action. The set Action is embedded in the set of processes, named Process.

On $A$, we have given a partial binary function $\gamma$, the communication function. $\gamma$ must be commutative and associative, i.e.

\[ \gamma(a,b) = \gamma(b,a) \]

\[ \gamma(a,\gamma(b,c)) = \gamma(\gamma(a,b),c) \]

(when defined) for all $a,b,c \in A$. If $\gamma(a,b) = c$, we say $a$ and $b$ communicate, and the result of their communication is $c$. If $\gamma(a,b)$ is undefined, we say that $a$ and $b$ do not communicate. $A$ and $\gamma$ can be considered as parameters of the theory: in each application we will have to specify what atomic actions we have, and how they communicate. In PSF/C, we write $\gamma(a,b) = c$ as a | b = c.

On the domain of processes we define an equivalence relation by making a number of identifications between processes. These identifications follow from a set of axioms. For all processes $x$ and $y$ e.g. we consider the processes $x+y$ and $y+x$ to be identical. The intuition behind the identifications will be explained next. The first two compositional operators we consider are $\cdot$, denoting sequential composition, and $+$ for alternative composition. If $x$ and $y$ are two processes, then $x \cdot y$ is the process that starts the execution of $y$ after the completion of $x$, and $x+y$ is the process that chooses either $x$ or $y$ and executes the chosen process (not the other one). Each time a choice is made, we choose from a set of alternatives. We do not specify whether a choice is made by the process itself, or by the environment. Axioms A1-5 in table 1 below give the laws that $+$ and $\cdot$ obey. We leave out $\cdot$ and brackets as in regular algebra, so $xy + z$ means $(x \cdot y) + z$. $\cdot$ will always bind stronger than other operators, and $+$ will always bind weaker. On intuitive grounds $x(y + z)$ and $xy + xz$ present different mechanisms (the moment of choice is different), and therefore, an axiom $x(y + z) = xy + xz$ is not included.

We have a special constant $\delta$ denoting deadlock, the acknowledgement of a process that it cannot do anything any more, the absence of any alternative. Axioms A6-7 give the laws for $\delta$. We also have a special constant $t$ that is used for pre-abstraction (see the following section). $t$ and $\delta$ are not in the given set $A$, but are in the set of constants Action. Thus, $\gamma$ is not defined for the constants $t$, $\delta$, which means that $t$ and $\delta$ do not communicate.

Next, we have the parallel composition operator $\parallel$, called merge. The merge of processes $x$ and $y$ will interleave the actions of $x$ and $y$, except for the communication actions. In $x \parallel y$, we can either do a step from $x$, or a step from $y$, or $x$ and $y$ both synchronously perform an action, which together make up a new action, the communication action. This trichotomy is expressed in axiom CM1. Here, we use two auxiliary operators $\mathbin{\lfloor\!\lfloor}$ (left-merge) and $\mid$ (communication merge). Thus, $x \mathbin{\lfloor\!\lfloor} y$ is $x \parallel y$, but with the restriction that the first step comes from $x$, and $x \mid y$ is $x \parallel y$ with a communication step as the first step. Axioms CM2-9 and CF1-2 give the laws for $\mathbin{\lfloor\!\lfloor}$ and $\mid$. The laws CF1-2, that say that on atomic actions $\mid$ coincides with $\gamma$, differ slightly from laws C1-3 in BERGSTRA & KLOP [BK84]. Finally, we have in table 1 the encapsulation operator $\partial_H$. Here $H$ is a set of atomic actions ($H \subseteq A$), and $\partial_H$ blocks those actions, renames them into $\delta$. The operator $\partial_H$ can be used to encapsulate a process, i.e. to block communications with the environment. Since $t \notin A$, always $\partial_H(t) = t$.

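\[
\begin{array}{ll}
x + y = y + x & \text{A1} \\
(x + y) + z = x + (y + z) & \text{A2} \\
x + x = x & \text{A3} \\
(x + y)z = xz + yz & \text{A4} \\
(xy)z = x(yz) & \text{A5} \\
x + \delta = x & \text{A6} \\
\delta x = \delta & \text{A7} \\[1ex]
a \mid b = \gamma(a,b) \quad \text{if } \gamma(a,b) \text{ is defined} & \text{CF1} \\
a \mid b = \delta \quad \text{otherwise} & \text{CF2} \\[1ex]
x \parallel y = x \mathbin{\lfloor\!\lfloor} y + y \mathbin{\lfloor\!\lfloor} x + x \mid y & \text{CM1} \\
a \mathbin{\lfloor\!\lfloor} x = ax & \text{CM2} \\
ax \mathbin{\lfloor\!\lfloor} y = a(x \parallel y) & \text{CM3} \\
(x + y) \mathbin{\lfloor\!\lfloor} z = x \mathbin{\lfloor\!\lfloor} z + y \mathbin{\lfloor\!\lfloor} z & \text{CM4} \\
a \mid bx = (a \mid b)x & \text{CM5} \\
ax \mid b = (a \mid b)x & \text{CM6} \\
ax \mid by = (a \mid b)(x \parallel y) & \text{CM7} \\
(x + y) \mid z = x \mid z + y \mid z & \text{CM8} \\
x \mid (y + z) = x \mid y + x \mid z & \text{CM9} \\[1ex]
\partial_H(a) = a \quad \text{if } a \notin H & \text{D1} \\
\partial_H(a) = \delta \quad \text{if } a \in H & \text{D2} \\
\partial_H(x + y) = \partial_H(x) + \partial_H(y) & \text{D3} \\
\partial_H(xy) = \partial_H(x) \cdot \partial_H(y) & \text{D4}
\end{array}
\]

Table 1. ACP.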

In this table, $a,b \in \mathit{Action}$ ($= A \cup \{t,\delta\}$), $H \subseteq A$, and $x,y,z$ are arbitrary processes. In addition to the axioms of ACP, we often use the following axioms of Standard Concurrency.

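\[
\begin{array}{ll}
x \parallel \delta = x\delta = \delta \parallel x & \text{SC1} \\
(x \parallel y) \parallel z = x \parallel (y \parallel z) & \text{SC2} \\
x \parallel y = y \parallel x & \text{SC3}
\end{array}
\]

Table 2. Standard Concurrency.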
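The behaviour that the merge and encapsulation axioms describe can be explored on small closed terms. The following Python sketch is an illustration under stated assumptions: it uses an operational reading (which first steps a term can perform) instead of rewriting with the axioms, it compares processes by their sets of complete traces, which is coarser than the equality of the axiom system, and all names (`act`, `seq`, `alt`, `par`, `encaps`, `steps`, `traces`) as well as the encoding of $\gamma$ as a dictionary on unordered pairs are hypothetical.

```python
DELTA = ("delta",)


def act(a): return ("act", a)
def seq(p, q): return ("seq", p, q)
def alt(p, q): return ("alt", p, q)
def par(p, q): return ("par", p, q)
def encaps(h, p): return ("encaps", frozenset(h), p)


def steps(p, gamma):
    """List the (action, residual) pairs of p; residual None means termination."""
    kind = p[0]
    if kind == "act":
        return [(p[1], None)]
    if kind == "delta":
        return []
    if kind == "alt":
        return steps(p[1], gamma) + steps(p[2], gamma)
    if kind == "seq":
        return [(a, p[2] if r is None else seq(r, p[2]))
                for a, r in steps(p[1], gamma)]
    if kind == "par":
        x, y = p[1], p[2]
        xs, ys = steps(x, gamma), steps(y, gamma)
        result = [(a, y if r is None else par(r, y)) for a, r in xs]   # first step from x
        result += [(b, x if r is None else par(x, r)) for b, r in ys]  # first step from y
        for a, rx in xs:                                               # communication step
            for b, ry in ys:
                c = gamma.get(frozenset((a, b)))
                if c is not None:
                    if rx is None:
                        rest = ry
                    elif ry is None:
                        rest = rx
                    else:
                        rest = par(rx, ry)
                    result.append((c, rest))
        return result
    if kind == "encaps":
        return [(a, None if r is None else encaps(p[1], r))
                for a, r in steps(p[2], gamma) if a not in p[1]]
    raise ValueError(f"unknown process term {p!r}")


def traces(p, gamma):
    """Complete traces of p; a trace ending in 'delta' ends in deadlock."""
    successors = steps(p, gamma)
    if not successors:
        return {("delta",)}
    result = set()
    for a, rest in successors:
        if rest is None:
            result.add((a,))
        else:
            result |= {(a,) + t for t in traces(rest, gamma)}
    return result


if __name__ == "__main__":
    gamma = {frozenset(("s", "r")): "c"}      # s and r communicate, yielding c
    system = par(seq(act("s"), act("x")), act("r"))
    print(traces(encaps({"s", "r"}, system), gamma))
```

In the usage example, encapsulating `s` and `r` leaves only the communication `c` followed by `x`, which is the intended use of the encapsulation operator: the two components can no longer perform their halves of the communication separately.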

4.3 Pre-abstraction

In system verification, it is essential that we can abstract from the internal actions of a system, in order to prove that the external behaviour is as specified beforehand. Here, we are defining a specification language, and we do not want to deal with silent steps, and a suitable set of axioms for such steps. Thus, we are dealing with concrete process algebra (process algebra without silent steps). A first (important) step in dealing with internal actions can however be made in concrete process algebra, and this is that we can give all internal actions the same name. We use the constant $t$ for this purpose. The unary operator $t_I$ will rename all atomic actions from the set $I$ into $t$. We call the operator $t_I$ pre-abstraction and we sometimes call the constant $t$ pre-tau. These notions were introduced in BAETEN & BERGSTRA [BB88]. The axioms for $t_I$ are presented in table 3.

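\[
\begin{array}{ll}
t_I(a) = a \quad \text{if } a \notin I & \text{PT1} \\
t_I(a) = t \quad \text{if } a \in I & \text{PT2} \\
t_I(x + y) = t_I(x) + t_I(y) & \text{PT3} \\
t_I(xy) = t_I(x) \cdot t_I(y) & \text{PT4}
\end{array}
\]

Table 3. Pre-abstraction.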

4.4 Guarded command

We want to extend the axiom system ACP with generalized sum and generalized merge constructs. In order to do this, it is very useful to introduce the guarded command construct first. If $\phi$ is an assertion in MPL, and $p$ is a process expression, we write

\[ \phi \mathrel{:\rightarrow} p \]

for the process that is $p$ if $\phi$ holds. If $\phi$ does not hold, we get deadlock. Notice that the advantage of the $\phi \mathrel{:\rightarrow} p$ notation is exploited mainly in cases where $\phi$ (and $p$) contains occurrences of free variables for data. It is easy to write down the axioms for the guarded command. See table 4. Here, and in the following sections, we will use the common logical connectives ($\wedge$, $\vee$, $\neg$), which are defined for assertions in MPL.

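$\varphi \Rightarrow (\varphi :\rightarrow p = p)$    GC1

$\neg\varphi \Rightarrow (\varphi :\rightarrow p = \delta)$    GC2

$\varphi :\rightarrow (\psi :\rightarrow p) = (\varphi \wedge \psi) :\rightarrow p$    GC3

$(\varphi \vee \psi) :\rightarrow p = (\varphi :\rightarrow p) + (\psi :\rightarrow p)$    GC4

$(x = t) :\rightarrow p = (x = t) :\rightarrow p[x := t]$    GC5

$\varphi :\rightarrow (x \cdot y) = (\varphi :\rightarrow x) \cdot y$    GC6

$\varphi :\rightarrow (x + y) = (\varphi :\rightarrow x) + (\varphi :\rightarrow y)$    GC7

$(\varphi :\rightarrow x) \mathbin{\lfloor\!\lfloor} y = \varphi :\rightarrow (x \mathbin{\lfloor\!\lfloor} y)$    GC8

$(\varphi :\rightarrow x) \mid y = \varphi :\rightarrow (x \mid y)$    GC9

$x \mid (\varphi :\rightarrow y) = \varphi :\rightarrow (x \mid y)$    GC10

$\partial_H(\varphi :\rightarrow x) = \varphi :\rightarrow \partial_H(x)$    GC11

$t_I(\varphi :\rightarrow x) = \varphi :\rightarrow t_I(x)$    GC12

$\varphi :\rightarrow (x \cdot y) = (\varphi :\rightarrow x) \cdot (\varphi :\rightarrow y)$    GC13

Table 4. Guarded Command.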

Example: we can define the if...then...else construction by:

$\text{if } \varphi \text{ then } p \text{ else } q = \varphi :\rightarrow p + \neg\varphi :\rightarrow q$.
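The effect of GC1, GC2 and the if...then...else definition can be illustrated with a small Python sketch. It rests on a strong simplifying assumption that is not in the text: a process is represented only by the set of its alternatives, so that deadlock is the empty set and $x + \delta = x$ holds by construction. The names `guard`, `alt` and `if_then_else` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet

Env = Dict[str, int]          # a valuation of the free data variables
Process = FrozenSet[str]      # assumption: a process is its set of alternatives
Assertion = Callable[[Env], bool]

DELTA: Process = frozenset()  # deadlock has no alternatives


def alt(p: Process, q: Process) -> Process:
    """Alternative composition; delta is its unit in this model."""
    return p | q


def guard(phi: Assertion, p: Process, env: Env) -> Process:
    """phi :-> p equals p if phi holds (GC1) and delta otherwise (GC2)."""
    return p if phi(env) else DELTA


def if_then_else(phi: Assertion, p: Process, q: Process, env: Env) -> Process:
    """if phi then p else q = phi :-> p + (not phi) :-> q."""
    return alt(guard(phi, p, env), guard(lambda e: not phi(e), q, env))
```

In this model exactly one of the two guarded summands survives for any valuation, which is why the sum behaves as a conditional.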

4.5 Generalized sum and merge

In order to give some motivation for what is to follow, we discuss an example first. Consider a one-place buffer with one input port and two output ports, called O and E. Atomic actions are parameterized by natural numbers, elements of the data sort N. We have the actions in(n), outO(n) and outE(n) for each $n \in N$. The buffer will output all odd numbers received at port O, all even numbers at port E. A recursive equation for this buffer can be given as follows:

$\mathit{Buf} = \sum_{n \in N,\ n\ \mathrm{odd}} in(n) \cdot outO(n) + \sum_{n \in N,\ n\ \mathrm{even}} in(n) \cdot outE(n)$.

Now the advantage of the guarded command introduced in 4.4 is that we can rewrite this as follows:

$\mathit{Buf} = \sum_{n \in N} (n\ \mathrm{odd}) :\rightarrow in(n) \cdot outO(n) + \sum_{n \in N} (n\ \mathrm{even}) :\rightarrow in(n) \cdot outE(n)$.

This means that we need to describe the generalized sum and merge constructs with only two arguments: first, a list of variables with sort names, and second, a process expression. If $\underline{x}$ is a list of variables, and $\underline{D}$ a list of sort names of the same length, then we write $\underline{x} \in \underline{D}$ to denote that each variable in list $\underline{x}$ is an element of the corresponding sort in list $\underline{D}$. Then, the form of the sum and merge constructs is as follows:

$\sum_{\underline{x} \in \underline{D}} p$ respectively $\|_{\underline{x} \in \underline{D}}\; p$

where variables from $\underline{x}$ may occur in $p$.

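Moreover we introduce as abbreviations forms in which a condition $\varphi$ follows the sort list, written $\sum_{\underline{x} \in \underline{D}, \varphi} p$ and $\|_{\underline{x} \in \underline{D}, \varphi}\; p$.

The following holds:

$\sum_{\underline{x} \in \underline{D}} p = \sum_{\underline{x} \in \underline{D}, T} p$ and $\|_{\underline{x} \in \underline{D}}\; p = \|_{\underline{x} \in \underline{D}, T}\; p$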

Axioms for these constructs are non-trivial, but giving axioms is facilitated by using the guarded command of the previous section. We give the sum axioms in table 5.

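$\sum_{\underline{x} \in \underline{D}} p = \sum_{\underline{x} \in \underline{D}} \varphi :\rightarrow p + \sum_{\underline{x} \in \underline{D}} \neg\varphi :\rightarrow p$    SUBSUM

$\sum_{\underline{x} \in \underline{D}} (\underline{x} = \underline{t}) :\rightarrow p = p[\underline{x} := \underline{t}]$ if no $\underline{x}$ occurs free in $\underline{t}$    SINGSUM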

Table 5. Generalized sum.

Actually, in the translation to COLD-K, to be presented in section 4.6, we will use a different axiomatization of generalized sum, one that is easier to code in COLD. The axioms in table 5 are sufficient to prove that each finite sum behaves as repeated applications of alternative composition (in fact, only assertions of the form $\underline{x} = \underline{t}$ are needed). We give an example: suppose we have the booleans B with constants TRUE and FALSE. Then:

$\sum_{x \in B} p(x) = \sum_{x \in B} (x = \mathrm{TRUE}) :\rightarrow p(x) + \sum_{x \in B} (x = \mathrm{FALSE}) :\rightarrow p(x)$    (by SUBSUM)

$= p(\mathrm{TRUE}) + p(\mathrm{FALSE})$    (by SINGSUM)
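The derivation above says that a sum over a finite sort unfolds into one summand per element. As a rough sketch of that unfolding, assuming the sort is given as a finite Python iterable and process expressions are rendered as plain strings (both assumptions made for illustration only), one could write the following; `generalized_sum` and `render` are hypothetical helper names.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def generalized_sum(domain: Iterable[T],
                    body: Callable[[T], str],
                    condition: Callable[[T], bool] = lambda _: True) -> List[str]:
    """Unfold a sum over a finite sort into the list of its summands.

    Elements that fail the condition contribute delta and are dropped.
    """
    return [body(x) for x in domain if condition(x)]


def render(summands: List[str]) -> str:
    """Join summands with '+'; an empty sum is rendered as delta."""
    return " + ".join(summands) if summands else "delta"


# The boolean example: one summand for TRUE and one for FALSE.
print(render(generalized_sum([True, False],
                             lambda x: f"p({str(x).upper()})")))

# The odd half of the buffer example, restricted to the numbers 0..3.
print(render(generalized_sum(range(4),
                             lambda n: f"in({n}).outO({n})",
                             lambda n: n % 2 == 1)))
```

The restriction to a finite range in the second call is essential; the infinite sum over N is not covered by this unfolding.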

A useful additional axiom is the following axiom, which we can call FLATSUM:

$\sum_{\underline{x} \in \underline{D}} p = p$ if no $\underline{x}$ occurs free in $p$    FLATSUM

Table 6.

In order to deal with infinite sums, we need two additional axioms: ACTSUM, that says that any action performed by a sum construct must be an action of one of its summands, and the axiom of extensionality EXT, that says that a process is determined by its summands. These axioms are presented in table 7.

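$\sum_{\underline{x} \in \underline{D}} p = \sum_{\underline{x} \in \underline{D}} p + a \;\Rightarrow\; \exists \underline{x} \in \underline{D}\, (p = p + a)$    ACTSUM1

$\sum_{\underline{x} \in \underline{D}} p = \sum_{\underline{x} \in \underline{D}} p + a \cdot r \;\Rightarrow\; \exists \underline{x} \in \underline{D}\, (p = p + a \cdot r)$, no $\underline{x}$ free in $r$    ACTSUM2

$\forall a \in A\, (p = p + a \Leftrightarrow q = q + a) \;\wedge\; \forall a \in A, \forall r\, (p = p + a \cdot r \Leftrightarrow q = q + a \cdot r) \;\Rightarrow\; p = q$    EXT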

Table 7. Infinite sums, extensionality.

The axioms for finite merge are similar to the axioms in table 5. We give them in table 8. Notice that we can derive that each empty sum is equal to $\delta$, and the empty merge is defined to be $\delta$ in table 8. In order to deal with infinite merges, we can have an axiom similar to ACTSUM in table 7. We prefer, however, not to do this, since one may hold the viewpoint that infinite merges do not occur "in reality". In this viewpoint, each infinite merge will equal CHAOS. Our theory here will not make a choice one way or the other.

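$\forall \underline{x} \in \underline{D}\; \neg\varphi \;\Rightarrow\; \|_{\underline{x} \in \underline{D}, \varphi}\; p = \delta$    EMPTYMERGE

$\|_{\underline{x} \in \underline{D}}\; p = \|_{\underline{x} \in \underline{D}, \varphi}\; p \;\|\; \|_{\underline{x} \in \underline{D}, \neg\varphi}\; p$    SUBMERGE

$\|_{\underline{x} \in \underline{D}, \underline{x} = \underline{t}}\; p = p[\underline{x} := \underline{t}]$ if no $\underline{x}$ occurs free in $\underline{t}$    SINGMERGE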

Table 8. Generalized merge.

4.6 Translation to COLD-K

In this section we will only give some ideas behind the translation of PSF/C into COLD-K. The full description can be found in [BBMV90], since it is too technical to be presented in full detail here. The translation is based on the concrete syntax of PSF/C, which is traversed recursively. Each class in PSF/C is translated into a class in COLD-K. Moreover, there is a predefined class called BASIC that includes definitions for items that are used in the translation. This class contains two sorts, Action and Process, and functions that operate on elements of these sorts, like alternative and sequential composition.

CLASS
  ...
  SORT Process
  SORT Action
  FUNC alt : Process # Process -> Process
  FUNC seq : Process # Process -> Process
  ...
END;

All objects from PSF/C are translated into COLD-K objects, e.g. a process is translated into a function of sort Process. In the example below the left shows part of the PSF/C specification and the right the translation into COLD-K.

CLASS                   CLASS
  ...                     ...
  ACTION r : D            FUNC r : D -> Action
  PROCESS read            FUNC read : -> Process
  ...                     ...
END;                    END;


The class BASIC also includes the axioms presented in sections 4.2 through 4.5. The translation of axiom A3 ($x + x = x$) in table 1 of section 4.2 is the following:

AXIOM FORALL x:Process (
  alt(x,x) = x
)

Using these axioms we impose the equality relation on the sort Process and thus the semantics of the processes is defined. In this way we have reused the semantics of COLD-K to define a semantics of PSF/C.
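To make concrete what it means for axioms to impose an equality on the sort Process, the sketch below computes a normal form of terms modulo commutativity, associativity and idempotence of alternative composition (the usual ACP axioms A1 to A3). This is only an illustration under stated assumptions: it ignores all other axioms, so equal normal forms imply equality but not conversely, and the names `Atom`, `Alt`, `Seq` and `normal_form` are hypothetical rather than taken from COLD-K.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Alt:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Seq:
    left: "Term"
    right: "Term"


Term = Union[Atom, Alt, Seq]


def normal_form(term: Term):
    """Normal form modulo commutativity, associativity and idempotence of alt."""
    if isinstance(term, Atom):
        return ("atom", term.name)
    if isinstance(term, Seq):
        return ("seq", normal_form(term.left), normal_form(term.right))
    # Alt: collect the summands of both sides into one set, flattening
    # nested sums; the set forgets order, bracketing and duplicates.
    parts = set()
    for side in (term.left, term.right):
        nf = normal_form(side)
        if nf[0] == "alt":
            parts |= nf[1]
        else:
            parts.add(nf)
    # A sum with a single distinct summand is that summand (x + x = x).
    if len(parts) == 1:
        return next(iter(parts))
    return ("alt", frozenset(parts))
```

With this normal form, alt(x,x) and x are identified for every term x, as axiom A3 requires.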

5 EXAMPLES

In this section we give some examples of a specification in PSF/C, which illustrate the use of simple data types, process definitions and the concept of parameterization. The examples deal with vending machines, a landing control system for an airport and the alternating bit protocol.

5.1 A Vending Machine

5.1.1 The Problem

In this first example, adapted from MAUW & VELTINK [MV89], we want to specify a vending machine that sells tea and coffee. In fact this is a very simple machine, for it only accepts two kinds of coins, 10c coins and 25c coins, it does not give any change and there are no buttons to choose between coffee or tea. The choice is determined by whichever coin is inserted.

5.1.2 The Implementation

In our example we have used just one class, called VENDING_MACHINE_AND_USERS, to specify the vending machine. Firstly, we define all atomic actions that occur in the specification. The atomic actions fall into three categories: the actions of the vending machine, the actions of the customer, and the actions that are the result of a communication between the customer and the vending machine. In the COMM section we define all possible pairs of actions that can communicate with each other and we specify what the resulting action will be. This implies that all communications that are not listed here are prohibited. Next we define a set of atomic actions called H. This set contains all atomic actions that are performed by either the machine or the customer. Its use will show up later on. After having defined the atomic actions and the communication function we are able to specify the processes. The first process is called VMCT and represents the vending machine. Initially it offers the choice of an insert_10c or an insert_25c action, after which it continues to serve tea or coffee. After having served a drink VMCT returns to its initial state. The next two processes define a customer who wants tea and a customer who wants coffee. The last process defines the combination of the three previously defined processes. The vending machine operates in parallel with the customers; in this example it serves a Tea_User followed by a Coffee_User, in that specific order. The ENCAPS operator forbids the atomic actions listed in H to occur on their own and thus forces communication.

5.1.3 The Specification

%
% A very simple vending machine with two users.
%

LET VENDING_MACHINE_AND_USERS :=

CLASS
  ACTION insert_10c
  ACTION accept_10c
  ACTION 10c_paid
  ACTION insert_25c
  ACTION accept_25c
  ACTION 25c_paid
  ACTION serve_tea
  ACTION take_tea
  ACTION tea_delivered
  ACTION serve_coffee
  ACTION take_coffee
  ACTION coffee_delivered

  COMM insert_10c | accept_10c = 10c_paid;
       insert_25c | accept_25c = 25c_paid;
       serve_tea | take_tea = tea_delivered;
       serve_coffee | take_coffee = coffee_delivered

  SET H
  IND H(insert_10c);
      H(accept_10c);
      H(insert_25c);
      H(accept_25c);
      H(serve_coffee);
      H(take_coffee);
      H(serve_tea);
      H(take_tea)

  PROCESS VMCT :
  DEF ((accept_10c . serve_tea) +
       (accept_25c . serve_coffee)) . VMCT;

  PROCESS Tea_User :
  DEF insert_10c . take_tea;

  PROCESS Coffee_User :
  DEF insert_25c . take_coffee;

  PROCESS System :
  DEF ENCAPS(H, VMCT || (Tea_User . Coffee_User))
END;

VENDING_MACHINE_AND_USERS
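The behaviour of the encapsulated System process can be traced by hand, and the Python sketch below mimics that trace. It is an informal simulation, not a PSF/C tool: the machine is encoded as a hypothetical state table, the users as a flat list of actions, and only the communication results are recorded. Since every action is in H, a user action that finds no partner in the machine's current state is reported as a deadlock.

```python
# Communication function of the COMM section, keyed by unordered pairs.
COMM = {
    frozenset({"insert_10c", "accept_10c"}): "10c_paid",
    frozenset({"insert_25c", "accept_25c"}): "25c_paid",
    frozenset({"serve_tea", "take_tea"}): "tea_delivered",
    frozenset({"serve_coffee", "take_coffee"}): "coffee_delivered",
}

# VMCT as a state table (an encoding chosen for this sketch):
# state -> {offered action: next state}.
VMCT = {
    "idle": {"accept_10c": "tea", "accept_25c": "coffee"},
    "tea": {"serve_tea": "idle"},
    "coffee": {"serve_coffee": "idle"},
}

# Tea_User . Coffee_User, flattened into a sequence of actions.
USERS = ["insert_10c", "take_tea", "insert_25c", "take_coffee"]


def run(machine, users, state="idle"):
    """Return the communications performed while the users run to completion."""
    trace = []
    for user_action in users:
        for machine_action, next_state in machine[state].items():
            result = COMM.get(frozenset({machine_action, user_action}))
            if result is not None:
                trace.append(result)
                state = next_state
                break
        else:
            # No partner available: the encapsulated action cannot occur.
            trace.append("deadlock")
            break
    return trace


print(run(VMCT, USERS))
```

The trace stops when the users are finished; what the machine does afterwards is outside the scope of this sketch.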

5.2 A Landing Control System

5.2.1 The Problem

In the next example, adapted from MAUW & VELTINK [MV88], we specify a hypothetical landing control system for an airport. It is designed to handle the landing of a number of airplanes on a number of landing strips. The system consists of a number of parallel operating subsystems, the first of which is the Distribution process. The other processes, the Strip_Controllers, all have the same behaviour. Each of them has control over exactly one landing strip.


figure 1. Timbuktu Airport

5.2.2 The Implementation

The class Landing_Control is parameterized by the class Airport. This class consists of the two sorts Strips, containing the names of the landing strips, and Plane_Ids, containing the ids of all planes potentially willing to land. The class Landing_Control exports the atomic action receive_req_to_land, which enables the system to communicate with arriving airplanes, and the process Control, which is the name of the overall process being specified. Internal to this class are a number of atomic actions. The atoms read, send and communicate are used to model the communication between the process Distribution and each of the Strip_Controllers. The Strips argument determines which Strip_Controller is involved, and the Plane_Ids argument indicates the plane that should be landed. As is indicated in the communications section, placing the atoms send and read in parallel yields the atom communicate. The set H, containing the read and send actions, will be used to encapsulate unsuccessful communication. This happens when the read and send actions do not have a partner to communicate with. The other atomic actions, land and disembark, are not intended to take part in a communication. Apart from the Control process we define three processes. The process Distribution receives a request to land from some plane and sends its id to one of the Strip_Controllers that is willing to communicate with the Distribution. After that, the Distribution process starts all over again. The process Strip_Control is indexed with the name of some Strip. In fact it defines a new process for each Strip. It starts by receiving a message from the Distribution to handle a plane with a given id. After handling this plane, as defined by the process Handle, the Strip_Controller starts all over and is again able to receive a plane id. The process Handle serves as a sub-process of the process Strip_Control. The second argument determines the plane and the first one determines the Strip the plane must land on. This process stops after landing and disembarking the plane. Finally the overall process Control is defined as the concurrent operation of the Distribution and all Strip_Controllers. The encapsulation operator removes unsuccessful communications.


%
% Airport conditions local to Timbuktu-airport
%

LET TIMBUKTU_AIRPORT :=

CLASS
  SORT Strips
  SORT Plane_Ids
  FUNC North : -> Strips
  FUNC East  : -> Strips
  FUNC South : -> Strips
  FUNC West  : -> Strips
  FUNC KL204 : -> Plane_Ids
  FUNC SQ001 : -> Plane_Ids
  FUNC JL403 : -> Plane_Ids
  FUNC PA666 : -> Plane_Ids
  FUNC HA345 : -> Plane_Ids
END;

%
% The landing control system for Timbuktu airport.
%

LET TIMBUKTU_LANDING_CONTROL :=

EXPORT
  SORT Plane_Ids,
  ACTION receive_req_to_land : Plane_Ids,
  PROCESS Control :
FROM
IMPORT TIMBUKTU_AIRPORT INTO
CLASS
  ACTION receive_req_to_land : Plane_Ids
  ACTION read : Strips # Plane_Ids
  ACTION send : Strips # Plane_Ids
  ACTION communicate : Strips # Plane_Ids
  ACTION land : Strips # Plane_Ids
  ACTION disembark : Plane_Ids

  COMM FORALL s:Strips, id:Plane_Ids
    (send(s,id) | read(s,id) = communicate(s,id))

  SET H
  IND FORALL s:Strips, id:Plane_Ids (
    H(read(s,id));
    H(send(s,id)) )

  PROCESS Distribution :
  DEF SUM id:Plane_Ids (receive_req_to_land(id) .
        SUM s:Strips (send(s,id))
      ) . Distribution

  PROCESS Strip_Control : Strips
  PAR s:Strips
  DEF SUM id:Plane_Ids (read(s,id) . Handle(s,id)
      ) . Strip_Control(s)

  PROCESS Handle : Strips # Plane_Ids
  PAR s:Strips, id:Plane_Ids
  DEF land(s,id) . disembark(id)

  PROCESS Control :
  DEF ENCAPS(H, Distribution ||
        MERGE s:Strips (Strip_Control(s)) )
END;

TIMBUKTU_LANDING_CONTROL
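The COMM clause of this class is quantified over both parameters, so send and read communicate only when strip and plane id agree. The Python sketch below spells out that reading; the `Action` tuple and the function `communicate` are hypothetical names introduced for illustration and do not belong to PSF/C.

```python
from typing import NamedTuple, Optional, Tuple


class Action(NamedTuple):
    name: str
    args: Tuple[str, ...]


def communicate(a: Action, b: Action) -> Optional[Action]:
    """send(s,id) | read(s,id) = communicate(s,id); undefined otherwise."""
    if {a.name, b.name} == {"send", "read"} and a.args == b.args:
        return Action("communicate", a.args)
    return None


# Distribution offers KL204 to the North strip, and North is listening.
print(communicate(Action("send", ("North", "KL204")),
                  Action("read", ("North", "KL204"))))

# A read by the East controller does not match a send addressed to North.
print(communicate(Action("send", ("North", "KL204")),
                  Action("read", ("East", "KL204"))))
```

Because read and send are both in H, the second pair cannot proceed at all under encapsulation, which is how a plane ends up at exactly one strip.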


5.3 Alternating Bit Protocol

5.3.1 The Problem

One of the most famous communication protocols is the Alternating Bit Protocol (ABP). It has been used many times to serve as a test case for a new specification formalism. Our specification emanates from the ABP specification in ACP as described in BERGSTRA & KLOP [BK86a, BK86b]. We can represent the Alternating Bit Protocol with a picture as follows:

figure 2. Graphical representation of the Alternating Bit Protocol

It consists of four components:

• S : The sender.

• R : The receiver.

• K : A channel connecting the sender and the receiver.

• L : A channel connecting the receiver and the sender.

The goal of the Alternating Bit Protocol is to transport data items from a certain set D from the input port to the output port. In the next paragraphs we will give a description of each component.

5.3.1.1 The Sender

First, component S reads a message at the input port. This message is extended with a control boolean to form a so-called frame, and this frame is sent along channel K (3). The sending of the frame proceeds until component S receives an acknowledgement of a successful transmission at channel L (6). After a successful transmission component S flips the control boolean and starts all over again.

5.3.1.2 Communication Channel K

Component K transmits frames from the sender (3) to the receiver (4). There are two situations that can occur when sending information along channel K.

• The frame is properly transmitted.

• The frame is corrupted during the transmission.

We assume channel K to be fair, i.e. it will not produce an infinite stream of corrupted data.

5.3.1.3 The Receiver

The receiver R reads a frame from channel K (4). We assume that R is able to tell, e.g. by performing a checksum control, whether or not the frame has been corrupted. When the frame is correct R checks the control boolean in the frame. If this control boolean matches the internal control boolean of R, the message in the frame is sent to the output port, R flips its internal boolean and starts waiting for the next frame to arrive. In all other cases R sends the complement of its own control boolean along channel L (5) and waits for the retransmission of the frame.

5.3.1.4 Communication Channel L


Component L is used to transmit receive acknowledgements from the receiver (5) to the sender (6). Like channel K, channel L is able to corrupt data. We will assume that the sender S can tell whether an acknowledgement has been corrupted. We assume that channel L is fair too.
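The informal description of the four components can be checked against a small simulation. The Python sketch below is such a check under several assumptions that are not in the text: data items are booleans, the two channels are driven by explicit finite fault scripts (which also makes them fair), and sender and receiver alternate strictly in rounds. The name `run_abp` and its parameters are hypothetical.

```python
from typing import Iterable, List


def run_abp(items: List[bool],
            k_faults: Iterable[bool],
            l_faults: Iterable[bool]) -> List[bool]:
    """Simulate the protocol and return the items sent to the output port.

    k_faults and l_faults script the channels: each True corrupts the next
    frame (K) or acknowledgement (L); once a script is exhausted the
    channel transmits correctly.
    """
    k_script, l_script = iter(k_faults), iter(l_faults)
    delivered: List[bool] = []
    sender_bit = False    # control boolean of S
    receiver_bit = False  # control boolean R expects next
    for item in items:
        while True:
            # S sends the frame (item, sender_bit) along K.
            frame_ok = not next(k_script, False)
            if frame_ok and sender_bit == receiver_bit:
                # Correct frame with the expected bit: output and acknowledge.
                delivered.append(item)
                ack = receiver_bit
                receiver_bit = not receiver_bit
            else:
                # Corrupted or repeated frame: send the complement of R's bit.
                ack = not receiver_bit
            # R sends the acknowledgement along L.
            ack_ok = not next(l_script, False)
            if ack_ok and ack == sender_bit:
                sender_bit = not sender_bit
                break  # S reads the next item
            # Otherwise S retransmits the same frame.
    return delivered


print(run_abp([True, False, True],
              k_faults=[True, False, False, True],
              l_faults=[False, True]))
```

In this run the first frame is corrupted once and its acknowledgement is lost once, yet each item reaches the output port exactly once and in order.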

5.3.2 The Implementation

The specification of the Alternating Bit Protocol starts off with some classes from the COLD IGLOO (Incremental Generic Library Of Objects), which has been collected at Philips Research, Eindhoven. These classes are BOOL_SPEC and FRAME_SPEC. The first one defines the booleans and the second one is a modification of TUP2_SPEC, which defines tuples of data types. In this case FRAME_SPEC defines a tuple of two booleans. Next come the classes that are specific to this application. First we have to model the unreliable channels of the protocol. Channels K and L are fairly similar; the only difference is that channel K transports frames while L transports booleans. There are three atomic actions involved in the definition of the unreliable channels: a read and a send action, and an error action indicating malfunctioning of the channel. The sender S and the receiver R are specified in SENDER_SPEC and RECEIVER_SPEC respectively. Now that we have defined the separate objects of the system, we have to glue them together. This is done in the class ABP_SPEC. The specifications of the sender, the receiver and the unreliable channels are imported, and at the same time the resulting atoms of the communication between the several objects of the system are defined. The last thing we have to do is to supply two objects, one at either side of the ABP process: one that supplies an infinite stream of random booleans, RANDOM_SPEC, and one that is able to read an infinite stream of random booleans, DRAIN_SPEC. In the final class ABP_SYSTEM_SPEC we tie together RANDOM_SPEC, ABP_SPEC and DRAIN_SPEC.


%
% Name        : BOOL_SPEC
% Date        : 09/03/88
%
% Description :
%
% This is a specification of the data type of booleans with
% inductive definitions for the non-constructor operations.

LET BOOL_SPEC :=

EXPORT
  SORT Bool,
  FUNC true  : -> Bool,
  FUNC false : -> Bool,
  FUNC not   : Bool -> Bool,
  FUNC and   : Bool # Bool -> Bool,
  FUNC or    : Bool # Bool -> Bool,
  FUNC imp   : Bool # Bool -> Bool,
  FUNC eqv   : Bool # Bool -> Bool,
  FUNC xor   : Bool # Bool -> Bool
FROM
CLASS
  SORT Bool
  FUNC true  : -> Bool
  FUNC false : -> Bool

  AXIOM
  {BOOL1} true!;
  {BOOL2} false!;
  {BOOL3} NOT true = false

  PRED is_gen : Bool
  IND is_gen(true);
      is_gen(false)

  AXIOM FORALL b:Bool
  {BOOL4} is_gen(b)

  FUNC not : Bool -> Bool
  IND not(true) = false;
      not(false) = true

  FUNC and : Bool # Bool -> Bool
  IND FORALL b:Bool
    ( and(false,b) = false;
      and(true,b) = b )

  FUNC or : Bool # Bool -> Bool
  IND FORALL b:Bool
    ( or(false,b) = b;
      or(true,b) = true )

  FUNC imp : Bool # Bool -> Bool
  IND FORALL b:Bool
    ( imp(false,b) = true;
      imp(true,b) = b )

  FUNC eqv : Bool # Bool -> Bool
  IND FORALL b:Bool, c:Bool
    ( b = c => eqv(b,c) = true;
      NOT b = c => eqv(b,c) = false )

  FUNC xor : Bool # Bool -> Bool
  IND FORALL b:Bool, c:Bool
    ( b = c => xor(b,c) = false;
      NOT b = c => xor(b,c) = true )
END;

%
% Name        : FRAME_SPEC
% Date        : 10/03/88
%
% Description :
%
% This is an axiomatic specification of the 2-tuple data type
% with inductive definitions for the non-constructor operations.

LET FRAME_SPEC :=

EXPORT
  SORT Frame,
  SORT Bool,
  FUNC frame : Bool # Bool -> Frame,
  FUNC proj1 : Frame -> Bool,
  FUNC proj2 : Frame -> Bool
FROM
IMPORT BOOL_SPEC INTO
CLASS
  SORT Frame DEP Bool
  FUNC frame : Bool # Bool -> Frame

  AXIOM FORALL i1:Bool, j1:Bool, i2:Bool, j2:Bool (
    {TUP1} frame(i1,i2)!;
    {TUP2} frame(i1,i2) = frame(j1,j2) => i1 = j1 AND i2 = j2 )

  PRED is_gen : Frame
  IND FORALL i1:Bool, i2:Bool (
    is_gen(frame(i1,i2)) )

  AXIOM FORALL t:Frame
  {TUP3} is_gen(t)

  FUNC proj1 : Frame -> Bool
  IND FORALL i1:Bool, i2:Bool (
    proj1(frame(i1,i2)) = i1 )

  FUNC proj2 : Frame -> Bool
  IND FORALL i1:Bool, i2:Bool (
    proj2(frame(i1,i2)) = i2 )
END;

%
% This is a specification of an unreliable channel that
% either transports one item from its input to its output,
% or generates an error stating malfunctioning
%

LET UC_K_SPEC :=

EXPORT
  SORT Frame,
  PROCESS UC_K : ,
  ACTION K_read : Frame,
  ACTION K_send : Frame,
  ACTION K_error :
FROM
IMPORT FRAME_SPEC INTO
CLASS
  ACTION K_read : Frame
  ACTION K_send : Frame
  ACTION K_error :

  PROCESS UC_K :
  DEF SUM d:Frame (K_read(d) . UC_K(d));

  PROCESS UC_K : Frame
  PAR d:Frame
  DEF (skip . K_send(d) + skip . K_error) . UC_K
END;

%
% This is a specification of an unreliable channel that
% either transports one item from its input to its output,
% or generates an error stating malfunctioning
%

LET UC_L_SPEC :=

EXPORT
  SORT Bool,
  PROCESS UC_L : ,
  ACTION L_read : Bool,
  ACTION L_send : Bool,
  ACTION L_error :
FROM
IMPORT BOOL_SPEC INTO
CLASS
  ACTION L_read : Bool
  ACTION L_send : Bool
  ACTION L_error :

  PROCESS UC_L :
  DEF SUM d:Bool (L_read(d) . UC_L(d));

  PROCESS UC_L : Bool
  PAR d:Bool
  DEF (skip . L_send(d) + skip . L_error) . UC_L
END;

%
% This is a specification of the sender of the Alternating Bit Protocol
%

LET SENDER_SPEC :=

EXPORT
  SORT Frame,
  SORT Bool,
  PROCESS S : ,
  ACTION read_item : Bool,
  ACTION send_frame : Frame,
  ACTION read_ack : Bool,
  ACTION read_ack_error :
FROM
CLASS
  ACTION read_item : Bool
  ACTION send_frame : Frame
  ACTION read_ack : Bool
  ACTION read_ack_error :

  PROCESS S :
  DEF RM(false)

  PROCESS RM : Bool
  PAR b:Bool
  DEF SUM d:Bool (read_item(d) . SF(d,b))

  PROCESS SF : Bool # Bool
  PAR d:Bool, b:Bool
  DEF send_frame(frame(d,b)) . RA(d,b)

  PROCESS RA : Bool # Bool
  PAR d:Bool, b:Bool
  DEF (read_ack(not(b)) + read_ack_error) . SF(d,b)
      + read_ack(b) . RM(not(b))
END;

%
% This is a specification of the receiver of the Alternating Bit Protocol
%

LET RECEIVER_SPEC :=

EXPORT
  SORT Frame,
  SORT Bool,
  PROCESS R : ,
  ACTION send_item : Bool,
  ACTION read_frame : Frame,
  ACTION send_ack : Bool,
  ACTION read_frame_error :
FROM
CLASS
  ACTION send_item : Bool
  ACTION read_frame : Frame
  ACTION send_ack : Bool
  ACTION read_frame_error :

  PROCESS R :
  DEF RF(false);

  PROCESS RF : Bool
  PAR b:Bool
  DEF (SUM d:Bool (read_frame(d,not(b))) + read_frame_error) . SA(not(b))
      + SUM d:Bool (read_frame(d,b) . SM(d,b))

  PROCESS SA : Bool
  PAR b:Bool
  DEF send_ack(b) . RF(not(b))

  PROCESS SM : Bool # Bool
  PAR d:Bool, b:Bool
  DEF send_item(d) . SA(b)
END;

%
% This is a specification of the Alternating Bit Protocol, which
% combines all previously defined classes into one system
%

LET ABP_SPEC :=

EXPORT
  SORT Bool,
  PROCESS ABP : ,
  ACTION read_item : Bool,
  ACTION send_item : Bool
FROM
IMPORT BOOL_SPEC INTO
IMPORT FRAME_SPEC INTO
IMPORT UC_K_SPEC INTO
IMPORT UC_L_SPEC INTO
IMPORT SENDER_SPEC INTO
IMPORT RECEIVER_SPEC INTO
CLASS
  ACTION frame_error :
  ACTION ack_error :
  ACTION ack_enters_channel : Bool
  ACTION ack_leaves_channel : Bool
  ACTION frame_enters_channel : Frame
  ACTION frame_leaves_channel : Frame

  COMM
    K_error | read_frame_error = frame_error;
    L_error | read_ack_error = ack_error

  COMM FORALL b:Bool (
    send_ack(b) | L_read(b) = ack_enters_channel(b);
    L_send(b) | read_ack(b) = ack_leaves_channel(b) )

  COMM FORALL f:Frame (
    send_frame(f) | K_read(f) = frame_enters_channel(f);
    K_send(f) | read_frame(f) = frame_leaves_channel(f) )

  SET H
  IND FORALL d:Bool, b:Bool, f:Frame (
    H(K_error);
    H(read_frame_error);
    H(L_error);
    H(read_ack_error);
    H(read_item(d));
    H(send_item(d));
    H(send_ack(b));
    H(read_ack(b));
    H(L_read(b));
    H(L_send(b));
    H(send_frame(f));
    H(K_read(f));
    H(read_frame(f));
    H(K_send(f)) )

  PROCESS ABP :
  DEF ENCAPS(H, S || R || UC_K || UC_L)
END;

%
% This is a specification of a process that produces a random stream
% of booleans
%

LET RANDOM_SPEC :=

EXPORT
  SORT Bool,
  PROCESS RANDOM : ,
  ACTION output : Bool
FROM
CLASS
  ACTION output : Bool

  PROCESS RANDOM :
  PAR d:Bool
  DEF SUM d:Bool (SKIP . output(d)) . RANDOM
END;

%
% This is a specification of a process consuming booleans
%

LET DRAIN_SPEC :=

EXPORT
  SORT Bool,
  PROCESS DRAIN : ,
  ACTION input : Bool
FROM
CLASS
  ACTION input : Bool

  PROCESS DRAIN :
  PAR d:Bool
  DEF SUM d:Bool (input(d)) . DRAIN
END;

%
% Here the total system is created by linking together the subsystems.
%

LET ABP_SYSTEM_SPEC :=

EXPORT
  PROCESS ABP_SYSTEM :
FROM
IMPORT ABP_SPEC INTO
IMPORT DRAIN_SPEC INTO
IMPORT RANDOM_SPEC INTO
CLASS
  ACTION item_read : Bool
  ACTION item_sent : Bool

  COMM FORALL d:Bool (
    output(d) | read_item(d) = item_read(d);
    send_item(d) | input(d) = item_sent(d) )

  SET H
  IND FORALL d:Item (
    H(output(d));
    H(input(d));
    H(read_item(d));
    H(send_item(d)) )

  PROCESS ABP_SYSTEM :
  DEF ENCAPS(H, RANDOM || ABP || DRAIN)
END;

ABP_SYSTEM_SPEC

6 EXTENSIONS

A number of possible extensions of PSF/C come to mind, most of them concerning the modularization of classes and the addition of extra process composition operators. We mention a few of them. Firstly, as in full COLD, we could add parameterization concepts and renamings on the class level. Secondly, instead of having only two simple renaming operators on the process specification level, viz. encapsulation (that renames a set of atomic actions into $\delta$, leaving other actions fixed) and pre-abstraction (renaming into $t$), we can allow general renaming operators, having an operator $\rho_f$ for each function $f$ from $A$ into the set Action. For more details, see BAETEN & BERGSTRA [BB88]. In that paper, generalized renaming operators can also be found, most notably the state operator, with which we can keep track of the state of a process during execution. This operator finds applications in the translation of programming languages or specification languages into process algebra.

Another issue is the addition of the silent step $\tau$. This process is necessary for system verification. On the other hand, addition of a silent step leads to complicated issues, one of which is the exact formulation of axioms. The concrete language ACP has remained fixed over a number of years, so it is fairly well-established, and moreover it is amenable to term rewriting analysis. There are several other operators that can be added to PSF/C and will ease specifications. We can think of the mode transfer operator, the priority operator, determination of alphabets, the process creation operator, etc.


The semantics of PSF/C can also be given in a different way than was presented here. Notably, it is possible to give an operational semantics with Plotkin-style rules, by defining a COLD predicate arr0m on \Process # \Action # \Process, with all rule definitions translated into COLD axioms.
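To indicate what Plotkin-style rules look like for the basic operators, the Python sketch below computes the one-step transitions of a term built from atoms, alternative composition and sequential composition. It is an illustration only and makes no claim about the COLD formulation: deadlock, merge and encapsulation are left out, successful termination is represented by `None`, and the names `Atom`, `Alt`, `Seq` and `steps` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Alt:
    left: "Term"
    right: "Term"


@dataclass(frozen=True)
class Seq:
    left: "Term"
    right: "Term"


Term = Union[Atom, Alt, Seq]
Step = Tuple[str, Optional[Term]]  # (action, residual); None means terminated


def steps(term: Term) -> Set[Step]:
    """All transitions term --a--> residual derivable from the rules below."""
    if isinstance(term, Atom):
        # Rule: a --a--> termination.
        return {(term.name, None)}
    if isinstance(term, Alt):
        # Rule: x + y can do whatever x can do and whatever y can do.
        return steps(term.left) | steps(term.right)
    if isinstance(term, Seq):
        # Rule: if x --a--> x' then x.y --a--> x'.y;
        #       if x --a--> termination then x.y --a--> y.
        return {
            (action, term.right if rest is None else Seq(rest, term.right))
            for action, rest in steps(term.left)
        }
    raise TypeError(f"not a process term: {term!r}")
```

Each branch corresponds to one or two inference rules, and a predicate relating a process, an action and a residual process would hold exactly for the triples this function enumerates.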

7 COMPARISON OF PSF/C WITH SIMILAR LANGUAGES

The most obvious candidate for comparison is PSF/ASF as it was described in [MV88]. The difference is that the data type specifications are now given in the way of COLD. Moreover, the concrete syntax of the process declarations is formatted in the style of COLD. (In the case of PSF/ASF the process declarations were formatted in the style of ASF.) Because we wanted to use the data type specifications from COLD, only the static fragment of it has been imported into PSF/C. It is an open question for us how the dynamic part of COLD could be combined with ACP. There seems to be an inherent overlap between the procedures in COLD and the processes of ACP. Due to this overlap an orthogonal language design based on a combination of COLD and ACP seems difficult to obtain.

The reason to consider a combination of ACP with COLD rather than with ASF is threefold:

(i) It is easier to base process declarations on data types specified with first order formulae than on types that are algebraically specified using initial algebra semantics. Indeed, for the precise definition of guardedness for systems of recursion equations, negative information (i.e. information about expressions denoting different data) is essential. COLD allows the use of full first order specifications. The induction scheme of COLD also allows the restriction of data algebras to so-called minimal (term generated) algebras. So the expressive power exceeds that of ASF for all practical purposes. Of course there is a price to be paid: automatic specification and implementation of COLD specifications is not an easy matter. It is essentially harder than for the algebraic specifications of ASF.

(ii) The major strong point of COLD is its modularisation mechanism. The power of that mechanism is already fully present in the static part. We observed that by simply adopting COLD for data type declaration, and using the same modularisation mechanisms also in the presence of process declarations, one obtains a language for which a semantics can be defined in just the same way as for COLD. Indeed, the meaning of PSF/C constructs is found by translating these into theories in the infinitary many sorted partial logic (as it was done in [FJKR87]). For notational reasons this translation is found via an intermediate translation of PSF/C into COLD. It should be noted, however, that this mechanism can in principle be used to obtain a semantic description of PSF/ASF as well. That would, however, require a meticulous and unpleasant translation of ASF into COLD.

(iii) We are interested in the relation (and possible combinations) of COLD and ACP. It seems to be the obvious point of departure to begin with a language definition that combines COLD and ACP in the same way as LOTOS combines Act One and CCS.

LOTOS (Language of Temporal Ordering Specification) [ISO87] is one of the two Formal Description Techniques developed within ISO for the formal specification of open distributed systems, in particular for those related to the Open Systems Interconnection (OSI) computer architecture. Differences with PSF/C are: (i) bias towards CCS instead of ACP; (ii) COLD syntax is replaced by Act One; (iii) though modularisation concepts are available for the data type specifications, they are not available for the process part, as opposed to PSF/C, where there is no distinction between data and process parts and import/export constructs are offered for both; (iv) the semantics is given in terms of transition systems.

In MORELL MEERFORDT [Mor88], a syntactic combination of CSP and Meta IV is presented. The specification language is proposed and illustrated by examples. The main point is that processes can be parameterized by data structures. A systematic translation into Ada exists for this formalism.

Differences with PSF/C: (i) bias towards CSP instead of bias towards ACP, (ii) less attention seems to have been paid to modularisation, and of course (iii) COLD syntax is replaced by Meta IV. The difference between these formats is minimal for flat specifications, i.e. specifications without explicit modular structure.

No particular semantical model is selected to describe the semantics of the CSP/Meta IV combination. Probably the author has transition systems in mind.

ASTESIANO, MASCARI, REGGIO & WIRSING [AMRW85] describe the formalism SMOLCS for specifying concurrent systems. Differences with PSF/C are the following: (i) SMOLCS is biased towards CCS rather than towards ACP, and the semantics is presented in terms of transition systems; (ii) although SMOLCS uses an algebraic formalism for data type specification (as does PSF/ASF from [MV88]), the semantic intuition is quite different, because SMOLCS inherits the orientation towards hierarchical specifications that was proposed by the Munich School.

Although not apparent from the syntax, one might say that SMOLCS is closer to LOTOS than to PSF/C.

FOREST is a specification language that has been developed at Imperial College in London by a team around Tom Maibaum, see GOLDSACK [G88]. The language uses deontic logic to express (potential) system behaviour. The behaviour of agents is formalized in terms of modal action logic. The data are described in terms of a first order language based on the declaration of structured signatures. The semantics of the agents is given in the context of trace theory. The formalism FOREST provides a combination of data type specifications and process (agent) specifications just as PSF/C does. The main difference is that FOREST uses a process logic, whereas PSF/C uses a process algebra. The data type specifications of FOREST seem in fact to be comparable with the possibilities of static COLD as it is used in PSF/C.

8 CONCLUSION

In the construction of the language PSF/C, the design objectives stated in the introduction have been met. A few additional remarks:

• we found that the translation of the process constructions to COLD is cumbersome, and it is our preliminary conclusion that the resulting insights do not justify the effort;

• the SDF system suffices to generate simple tools for the language;

• we obtained a COLD-oriented language in which certain comparative advantages of COLD over ASF are preserved. Thus, PSF/C has greater expressive power than PSF/ASF, and a more flexible semantic theory.

REFERENCES

[AMRW85] E. Astesiano, G.F. Mascari, G. Reggio & M. Wirsing, On the parametrised algebraic specification of concurrent systems, Proc. 10th Colloquium on Trees in Algebra and Programming (TAPSOFT), LNCS 185, pp. 342-358, Springer Verlag, 1985.

[AU77] A.V. Aho & J.D. Ullman, Principles of Compiler Design, Addison-Wesley, Reading, Massachusetts, 1977.

[BB88] J.C.M. Baeten & J.A. Bergstra, Global renaming operators in concrete process algebra, Inf. & Comp. 78 (3), 1988, pp. 205-245.

[BBMV90] J.C.M. Baeten, J.A. Bergstra, S. Mauw & G.J. Veltink, A process specification formalism based on static COLD, report P8906b, Programming Research Group, University of Amsterdam, 1990.

[BHK89] J.A. Bergstra, J. Heering & P. Klint (eds.), Algebraic specification, ACM Press Frontier Series, Addison-Wesley, 1989.

[BK84] J.A. Bergstra & J.W. Klop, Process algebra for synchronous communication, Information & Control 60, 1984, pp. 109-137.

[BK86a] J.A. Bergstra & J.W. Klop, Verification of an alternating bit protocol by means of process algebra, in: Math. Methods of Spec. & Synthesis of Software Systems '85, (W. Bibel & K.P. Jantke, eds.), Math. Research 31, Akademie-Verlag Berlin, pp. 9-23, 1986.

[BK86b] J.A. Bergstra & J.W. Klop, Process algebra: specification and verification in bisimulation semantics, in: Math. & Comp. Sci. II, (M. Hazewinkel, J.K. Lenstra & L.G.L.T. Meertens, eds.), CWI Monograph 4, pp. 61-94, North-Holland, Amsterdam, 1986.

[FJKR87] L.M.G. Feijs, H.B.M. Jonkers, C.P.J. Koymans & G.R. Renardel de Lavalette, Formal Definition of the Design Language COLD-K, METEOR/t7/PRLE/7, 1987.

[G88] S.J. Goldsack, Specification of an operating system kernel: FOREST and VDM compared, in: VDM'88, (R. Bloomfield, L. Marshall, R. Jones, eds.), LNCS 328, pp. 88-100, Springer Verlag, 1988.

[HK86] J. Heering & P. Klint, A syntax definition formalism, Report CS-R8633, Centre for Mathematics and Computer Science, Amsterdam, 1986.

[HK89] J. Heering & P. Klint, A syntax definition formalism, in [BHK89], pp. 283-298.

International Organization for Standardization, Information processing systems - Open systems interconnection - LOTOS - A Formal Description Technique Based on the Temporal Ordering of Observational Behaviour, ISO/TC 97/SC 21, (E. Brinksma, ed.), 1987.

[Joh79] S.C. Johnson, YACC: yet another compiler-compiler, in: UNIX Programmer's Manual, Volume 2B, pp. 3-37, Bell Laboratories, 1979.

[LS79] M.E. Lesk & E. Schmidt, LEX - A lexical analyzer generator, in: UNIX Programmer's Manual, Volume 2B, pp. 39-51, Bell Laboratories, 1979.

[Mor88] H. Morell Meerfordt, Combining CSP and Meta IV into an Ada Related PDL for developing Concurrent Programs, in: Ada in Industry, The Ada companion series, (S. Heilbrunner, ed.), Cambridge University Press, pp. 157-171, 1988.

[MV88] S. Mauw & G.J. Veltink, A process specification formalism, report P8814, Programming Research Group, University of Amsterdam, 1988.

[MV89] S. Mauw & G.J. Veltink, An introduction to PSFd, in: Proc. International Joint Conference on Theory and Practice of Software Development, TAPSOFT '89, (J. Díaz, F. Orejas, eds.), LNCS 352, pp. 272-285, Springer Verlag, 1989.

[RdL89] G.R. Renardel de Lavalette, COLD-K², the static kernel of COLD-K, Report RP/mod-89/8, Software Engineering Research Centrum, Utrecht, 1989.

[Rek87] J. Rekers, A Parser Generator for finitely Ambiguous Context-Free Grammars, Report CS-8712, Centre for Mathematics and Computer Science, Amsterdam, 1987.

[WB89] M. Wirsing, J.A. Bergstra, eds., The Design Language COLD, section II in: Algebraic Methods: Theory, Tools and Applications, LNCS 394, pp. 139-328, Springer Verlag, 1989.

PART IV Algebraic Specification

Introduction

Specification of the Transit Node in PSFd
    1 Introduction
    2 PSFd
    3 The transit node
    4 Design of the specification
    5 The specification
    6 Relation to the ERAE specification
    7 Discussion
    8 Acknowledgements
    9 References

Design of a Specification Language by Abstract Syntax Engineering
    1 Introduction
    2 Abstract syntax design
    3 An open problem about the methodology of language design
    4 On the role of software support for algebraic specifications in abstract syntax engineering
    5 Preliminaries for the BMASF specification
    6 Specification of BMASF
    7 Models of BMASF
    Acknowledgement
    References

From an ERAE Requirements Specification to a PLUSS Algebraic Specification: A Case Study
    1 Introduction
    2 The transit node case study
    3 An overview of the ERAE requirements language
    4 The ERAE specification of the transit node
    5 An overview of the PLUSS specification language
    6 From ERAE to PLUSS
    7 The PLUSS specification of the transit node
    8 Conclusion: comparisons between the two specifications
    References
    Appendix

Introduction

This part consists of three papers. The papers by Mauw & Wiedijk and Bidoit et al. discuss the same case study: a so-called transit node. Both papers arrive at an algebraic specification departing from an initial specification by Hagelstein. The paper by Baeten & Bergstra discusses how an algebraic specification might be used in the context of a specification language design.

The papers use different specification formalisms. Baeten & Bergstra use an informal dialect of ASF extended with notations from the so-called Bird-Meertens formalism for functional programming; Mauw & Wiedijk use PSF and Bidoit et al. use PLUSS. This diversity illustrates a point that became increasingly clear during the project: it is very difficult to design a single language for equational algebraic specifications that satisfies the needs of users at different sites. In this context it should be noted that METEOR has put substantial effort into two more languages based on equational logic: ALGRES and RAP. Indeed, efforts towards linguistic unification have led to COLD, a language that incorporates much more expressive features than just those inherited from algebraic (and logic) programming in the conventional sense.

Specification of the Transit Node in PSFd

S. Mauw, F. Wiedijk

University of Amsterdam, Programming Research Group

P.O. Box 41882, 1009 DB Amsterdam, The Netherlands

Abstract. The specification language PSFd is used to give a formal specification of a transit node, a common case study in ESPRIT project METEOR. The design of the specification, derived from the informal text and the ERAE specification, is included. A short discussion of the relation to the specification in ERAE is provided.

1. INTRODUCTION

This paper contains a case study in the formal description technique PSFd. We specify a transit node, which is the common case study for several formalisms in the ESPRIT project METEOR. In [MHB89] the transit node is specified in the algebraic specification language PLUSS. The PSFd specification is derived partially from an informal text and partially from the ERAE specification in [Hag88]. The design of the specification is included, from which a general method can be derived for specifying similar problems in PSFd.

The PSFd specification can be viewed as a more implementation-directed specification than the one in ERAE. Certain design decisions are made, e.g. in identifying the separate objects that act in parallel. Thus the PSFd specification, viewed as an implementation of the ERAE specification, must be verified or validated. A short discussion is devoted to this topic.

2. PSFd

PSFd (Process Specification Formalism - Draft) is a Formal Description Technique developed for specifying concurrent systems. The formal definition of PSFd can be found in [MV88]. In [MV89] an introduction to the basic features is given. PSFd has been designed as the base for a set of tools to support ACP (Algebra of Communicating Processes) [BK86]. We use bisimulation semantics to attach a meaning to the specification of processes. The part of PSFd that deals with the description of the data is based on ASF (Algebraic Specification Formalism) [BHK89]. Here we use initial algebra semantics. PSFd supports the modular construction of specifications and the parameterization of modules.

3. THE TRANSIT NODE

The Transit Node is a case study, which was defined in the RACE project 1046 (SPECS). An informal description of the Transit Node and the ERAE specification of it can be found in [Hag88]. The informal specification reads as follows:

"The system to be specified consists of a transit node with:

• 1 Control Port-In

• 1 Control Port-Out

• N Data Ports-In

• N Data Ports-Out

• M Routes Through

(The limits of N and M are not specified.)

Each port is serialized. All ports are concurrent to all others. The ports should be specified as separate, concurrent entities. Messages arrive from the environment only when a Port-In is able to treat them. The node is "fair". All messages are equally likely to be treated, when a selection must be made, and all messages will eventually transit the node, or be placed in the collection of faulty messages.

Initial State: 1 Control Port-In, 1 Control Port-Out.

The Control Port-In accepts and treats the following three messages:

• Add-Data-Port-In-&-Out(n) gives the node knowledge of a new port-in(n) and a new port-out(n). The node commences to accept and treat messages sent to the port-in, as indicated below on Data Port-In.

• Add-Route((m),n(i),n(j),...)) gives the node knowledge of a route associating route m with Data Port-Out(n(i),n(j),...).

• Send-Faults routes all saved faulty messages, if any, to Control-Port-Out. The order in which the faulty messages are transmitted is not specified.

A Data Port-In accepts and treats only messages of the type:

• Route(m).Data

The Port-In routes the message, unchanged, to any one (non-determinate) of the Data Ports-Out associated with route m. (Note that a Data Port-Out is serialized - the message has to be buffered until the Data Port-Out can process it). The message becomes a faulty message if its transit time through the node (from initial receipt by a Data Port-In to transmission by a Data Port-Out) is greater than a constant time T.

Data Ports-Out and Control Port-Out accept messages of any type and will transmit the message out of the node. Messages may leave the node in any order. All faulty messages are saved until a Send-Faults command message causes them to be routed to Control Port-Out. Faulty messages are messages on the Control Port-In that are not one of the three commands listed, messages on a Data Port-In that indicate an unknown route, or messages whose transit time through the node is greater than T. Messages that exceed the transit time of T become faulty as soon as the time T is exceeded. It is permissible for a faulty message to not be routed to Control Port-Out (because, for example, it has just become faulty, but has not yet been placed in a faulty message collection), but all faulty messages must eventually be sent to Control Port-Out with a succession of Send-Faults commands. It may be assumed that a source of time (time-of-day or a signal each time interval) is available in the environment and need not be modeled with the specification."

4. DESIGN OF THE SPECIFICATION

4.1. General

The specification was designed using a mixed top-down and bottom-up approach. It was based on the informal text, while using the interpretation of the text in the ERAE specification when needed to fill in omissions or resolve ambiguities.

Several design decisions were made which did not follow directly from the informal description of the case study (e.g. the decision to let the Control-Port-In keep control of the table containing all routes through the node).

4.2. Design

We first identify all parameters of the system, i.e. objects which are, and should be, unspecified. Since "it may be assumed that a source of time is available in the environment", we postulate the existence of a process that behaves like a clock. This can be done by introducing a parameter containing this clock process. The second parameter is the time that a message may be inside the node without becoming faulty, the maximal transit time. The exact length of this duration should be decided upon in the implementation phase.

Then we identify all (concurrent) components in the system. We have a Control-Port-In, a Control-Port-Out, a number of Data-Ports-In and a number of Data-Ports-Out. Note that we do not consider the Routes as components, since these are static objects without temporal behaviour. Because all Data-Ports-In have the same behaviour, we can specify just one process, indexed with the actual name of the port. The same holds for the Data-Ports-Out.

Now we make the decision that the routes and the information about the ports that exist are handled by the Control-Port-In, so this process is indexed with a route-table and with a port-set. Furthermore we see that the Control-Port-Out must contain a number of faulty messages that should be flushed and that every Data-Port-Out must contain a number of messages that should be sent to the environment. So both processes are indexed with a message-bag. The signature of the top-level objects now looks like:

    processes
        control-port-in  : route-table # port-set
        control-port-out : message-bag
        data-port-in     : port-name
        data-port-out    : port-name # message-bag

From the informal text and the ERAE specification we can now define the initial state of the node. It consists of the concurrent operation of the control-port-in and the control-port-out, indexed with the empty-route-table, the empty-port-set and the empty-message-bag. Of course we must add the parameter process clock in parallel, and we must abstract from the internal actions and encapsulate unsuccessful communications.

    transit-node = hide(I, encaps(H,
        clock || control-port-in(empty-route-table, empty-port-set)
              || control-port-out(empty-message-bag)))

Now we can proceed in a bottom-up way by defining the data types route-table (an instance of the parameterized module table with the data type routes), port-set (sets instantiated with ports), message-bag (bags instantiated with messages) and port-name.

The top-down approach is continued by defining the behaviour of the four processes, each in a separate module. This raises the question of which objects are connected in order to communicate with each other. We see that there is a link between the control-port-in and the control-port-out. Every data-port-in is linked to the control-port-in for route information and to the control-port-out for sending faulty messages. All data-ports-in are connected to all data-ports-out to transmit messages. And finally all ports have a connection to the environment for either accepting or transmitting messages.

As can be seen in the specification, the behaviour of the objects is specified by determining all initial communication actions. Every action is then followed by the corresponding behaviour, e.g. a transmission or a state change. This can possibly be specified by using subprocesses. The control-port-in, for example, can accept one of the following messages:

• add-datum-port(p), followed by the subprocess that handles adding a data-port-in and a data-port-out;

• add-route(r), followed by a state change where the route-table is updated;

• send-faults, followed by forwarding this message to control-port-out;

• request-route(rn), followed by sending appropriate information about the route back.

After having identified all atomic actions (i.e. communication attempts) we can define the communication function and the set of atoms that has to be encapsulated and abstracted.

4.3. Topology of the transit node

We can visualize the structure of the transit node with the following picture.

Figure 1. The transit node. (Diagram not reproduced; legible labels: control-input, control-in-to-out, control-port-in, control-port-out, data-port-in(p1), data-port-out(p1), data-in-to-out(p1, p1).)

5. THE SPECIFICATION

The specification that resulted from the design as described in the previous section will now be given. Note that the linear structure of the specification does not comply with the way the specification was designed. This is because the formalism forces us to write down the specification in a bottom-up way.

We first give all basic data types needed in the specification, then we define the data types specific to the transit node, then we define all processes involved and finally we give an example of an instantiation of the clock parameter.

5.1. Basic data types

The basic data types consist of the simple types booleans and natural numbers, and the parameterized types bags, sets and tables. The difference between bags and sets is that in a set duplicates are removed. A table can be used to look up an item corresponding to the value of a certain key.

data module booleans
begin
    exports
    begin
        sorts
            BOOL
        functions
            true  : -> BOOL
            false : -> BOOL
            or    : BOOL # BOOL -> BOOL
            and   : BOOL # BOOL -> BOOL
    end
    variables
        b : -> BOOL
    equations
        [1] or(true, b)   = true
        [2] or(false, b)  = b
        [3] and(true, b)  = b
        [4] and(false, b) = false
end booleans

data module natural-numbers
begin
    exports
    begin
        sorts
            nat
        functions
            0  : -> nat
            s  : nat -> nat
            eq : nat # nat -> BOOL
            lt : nat # nat -> BOOL
            +  : nat # nat -> nat
            -  : nat # nat -> nat
    end
    imports
        booleans
    variables
        n, n1, n2 : -> nat
    equations
        [1]  eq(0, 0)         = true
        [2]  eq(0, s(n))      = false
        [3]  eq(s(n), 0)      = false
        [4]  eq(s(n1), s(n2)) = eq(n1, n2)
        [5]  lt(0, s(n))      = true
        [6]  lt(n, 0)         = false
        [7]  lt(s(n1), s(n2)) = lt(n1, n2)
        [8]  n + 0            = n
        [9]  n1 + s(n2)       = s(n1 + n2)
        [10] 0 - n            = 0
        [11] n - 0            = n
        [12] s(n1) - s(n2)    = n1 - n2
end natural-numbers
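As an informal cross-check of the equations of natural-numbers, the following Python sketch mirrors them one by one. It is not part of the PSFd specification: the class Nat and the helper from_int are hypothetical names introduced only for this illustration. The point it makes visible is that equations [10]-[12] define truncated subtraction, so a difference never drops below 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A natural number: pred is None for 0, otherwise this value is s(pred)."""
    pred: Optional["Nat"] = None


ZERO = Nat()


def s(n: Nat) -> Nat:
    return Nat(n)


def eq(a: Nat, b: Nat) -> bool:
    if a.pred is None or b.pred is None:
        return a.pred is None and b.pred is None   # equations [1]-[3]
    return eq(a.pred, b.pred)                      # equation [4]


def lt(a: Nat, b: Nat) -> bool:
    if b.pred is None:
        return False                               # equation [6]
    if a.pred is None:
        return True                                # equation [5]
    return lt(a.pred, b.pred)                      # equation [7]


def add(a: Nat, b: Nat) -> Nat:
    # equation [8] when b is 0, equation [9] otherwise
    return a if b.pred is None else s(add(a, b.pred))


def sub(a: Nat, b: Nat) -> Nat:
    if a.pred is None:
        return ZERO                                # equation [10]
    if b.pred is None:
        return a                                   # equation [11]
    return sub(a.pred, b.pred)                     # equation [12]


def from_int(k: int) -> Nat:
    """Hypothetical helper: build s(s(...s(0))) with k applications of s."""
    n = ZERO
    for _ in range(k):
        n = s(n)
    return n
```

Under this reading, sub(from_int(2), from_int(5)) yields ZERO rather than a negative value, which matters later when durations are computed as differences of times.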

data module bags
begin
    parameters
        items
        begin
            sorts
                item
        end items
    exports
    begin
        sorts
            bag
        functions
            empty-bag : -> bag
            add       : item # bag -> bag
    end
    variables
        i1, i2 : -> item
        b : -> bag
    equations
        [1] add(i1, add(i2, b)) = add(i2, add(i1, b))
end bags

data module set
begin
    parameters
        equality
        begin
            functions
                eq : item # item -> BOOL
        end equality
    exports
    begin
        functions
            eq      : set # set -> BOOL
            element : item # set -> BOOL
    end
    imports
        bags
            { renamed by [ bag -> set,
                           empty-bag -> empty-set ] },
        booleans
    variables
        i, i1, i2 : -> item
        s : -> set
    equations
        [1] add(i, add(i, s))        = add(i, s)
        [2] element(i, empty-set)    = false
        [3] element(i1, add(i2, s))  = or(eq(i1, i2), element(i1, s))
end set
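The difference between the two modules can be illustrated with a small Python sketch. It is only an analogy, not the PSFd semantics: a collections.Counter stands in for a bag (order irrelevant, duplicates counted), a frozenset stands in for a set (the extra equation [1] of module set removes duplicates), and the function names add_to_bag, add_to_set and element are chosen here for illustration.

```python
from collections import Counter


def add_to_bag(item, bag):
    """add on bags: the order of insertion is irrelevant, duplicates are kept."""
    result = Counter(bag)
    result[item] += 1
    return result


def add_to_set(item, items):
    """add on sets: as for bags, but adding an item twice has no further effect."""
    return frozenset(items) | {item}


def element(item, items):
    """element: false on the empty set, otherwise compares against each member."""
    return item in items
```

For instance, adding "a" twice to an empty bag leaves a count of 2, while adding it twice to an empty set leaves a single member.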

data module tables
begin
    parameters
        items
        begin
            sorts
                key, value
            functions
                eq            : key # key -> BOOL
                default-value : -> value
        end items
    exports
    begin
        sorts
            table
        functions
            empty-table : -> table
            add         : key # value # table -> table
            look-up     : key # table -> value
    end
    imports
        booleans
    variables
        k, k1, k2 : -> key
        v : -> value
        t : -> table
    equations
        [1] look-up(k, empty-table)     = default-value
        [2] look-up(k1, add(k2, v, t))  = if(eq(k1, k2), v, look-up(k1, t))
end tables
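The two look-up equations can be read as a search through the entries from the most recently added one backwards, falling back to the default value. A minimal Python sketch of that reading follows; the tuple representation and the explicit default_value argument are assumptions of the sketch (in the module the default is a parameter, not an argument).

```python
EMPTY_TABLE = ()


def add(key, value, table):
    """add(k, v, t): put the new entry in front of the existing ones."""
    return ((key, value),) + table


def look_up(key, table, default_value):
    """look-up: equation [2] inspects the newest entry first,
    equation [1] returns the default once the table is exhausted."""
    for k, v in table:
        if k == key:
            return v
    return default_value
```

A consequence of equation [2] that the sketch shares is that a later add for an existing key shadows the earlier entry instead of replacing it.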

5.2. Data types specific to the transit node

The module time supplies functions to deal with timing information. To the outside, the sort time is built up from the constant initial-time, using the +-function to add durations. A duration is either the constant tick-duration, or the difference of two times. Internally we use the naturals and auxiliary functions to define the exposed functions.

data module time
begin
    exports
    begin
        sorts
            time, duration
        functions
            initial-time  : -> time
            tick-duration : -> duration
            lt            : duration # duration -> BOOL
            +             : time # duration -> time
            -             : time # time -> duration
    end
    imports
        natural-numbers
    functions
        time     : nat -> time
        duration : nat -> duration
    variables
        n1, n2 : -> nat
    equations
        [1] initial-time                    = time(0)
        [2] tick-duration                   = duration(s(0))
        [3] lt(duration(n1), duration(n2))  = lt(n1, n2)
        [4] time(n1) + duration(n2)         = time(n1 + n2)
        [5] time(n1) - time(n2)             = duration(n1 - n2)
end time

The type of information that can be transmitted through the transit node is defined in the module datum.

data module datum
begin
    exports
    begin
        sorts
            datum
    end
    imports
        natural-numbers
    functions
        datum : nat -> datum
end datum

The transit node contains a number of ports for input and output. These ports are named with natural numbers. Port names can be collected into sets by binding the parameter of the basic module set to port-name.

data module port-name
begin
    exports
    begin
        sorts
            port-name
        functions
            eq : port-name # port-name -> BOOL
    end
    imports
        natural-numbers
    functions
        port-name : nat -> port-name
    variables
        n1, n2 : -> nat
    equations
        [1] eq(port-name(n1), port-name(n2)) = eq(n1, n2)
end port-name

data module port-sets
begin
    imports
        set
            { renamed by [ set -> port-set,
                           empty-set -> empty-port-set ]
              items bound by [ item -> port-name ] to port-name
              equality bound by [ eq -> eq ] to port-name }
end port-sets

A route consists of a route-name and a set of output ports associated with this route. Routes are collected into tables in order to look up the port-set corresponding to the name of a previously created route.

data module route-names
begin
    exports
    begin
        sorts
            route-name
        functions
            eq : route-name # route-name -> BOOL
    end
    imports
        natural-numbers
    functions
        route-name : nat -> route-name
    variables
        n1, n2 : -> nat
    equations
        [1] eq(route-name(n1), route-name(n2)) = eq(n1, n2)
end route-names

data module routes
begin
    exports
    begin
        sorts
            route
        functions
            route    : route-name # port-set -> route
            name-of  : route -> route-name
            ports-of : route -> port-set
            eq       : route # route -> BOOL
    end
    imports
        booleans, port-sets, route-names
    variables
        n1, n2 : -> route-name
        ps1, ps2 : -> port-set
    equations
        [1] name-of(route(n1, ps1))              = n1
        [2] ports-of(route(n1, ps1))             = ps1
        [3] eq(route(n1, ps1), route(n2, ps2))   = and(eq(n1, n2), eq(ps1, ps2))
end routes

data module route-tables
begin
    imports
        tables
            { renamed by [ table -> route-table,
                           empty-table -> empty-route-table ]
              items bound by [ key -> route-name,
                               value -> port-set,
                               eq -> eq,
                               default-value -> empty-port-set ]
              to routes }
end route-tables

If components communicate to the outside world or to each other, messages are exchanged. Most of the messages are indexed with a value of some data type. Messages can be collected in bags.

data module messages
begin
    exports
    begin
        sorts
            message
        functions
            add-datum-port  : port-name -> message
            add-route       : route -> message
            send-faults     : -> message
            routed-datum    : route-name # datum -> message
            req-route       : route-name -> message
            available-ports : port-set -> message
            timed-message   : time # datum -> message
            datum           : datum -> message
    end
    imports
        datum, time, port-name, routes
end messages

data module message-bags
begin
    imports
        bags
            { renamed by [ bag -> message-bag,
                           empty-bag -> empty-message-bag ]
              items bound by [ item -> message ]
              to messages }
end message-bags

The various components of the transit node are connected to each other with channels. There are also channels to the environment.

data module channels
begin
    exports
    begin
        sorts
            channel
        functions
            control-input     : -> channel
            control-output    : -> channel
            control-in-to-out : -> channel
            control-to-data   : port-name -> channel
            data-to-control   : port-name -> channel
            rejection         : -> channel
            data-in-to-out    : port-name # port-name -> channel
            data-input        : port-name -> channel
            data-output       : port-name -> channel
    end
    imports
        port-name
end channels

5.3. The processes

5.3.1. Communication

The module communication defines the atomic actions that can be executed by the various components when trying to communicate. The communication function is defined such that a read action (r) and a send action (s) can be combined into a communication action (c). These actions are indexed with the channel used to communicate and the message to be transmitted. In the same way timing information can be communicated. The set of internal actions (I) and the set of actions to be encapsulated in order to get only successful communication (H) are also defined.

process module communication
begin
    exports
    begin
        atoms
            r         : channel # message
            s         : channel # message
            c         : channel # message
            read-time : time
            send-time : time
            comm-time : time
        sets
            of atoms
                I = { c(c, m), comm-time(t) |
                      t in time, c in internal-channels, m in message }
                H = { r(c, m), s(c, m), send-time(t), read-time(t) |
                      t in time, c in internal-channels, m in message }
    end
    imports
        channels, messages, time
    sets
        of channel
            internal-channels =
                { control-in-to-out, rejection, data-to-control(pn1), control-to-data(pn1),
                  data-in-to-out(pn1, pn2) | pn1 in port-name, pn2 in port-name }
    communications
        r(c, m) | s(c, m) = c(c, m)
            for c in channel, m in message
        read-time(t) | send-time(t) = comm-time(t)
            for t in time
end communication
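To make the role of the communication function and of the sets I and H concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the PSFd semantics: atoms are modelled as tuples such as ("r", channel, message) or ("send-time", t), channels as tuples whose first component is the channel name, and the function names communicate, encapsulated and hidden are introduced only for this sketch.

```python
# Names of the channels listed in internal-channels.
INTERNAL_CHANNELS = {
    "control-in-to-out", "rejection",
    "data-to-control", "control-to-data", "data-in-to-out",
}


def communicate(a, b):
    """Return the atom a | b, or None when the two atoms do not communicate."""
    names = {a[0], b[0]}
    if names == {"r", "s"} and a[1:] == b[1:]:
        return ("c",) + a[1:]             # same channel and same message
    if names == {"read-time", "send-time"} and a[1:] == b[1:]:
        return ("comm-time",) + a[1:]     # same time value
    return None


def is_internal(channel):
    return channel[0] in INTERNAL_CHANNELS


def encapsulated(atom):
    """Membership in H: halves of a communication that must not occur alone."""
    if atom[0] in ("read-time", "send-time"):
        return True
    return atom[0] in ("r", "s") and is_internal(atom[1])


def hidden(atom):
    """Membership in I: completed communications that are abstracted from."""
    if atom[0] == "comm-time":
        return True
    return atom[0] == "c" and is_internal(atom[1])
```

Note that reads and sends on control-input, control-output, data-input and data-output are in neither set, so they remain visible as the interface of the node to its environment.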

5.3.2. Data-ports-in

For every port-name a process data-port-in is defined. Every data-port-in behaves as follows. First it reads from its input channel the message to send some datum along some route. Then it reads the current time and asks the control-port-in for the port-set attached to the requested route. Then a transit attempt is made. If the route-name was faulty, an empty-port-set is returned and the incoming message is routed to the rejection channel, thus becoming faulty. If the port-set is not empty, one port is selected non-deterministically and, after a time stamp has been added, the incoming message is routed to that port. The process transit-datum is not defined in case the port-set is empty. This means that it equals deadlock.

process module data-ports-in
begin
    exports
    begin
        processes
            data-port-in : port-name
    end
    imports
        port-sets, route-names, time, communication
    processes
        transit-attempt : port-set # port-name # time # route-name # datum
        transit-datum   : port-set # port-name # time # datum
    variables
        t1, t2 : -> time
        p1, p2 : -> port-name
        rn : -> route-name
        ps : -> port-set
        d : -> datum
    definitions
        data-port-in(p1) =
            sum(d in datum, sum(rn in route-name,
                r(data-input(p1), routed-datum(rn, d)) .
                sum(t1 in time, read-time(t1) .
                    s(data-to-control(p1), req-route(rn)) .
                    sum(ps in port-set, r(control-to-data(p1), available-ports(ps)) .
                        transit-attempt(ps, p1, t1, rn, d) .
                        data-port-in(p1)))))

        transit-attempt(empty-port-set, p1, t1, rn, d) =
            s(rejection, routed-datum(rn, d))

        transit-attempt(add(p2, ps), p1, t1, rn, d) =
            transit-datum(add(p2, ps), p1, t1, d)

        transit-datum(add(p2, ps), p1, t1, d) =
            s(data-in-to-out(p1, p2), timed-message(t1, d))
            + transit-datum(ps, p1, t1, d)
end data-ports-in
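The choice made by transit-attempt and transit-datum can be pictured as a list of alternative send actions, one per port in the set. The Python sketch below enumerates those alternatives; the tuple encoding of actions and the function name transit_options are assumptions made for the illustration, and a port-set is modelled as a Python set of port numbers.

```python
def transit_options(ports, p1, t1, rn, d):
    """Return the send actions among which transit-attempt may choose.

    ports: the port-set answered by control-port-in for route rn
    p1:    name of this data-port-in
    t1:    time stamp read on arrival
    """
    if not ports:
        # transit-attempt(empty-port-set, ...): the only option is rejection
        return [("s", ("rejection",), ("routed-datum", rn, d))]
    # transit-datum unfolds into one summand per port; the summand for the
    # empty remainder is deadlock and therefore contributes no option
    return [
        ("s", ("data-in-to-out", p1, p2), ("timed-message", t1, d))
        for p2 in sorted(ports)
    ]
```

In the process algebra the alternative composition + leaves the choice open; an executable simulation would have to pick one of the returned options, for example at random.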

5.3.3. Data-ports-out

The following module is parameterized with a duration, max-transit-time, that determines the maximum time a message may stay within the transit node.

For every port-name a process data-port-out is defined. Every data-port-out is indexed with a bag of messages that must be sent to the environment. Initially this bag is empty. It starts by reading a timed message from one of the data-ports-in. This message is added to the bag and the process starts again. If the bag is not empty, the process also has the possibility to output some message from the bag. If the max-transit-time has expired, then the message becomes faulty and will be sent to the rejection channel. Otherwise, the message is sent to the environment.

process module data-ports-out
begin
    parameters
        max-transit-time
        begin
            functions
                max-transit-time : -> duration
        end max-transit-time
    exports
    begin
        processes
            data-port-out : port-name # message-bag
    end
    imports
        port-name, message-bags, communication
    processes
        handle-message-out : BOOL # datum # port-name
    variables
        t, t1, t2 : -> time
        p1, p2 : -> port-name
        mb : -> message-bag
        d, e : -> datum
    definitions
        data-port-out(p2, empty-message-bag) =
            sum(p1 in port-name, sum(t1 in time, sum(d in datum,
                r(data-in-to-out(p1, p2), timed-message(t1, d)) .
                data-port-out(p2, add(timed-message(t1, d), empty-message-bag)))))

        data-port-out(p2, add(timed-message(t2, e), mb)) =
            sum(p1 in port-name, sum(t1 in time, sum(d in datum,
                r(data-in-to-out(p1, p2), timed-message(t1, d)) .
                data-port-out(p2,
                    add(timed-message(t1, d), add(timed-message(t2, e), mb))))))
            + sum(t in time, read-time(t) .
                handle-message-out(lt(t - t2, max-transit-time), e, p2) .
                data-port-out(p2, mb))

        handle-message-out(false, d, p2) =
            s(rejection, datum(d))

        handle-message-out(true, d, p2) =
            s(data-output(p2), datum(d))
end data-ports-out
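The test lt(t - t2, max-transit-time) that decides between output and rejection can be sketched in Python as follows. Times and durations are modelled as non-negative integers, which matches the internal representation by naturals but is an assumption of this sketch; the function name and the tuple encoding of actions are likewise hypothetical.

```python
def handle_message_out(now, stamp, max_transit_time, datum, port):
    """Decide what data-port-out does with a message stamped at `stamp`.

    The elapsed time uses truncated subtraction, as in the module time,
    and the comparison is strict, as in lt.
    """
    elapsed = max(now - stamp, 0)
    if elapsed < max_transit_time:
        return ("s", ("data-output", port), ("datum", datum))   # still in time
    return ("s", ("rejection",), ("datum", datum))              # became faulty
```

One detail this makes explicit: because the comparison is strict, a message whose elapsed time equals max-transit-time is already rejected under this definition.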

5.3.4. Control-port-in

The process control-port-in keeps track of all defined routes and all existing ports, so it is indexed with a route-table and a port-set. It is connected to the environment with the control-input channel. Via this channel it can receive the message to add a datum-port, to add a route, or to flush all faulty messages. As a last option it can receive a request from some data-port-in to send the routing information belonging to some route-name. All these incoming messages are treated separately. The request to add a datum port is handled using a subprocess. This handler checks whether the data port already exists. Then it either rejects the message or adds the port to the port-set and creates two new parallel processes: a data-port-in and a data-port-out.

If a request is made to add a route, it simply adds the route information to the route-table. A send-faults request is simply passed on to the control-port-out. A request for route information is answered by looking up the requested information and sending it back.

process module control-port-in begin

exports begin

processes control-port-in : route-table # port-set

end

imports route-tables, communication, data-ports-in, data-ports-out

processes handle-add-port : route-table # port-set # port-name # BOOL

variables p : -> port-name
rt : -> route-table
ps : -> port-set

definitions control-port-in(rt, ps) =

sum(p in port-name, r(control-input, add-datum-port(p)) . handle-add-port(rt, ps, p, element(p, ps)))

+ sum(r in route, r(control-input, add-route(r)) . control-port-in(add(name-of(r), ports-of(r), rt), ps))

+ r(control-input, send-faults) . s(control-in-to-out, send-faults) . control-port-in(rt, ps)

+ sum(p in port-name, sum(rn in route-name, r(data-to-control(p), req-route(rn)) . s(control-to-data(p), available-ports(look-up(rn, rt))))) . control-port-in(rt, ps)

handle-add-port(rt, ps, p, true) = s(rejection, add-datum-port(p)) . control-port-in(rt, ps)

handle-add-port(rt, ps, p, false) = control-port-in(rt, add(p, ps)) || data-port-in(p) || data-port-out(p, empty-message-bag)

end control-port-in
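The four alternatives of control-port-in can be read as one step function on the pair (route-table, port-set). The Python sketch below is an illustration only, not a translation of the PSFd semantics: messages are modelled as tuples, the function name control_port_in_step is hypothetical, and looking up an unknown route is assumed to yield the empty set, which the specification above does not state.

```python
def control_port_in_step(route_table, port_set, message):
    """One step of a control-port-in-like handler.

    Returns (new_route_table, new_port_set, outputs, spawned_processes).
    """
    kind = message[0]
    if kind == "add-datum-port":
        port = message[1]
        if port in port_set:
            # The port already exists: reject the request, state unchanged.
            return route_table, port_set, [("rejection", message)], []
        # New port: extend the port set and start the two data-port processes.
        spawned = [("data-port-in", port), ("data-port-out", port, ())]
        return route_table, port_set | {port}, [], spawned
    if kind == "add-route":
        name, ports = message[1], message[2]
        new_table = dict(route_table)
        new_table[name] = frozenset(ports)
        return new_table, port_set, [], []
    if kind == "send-faults":
        # Simply passed on to control-port-out.
        return route_table, port_set, [("control-in-to-out", ("send-faults",))], []
    if kind == "req-route":
        port, name = message[1], message[2]
        # Assumption: an unknown route name yields the empty set of ports.
        ports = route_table.get(name, frozenset())
        return route_table, port_set, [(("control-to-data", port), ports)], []
    raise ValueError(f"unknown message: {message!r}")
```

Keeping the state explicit in the arguments and the result mirrors the way the PSFd process carries its state in its indexes rt and ps.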


5.3.5. Control-port-out The process control-port-out is indexed with the message-bag containing all faulty messages. It has a simple behaviour. It can receive the message to send all faulty messages to the environment, which is handled by the subprocess flush, or it can receive a faulty message via the rejection channel.

process module control-port-out begin

exports begin

processes control-port-out : message-bag

end

imports message-bags, communication

processes flush : message-bag

variables
m : -> message
mb : -> message-bag

definitions control-port-out(mb) =

r(control-in-to-out, send-faults) . flush(mb)

+ sum(m in message, r(rejection, m) . control-port-out(add(m, mb)))

flush(empty-message-bag) = control-port-out(empty-message-bag)

flush(add(m, mb)) = s(control-output, m) . flush(mb)

end control-port-out

5.3.6. Transit-node Finally the transit node is specified by the concurrent operation of the clock process, which is a parameter of the system, the control-port-in and the control-port-out. These ports

are initialized with an empty table, set and bag. In order to hide internal actions and to get only successful communication, we add the hiding operator and the encapsulation operator. Note that apart from the parameter clock, we also inherit the parameter max-transit-time from the

imported module data-ports-out.

process module transit-node

begin

parameters time

begin processes

clock end time

exports begin

processes transit-node

end

imports control-port-in,

control-port-out

definitions transit-node = hide(I, encaps(H,

clock ||

control-port-in(empty-route-table, empty-port-set) ||

control-port-out(empty-message-bag)))

end transit-node

5.4. Example of a clock

In this section we give an example of how the clock parameter of the transit node can be initialized. The process clock starts at the initial-time. Then it can do a tick, followed by an increment of the current time with a tick-duration, or it can send the time to anyone willing to read it. Note that in this version of a clock the action of sending the time will not cost any time.

process module a-clock

begin

exports begin

processes

clock

end

imports time, communication

atoms tick

processes clock : time

variables

t : -> time

definitions clock = clock (initial-time)

clock(t) = tick . clock(t + tick-duration) +

send-time (t) . clock (t)

end a-clock
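The behaviour of this clock can be mimicked by a small immutable object. The following Python sketch is an illustration under assumptions: time is modelled as an integer, and the default values 0 for initial-time and 1 for tick-duration are hypothetical, since the module leaves both to the imported data type time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clock:
    time: int = 0           # assumed initial-time
    tick_duration: int = 1  # assumed tick-duration

    def tick(self):
        """clock(t) = tick . clock(t + tick-duration)"""
        return Clock(self.time + self.tick_duration, self.tick_duration)

    def send_time(self):
        """clock(t) = send-time(t) . clock(t): reading the time costs no time."""
        return self.time, self
```

Because send_time returns the same clock state, any number of readers can ask for the time between two ticks and all receive the same value.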


process module transit-node-with-a-clock begin

imports transit-node

{time bound by [clock -> clock]

to a-clock}

end transit-node-with-a-clock

5.5. Graphical representation of the import relation

Using the IDEAS tool developed within the METEOR project [Ide88] we can give the following picture (see figure 2), representing the import relation between all modules of the specification of the transit node. Rectangular boxes are used for data modules and boxes with rounded corners are used for process modules. An arrow from a module to another module means that the former is imported into the latter. Note that not all textual imports are present in the picture. We used a tool to compute the minimal import relation having the same transitive closure as the textual one.
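Computing a minimal relation with the same transitive closure is known as transitive reduction. The Python sketch below shows one simple way to do it for an acyclic import relation; it is not the tool used for the figure, and the function names are hypothetical. An edge (a, b) means that a is imported into b.

```python
def transitive_closure(edges):
    """Return all pairs (a, b) such that b is reachable from a."""
    closure = set(edges)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not extra:
            return closure
        closure |= extra


def minimal_imports(edges):
    """Transitive reduction of an acyclic import relation.

    An edge is dropped when some intermediate module already connects
    its endpoints. Assumes there are no cyclic imports.
    """
    closure = transitive_closure(edges)
    nodes = {n for edge in edges for n in edge}
    return {
        (a, b)
        for (a, b) in edges
        if not any((a, m) in closure and (m, b) in closure for m in nodes)
    }
```

For instance, control-port-in textually imports both communication and data-ports-out, while data-ports-out itself imports communication, so the direct arrow from communication to control-port-in is redundant and is removed by minimal_imports.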

6. RELATION TO THE ERAE SPECIFICATION

In this section we will give a brief discussion of the relation between the ERAE specification and the PSFd specification of the transit node. It is clear that, since ERAE was designed for requirements specification, the first one is closer to the textual specification, whereas in the second one some design decisions had to be made. As an example look at the routing information that is treated as a separate entity in ERAE, while in PSFd it is part of the state of the control-port-in.

The ERAE language is based on temporal logic. Its formal semantics can be found in [HR89], and [DHR88] contains an introduction to the use of ERAE.

In order to validate that a PSFd specification is correct with respect to an ERAE specification, a formal treatment of this notion of validation would be needed. Since this paper does not focus on this subject, we only give some informal reasoning about the relation between the two specifications. The validation is made up of two parts. First we must give a relation between the entities declared in the ERAE specification and the ones declared in the PSFd specification, and then we must provide an interpretation of the temporal statements in ERAE into PSFd.

6.1. Entities

A quick inspection shows that, apart from some design decisions and implementation details, the entities in ERAE relate to the entities in PSFd having the same name. So where ERAE contains messages like Add-route msgs indexed with a route nr and a series of out port-nr, PSFd has a data type messages, containing a function add-route, indexed with route which is a combination of a route-name and a port-set.

As another example look at the entity Data port-in which is indexed with a nr, and is able to receive Data msgs via a port. In PSFd this translates to a process data-port-in, indexed with a port-name, having a channel to the environment called data-input, via which it can receive a routed-datum.


figure 2 The import relation (modules shown: natural-numbers.asf, port-name.asf, route-names.asf, route-tables.asf, data-ports-in.psf, data-ports-out.psf, control-port-in.psf, control-port-out.psf)


6.2. Temporal statements

Naively speaking the interpretation of a temporal statement in ERAE into PSFd consists of an interpretation of all events involved into atomic actions, followed by a verification that every possible trace of the specification in PSFd satisfies all temporal statements about events given in the ERAE specification. Unfortunately this approach is too simple since not only temporal information is involved but also information about the state space of the system.

As an example of how to informally validate the PSFd specification, we will give some ERAE statements and their informal interpretation in the PSFd specification.

initially ⇒ ¬∃ dpi: is-in(dpi, Data-ports-in)
∧ ¬∃ dpo: is-in(dpo, Data-ports-out)
∧ ¬∃ r: is-in(r, Routes)
∧ ¬∃ wm, dm: faulty(wm) ∨ faulty(dm)

This can be interpreted as the statement that there are no data ports in the definition of the process

transit-node, and that the port-set, route-table and (faulty) message-bag are empty:

transit-node = hide(I, encaps(H, clock || control-port-in(empty-route-table, empty-port-set) || control-port-out(empty-message-bag)))

A number of statements are about the behaviour of the environment of the transit node. These

statements are not explicitly met by the PSFd specification, since it only specifies the behaviour of the

transit node without restricting its environment. As an example look at the statement

occurs(dm) ⇒ • exists(port(dm))

which states that messages only arrive at existing input ports (the symbol • means "true in the previous state"). This assumption about the environment is not stated in the PSFd specification.
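An assumption of this kind can be checked on a single finite trace once the relevant state is tracked alongside the events. The Python sketch below illustrates the idea only; the event encoding (pairs of a kind and a port name) and the function name are hypothetical, and it checks one trace rather than all traces of a specification.

```python
def respects_port_assumption(trace):
    """Check that each data message targets a port that existed in the previous state."""
    existing = set()  # ports created so far, i.e. the state before the current event
    for kind, port in trace:
        if kind == "data" and port not in existing:
            return False
        if kind == "add-datum-port":
            existing.add(port)
    return True
```

The set existing plays the role of the state information that, as noted above, a pure trace of actions does not carry by itself.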

As a last example look at the statement about state changes concerning data-ports-in:

exists(dpi) ∧ • ¬ exists(dpi) ⇒ ∃ apm: occurs(apm) ∧ nr(dpi) = port-nr(apm)

This states that if a data-port-in is created, an add-port-message must have occurred. In the PSFd specification this is verified by looking at all places where a data-port-in is created. This can only happen in the subprocess handle-add-port of the process control-port-in. This subprocess is only invoked after the atomic action c(control-input, add-datum-port(p)) has occurred for some appropriate port-name p. It is clear that this reasoning is very informal. This is because the existence of a data-port-in is easy to check at the textual level of the specification, but not at the level of the semantics of PSFd. The semantics is a labeled transition graph, which in no way contains information about the number of processes that it is constructed from, but only about the actions that can be performed by the system. Also the actual value of the indexes of the processes involved is not part of the semantics.


7. DISCUSSION

Since some design decisions were needed, the specification of the transit node in PSFd is more specific than the specification in ERAE. There is no easy transformation from an ERAE specification to a PSFd specification; however, given an ERAE specification, the informal text can be interpreted more easily.

We can only give an informal validation of the PSFd specification when relating it to the ERAE specification. This is due to the fact that in some cases ERAE statements relate to the state of the system, which is not part of the formal semantics of PSFd. We can however look at the textual level of the specification and give an informal reasoning. Also restrictions on the environment cannot be expressed in PSFd.

The design of the specification can be generalized to the following method:

• Identify the parameters of the system.
• Identify all concurrent components.
• Add indexes to the process names of each component to keep track of state information and to create more instances of the object.
• Define the abstract data types needed for these indexes.
• Specify how the components are connected.
• Define the initial state of the system.
• Define the behaviour of each component.

Of course the last step of this method can be very involved. Each component in turn can then be

divided into subcomponents, in such a way that the method recursively applies to these subcomponents.

As a conclusion we can state that PSFd is well suited for the specification of concurrent systems.

8. ACKNOWLEDGEMENTS

We would like to thank Jos Baeten and Henrik Jacobsson for proofreading this document and the specification, and Hans Mulder for technical advice.

9. REFERENCES

[BHK89] J.A. Bergstra, J. Heering & P. Klint, Algebraic specification, ACM Press Frontier Series, Addison Wesley, 1989.

[BK86] J.A. Bergstra & J.W. Klop, Process algebra: specification and verification in bisimulation semantics, in: Math. & Comp. Sci. II, (M. Hazewinkel, J.K. Lenstra & L.G.L.T. Meertens, eds.), CWI Monograph 4, pp 61-94, North-Holland, Amsterdam, 1986.

[DHR88] E. Dubois, J. Hagelstein & A. Rifaut, Formal requirements engineering with ERAE, Philips Journal of Research 43, nos. 3/4, pp. 393-414, 1988.

[Hag88] J. Hagelstein, The Transit Node - ERAE specification, METEOR PRLB Report, 1988.

[HR89] J. Hagelstein & A. Rifaut, The semantics of ERAE, Philips Research Laboratory Brussels Manuscript, Belgium, 1989.

[Ide88] IDEAS interface user guide, Centre de Recherches de la C.G.E., Marcoussis 1988.

[MHB89] A. Mauboussin, J. Hagelstein, M. Bidoit, M-C. Gaudel & H. Perdrix, From an ERAE requirement specification to a PLUSS algebraic specification: A case study, Report METEOR task 10, 1989.

[MV88] S. Mauw & G.J. Veltink, A Process Specification Formalism, Report P8814, University of Amsterdam, Amsterdam, 1988. To appear in: Fundamenta Informaticae.

[MV89] S. Mauw & G.J. Veltink, An introduction to PSFd, in: Proc. TAPSOFT 89 (J. Diaz & F. Orejas, eds.), LNCS 352, Volume 2, pp 272-285, Springer Verlag, 1989.

Design of a Specification Language by Abstract Syntax Engineering 1

J.C.M. Baeten
Dept. of Software Technology, Centre for Mathematics and Computer Science
P.O.Box 4079, 1009 AB Amsterdam, The Netherlands

J.A. Bergstra
Programming Research Group, University of Amsterdam
P.O.Box 41882, 1009 DB Amsterdam, The Netherlands
Department of Philosophy, State University of Utrecht
Heidelberglaan 2, 3584 CS Utrecht, The Netherlands

In this paper, we design a specification language in an entirely algebraic style. We describe the language in terms of abstract syntax only. We argue that this is the correct approach in language design.

1980 Mathematics Subject Classification (1985 revision): 68Q45, 68Q55, 68Q65, 68Q50.
1987 CR Categories: F.4.3, D.2.10, D.3.1, D.3.3.
Key words & Phrases: abstract syntax, specification languages, module algebra, ASF.
Note: This work is partially sponsored by ESPRIT contract 432, An Integrated Formal Approach to Industrial Software Development (METEOR).

1. INTRODUCTION

This report contains a first step towards a design of a specification language like ASF [BHK 89] in an entirely algebraic style. The language BMASF that is the result of this design effort is simpler than ASF because it has no parametrization mechanism. It is better in the sense that imports and exports have been worked out in a more satisfactory way. We will first try to motivate in detail the reasons for the approach we have taken and then work out the language design. It should be noticed in advance that the language BMASF as such has no pretensions and that it is the method of its design via abstract syntax which is the real objective of the paper.

In order to clarify the motivation for our work we start by listing some points of view that we have developed about the design of specification languages. After observing several attempts to define specification languages it has become clear (to us) that the following phenomena seem to be unavoidable:

1 Note: this paper is a revision of [BB 89]. Parts of it have been used in [SPECS 89].


(i) Above a certain complexity one runs into semantic trouble because of unexpected interactions of features that are combined in the language being designed. These semantic problems are almost never solved by theoretical work because they depend on very special peculiarities of the features as they are embedded in the language.

For instance, parameter passing mechanisms for abstract data types allow many degrees

of freedom that have not been addressed by theoretical work and in which wrong language

designs cannot yet be singled out by pointing at the way in which they depart from known

theory. Also, if information hiding is present, the interaction between parametrization and

hiding becomes complicated. If on top of that, operational aspects enter the scene, one has to

worry about the interaction between information hiding and the abstraction mechanisms of

process theory.

(ii) If the language is designed in a syntax readable for human beings, there is always a next step in which an abstract syntax has to be designed. Typically, in the design of abstract syntax one attempts to define a more clear cut language with fewer semantic problems. Usually, this will not quite succeed because the abstract syntax has to be derived from the given concrete syntax and that will induce the introduction of features that are not really primitive ones.

(iii) Notoriously, one will feel a need to redesign parts of the language within two years after its conception. The difficulty is then to have the tools for the language written in such a way that their code or at least their design is somehow reusable. It is very unpleasant to construct tools for a language that has already been declared outdated by its chief designer. But exactly this phenomenon occurs time and again.

(iv) Upward compatibility is the slogan that should help in having previous work on a language reusable. Disappointingly, it is very difficult to design languages in an upwardly compatible way. More often than not, the redesign of a language will shed light on how to improve its existing version without adding new features. Moreover, these improvements may be needed if a second version of the language is to incorporate quite complex new features.

(v) The main difficulties are caused by the fact that a readable syntax for a language needs to be provided with efficient declaration, type inference and type checking schemes, because otherwise the human reader will soon lose his/her grasp of a piece of syntax due to enormous redundancy. (For computers this problem is much less pressing.) Exactly these mechanisms are closely connected with the design of the concrete vertical syntax and the particular packaging of the features that is employed. Those particularities are however quite hard to keep alive when a next version of the language is made. By the way, these problems would be less pressing if the ambition to provide a human readable textual syntax was given up in favour of a graphical or object-oriented way of working.


(vi) There is no place in the software engineering lifecycle for small scope specification languages (thematic languages and combination languages in the terminology of [BR 89]) that cannot be extended to wide spectrum languages. One will never tolerate the enormous overhead of recoding a formal specification. It should be noticed that adapting a complex specification to a formalism with slightly different mechanisms for import, export and parametrization is fairly unpleasant. It follows that there is little reason to invest in small languages that have their key features designed in such a way that extension to larger languages is hardly possible, or obviously unrewarding. Thus it follows that in particular efforts aimed at the design of small and specialized specification languages should ensure that all features are incorporated in a generic (extendible) way.

To consider an example, ASF [BHK 89] is a language that has some of its key features worked out in a not entirely satisfactory way: in particular the normalization mechanism is such that normalization must be done inside out, since other strategies may lead to essentially different normal forms, and besides this, the information hiding mechanism may only be applied to flat specifications, which is a rather pointless restriction. We view ASF as a step in a bootstrapping process. Its first use should be to assist in the design of much better languages of its own kind.

2. ABSTRACT SYNTAX DESIGN

The style of designing specification languages that is investigated in this report is to design abstract syntax only. The following working hypotheses (2.1 - 2.5) underlie the strategy. Of course validation of these assumptions is a difficult matter that requires substantial experimentation.

2.1 Abstract syntax can be represented in the format of algebraic specifications using many sorted algebra with total functions. At the level of abstract syntax, the issues of type checking, type inference, declarations and the use of declarations for type inference are totally absent.
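To make hypothesis 2.1 concrete, an abstract syntax can be written down as a handful of sorts with total constructor functions and no checking attached. The Python sketch below is an illustration only and is not the BMASF specification: the sorts Sort, Function and Signature and the operations add_sort, add_function and combine are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Sort:
    name: str


@dataclass(frozen=True)
class Function:
    name: str
    arguments: Tuple[Sort, ...]
    result: Sort


@dataclass(frozen=True)
class Signature:
    sorts: FrozenSet[Sort] = frozenset()
    functions: FrozenSet[Function] = frozenset()


def add_sort(sort: Sort, sig: Signature) -> Signature:
    return Signature(sig.sorts | {sort}, sig.functions)


def add_function(fn: Function, sig: Signature) -> Signature:
    # Total: there is no check that the sorts used by fn are declared in sig.
    return Signature(sig.sorts, sig.functions | {fn})


def combine(left: Signature, right: Signature) -> Signature:
    return Signature(left.sorts | right.sorts, left.functions | right.functions)
```

Every operation is defined for all arguments, so a function over an undeclared sort is still a well-formed term; whether it is acceptable is a question for a later checking stage, not for the abstract syntax.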

2.2 All semantic problems of a language should be dealt with at the level of its abstract syntax. Language features that do not allow a coding in an abstract syntax are to be avoided.

2.3 Abstract syntax can be designed in an (almost) upwardly compatible way. If an abstract syntax specification is to be extended with new features, it almost never is a matter of just adding additional sorts, functions and equations. Nevertheless, the modifications may well be very limited.


A typical example is that one needs additional structure in a name space. This will require a small redesign of the use of the names. But the general structure of an abstract syntax description will not be affected by that.

2.4 Abstract syntax can be described in such a way that all equations which describe the semantics of its key ingredients can be added without risking inconsistency of the full specification. Thus when designing abstract syntax, constructions must be avoided that are inconsistent with key semantic identities.

For instance, having an abstract syntax depend on a function that determines the length of an expression will usually prevent one from imposing any non-trivial identities on the sort of those expressions. This is useful only if no such identifications are to be expected during the semantic analysis. On the other hand, when tooling a language one will use algorithms that act on the free term algebra of the abstract syntax and often these terms will have to be represented in a form that allows efficient manipulations (e.g. by the use of pointers).
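The conflict between a length function and semantic identities can be seen on a very small example. The Python sketch below is an illustration under assumptions: expressions are nested tuples of the form ("+", left, right) or atom strings, and the identity x + 0 = x is chosen only as an example of a non-trivial identity.

```python
def length(expr):
    """Number of symbols in a nested-tuple expression."""
    if isinstance(expr, tuple):
        return 1 + sum(length(arg) for arg in expr[1:])
    return 1


def simplify(expr):
    """Apply the identity x + 0 = x bottom-up."""
    if isinstance(expr, tuple) and expr[0] == "+":
        left, right = simplify(expr[1]), simplify(expr[2])
        if right == "0":
            return left
        return ("+", left, right)
    return expr
```

The terms ("+", "x", "0") and "x" are identified by the identity, yet length gives 3 for the first and 1 for the second, so length is not well defined on the identified terms. An abstract syntax that depends on length therefore cannot accept this identity.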

2.5 An increasing family of languages can be designed at the level of abstract syntax. It is possible to analyze features at an abstract level in such a way that one can be confident that language extension will not lead to an entirely different view of the features. In order to guarantee this, it is crucial that many semantic equations can be imposed on the abstract syntax. Indeed it should be noticed that all generic constructions should be representation independent as much as possible. This is best guaranteed by having them consistent with a semantic model that identifies many systems. This implies that one will search for the most abstract semantics that is available.

This is a very tricky area because the more features one introduces, the less abstract the semantic model can be! In the case of process descriptions, this is a reason to use bisimulation semantics rather than less discriminating semantics, such as trace semantics. Features like interrupts, fair abstraction, deadlock analysis and structured operational semantics are harder if not impossible to model in trace semantics. It is also a reason to have only limited interest in fully abstract models. If later on, new features are added to a language, a fully abstract model may suddenly be inconsistent with the novel feature. This can happen to any other model as well, of course, but aiming at full abstraction clearly maximizes the risk.
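The difference in discriminating power between trace semantics and bisimulation semantics shows up on the standard textbook pair a.(b + c) and a.b + a.c, which is not taken from this paper. The Python sketch below is an illustration only: a finite labelled transition system is encoded as a dictionary from states to sets of (action, target) pairs, and the bisimilarity check is a naive greatest-fixpoint computation with hypothetical function names.

```python
def traces(lts, state):
    """All finite traces (tuples of actions) of an acyclic LTS starting in state."""
    result = {()}
    for action, target in lts.get(state, ()):
        result |= {(action,) + rest for rest in traces(lts, target)}
    return result


def bisimilar(lts, p, q):
    """Naive check of strong bisimilarity on a finite LTS."""
    states = set(lts) | {t for moves in lts.values() for _, t in moves}
    relation = {(s, t) for s in states for t in states}

    def ok(s, t):
        # Every move of s must be matched by t, and every move of t by s.
        forward = all(
            any(a == b and (s2, t2) in relation for b, t2 in lts.get(t, ()))
            for a, s2 in lts.get(s, ())
        )
        backward = all(
            any(a == b and (s2, t2) in relation for a, s2 in lts.get(s, ()))
            for b, t2 in lts.get(t, ())
        )
        return forward and backward

    changed = True
    while changed:
        changed = False
        for pair in list(relation):
            if not ok(*pair):
                relation.discard(pair)
                changed = True
    return (p, q) in relation


# P0 = a.(b + c) and Q0 = a.b + a.c
EXAMPLE = {
    "P0": {("a", "P1")},
    "P1": {("b", "P2"), ("c", "P3")},
    "Q0": {("a", "Q1"), ("a", "Q2")},
    "Q1": {("b", "Q3")},
    "Q2": {("c", "Q4")},
}
```

Both processes have the same traces, but they are not bisimilar: after the action a the second process has already committed to b or to c. A feature such as deadlock analysis needs exactly this distinction.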

3. AN OPEN PROBLEM ABOUT THE METHODOLOGY OF LANGUAGE DESIGN

Let us imagine a project that is carried out as follows:

(A) Design a family of specification languages with increasing expressive power and

complexity at the level of abstract syntax and reuse almost all of each language description in

the design of the next language.


(B) Generate in each stage an elementary tool set (ETS) consisting of a concrete textual (vertical, structured) syntax together with the following: a parser, a type checker, a prototyping tool, an interactive editor, a connection with a software engineering data base and a version management system, a cross reference information generator, and an automatic translation to and from earlier stages of the language, as well as an interface with specialized support programs for debugging, verification, proof editing, proof checking, graphical support and object-oriented representation.

(C) Have the overheads on (B) above so small that it is possible to work in (A) with

many small steps rather than with a few big steps.

Now the question is whether or not such an approach is feasible. We have no opinion in advance as to what the answer to this methodological question is. Our point of view is that it is worth trying for the sake of a research project and that its practical value must be determined later on by other people. Thus the open problem is turned into a working hypothesis without any claim as to its validity.

Clearly our research can never generate a negative answer to the methodological question because inability to carry out the project proves nothing about its feasibility. By being successful, it can at best generate positive evidence. We will try to catch the motivation of the work in a few key phrases that may be useful in these times of inflation of terminology.

3.1 ABSTRACT SYNTAX ENGINEERING AND THE ABSTRACT SYNTAX ENGINEERING HYPOTHESIS

Abstract syntax engineering is the incremental design of languages via their abstract syntax. The hypothesis claims that this style of working is economically more efficient than the conventional design via BNF grammars.

3.2 ALGEBRAIC SPECIFICATION HYPOTHESIS (FOR ABSTRACT SYNTAX ENGINEERING)

This hypothesis claims that one gets benefit from the use of algebraic specifications in the case of abstract syntax engineering. In particular the restriction to many sorted algebra with finitely many sorts and total functions is supposed useful.

It may even be useful to structure the signature of the algebra itself as a finite partial algebra. Infinite signatures are to be avoided and must be counted as an indication of limited success in abstract syntax design (of course infinite signatures may play an important role in intermediate stages of a design).

3.3 MEANING IS A CONGRUENCE ON ABSTRACT SYNTAX

This claim is not exactly new. It was analysed in depth in [J 89] and named Frege's principle.

The difficulty is to adhere to this slogan when the abstract syntax gets more complex. In the


context of the abstract syntax engineering hypothesis, this means that the abstract syntax must

be made compatible with most (and preferably all) semantic identities that come about during

semantic analysis of fragments of the language.

3.4 THE HYPOTHESIS OF ABSENCE OF CANONICAL MEANING

From a certain complexity onwards, there is no canonical model for the semantics of an abstract syntax. In particular, there is no such thing as the 'real' practical meaning that users have in their mind and that theorists fail to write down in a concise way, due to an over-emphasis of their mathematical standards and rigour. On the contrary, the practical user has often only an intuitive semantic view on fragments of an abstract language, and simply ignores the question how to integrate these fragments into a consistent picture.

In the paper [BHK 88] on module algebra, it has been emphasized that already a very few constructors for structured algebraic specifications generate a setting for which different useful semantic models can be found and the selection of a single most convincing model seems to be impossible. There is no indication that the practitioner's mind contains hidden semantic information that would resolve the semantic ambiguity of module algebra.

The practical consequence is that abstract syntax must preferably be designed in such a

way that the full spectrum of semantic models known for fragments of the language can still

be captured by considering the right congruence on the abstract syntax.

3.5 THE ABSTRACT SYNTAX DESIGN RULE: OPTIMIZATION OF LOOSE SEMANTICS IN ALGEBRAIC ABSTRACT SYNTAX ENGINEERING

This rule makes explicit the consequences of the hypothesis of absence of canonical meaning. It emphasizes that while designing an abstract syntax one should not optimize the fit with a given semantic model but rather ensure that a maximum of semantic options is left open. The algebraic specification of the abstract syntax will usually have a loose semantics in the sense that its initial algebra is not at all the model with the deepest semantic pretensions. Rather a family of models should exist that reflect the different ways in which fragments of the language can be provided with meaningful semantic models.

3.6 INITIAL OVERSPECIFICATION IS UNAVOIDABLE AND EVEN BENEFICIAL

This final point lies at the heart of the algebraic approach. Once an algebraic specification of abstract syntax has been manufactured, the unavoidable question is: why have just these equations been selected and is this not a totally arbitrary choice? Moreover, the axioms may not allow semantic options that are popular in modern research because the equations identify too much.

The answer to this is that these problems are unavoidable in the algebraic method and that designers who dislike these uncertainties should not employ these techniques. Let us consider the case of group theory as an example. Once the axioms of groups have been written down, a substantial amount of useful theory can be generated. After some time, Abelian groups become important and this is no problem because the group axioms are consistent with the additional axiom of commutativity. Still later however, the success of group theory is such that one starts investigating semi-groups. This is a very different matter because an entire operation has been left out and the specification has been essentially weakened. Suppose that semi-groups after all are the really useful concept and the groups are a subcase of less interest. Then from a methodological point of view, the specification of groups formed an initial overspecification of the semi-groups. How much damage has this caused? We claim that no damage can be observed at all. On the contrary, the initial overspecification of a concept (semi-groups) that later on was found to be very useful has strongly guided the intuition on how to get the appropriate rewards from the algebraic model of the mechanisms involved. Even if semi-groups are the real thing after all, one can never ensure that future generations will not make big progress in the field of 'non-associative semi-groups'!

All of this boils down to the point of view that there will never be a truly undisputed set of axioms for any given signature, even if one has an agreement on the intuitions that the operators of the signature must support. This is true for every field of mathematics, including geometry and analysis, and it would be very optimistic to suppose that discrete systems theory constitutes an exception, just because these discrete systems are man made.

4. ON THE ROLE OF SOFTWARE SUPPORT FOR ALGEBRAIC SPECIFICATIONS IN ABSTRACT SYNTAX ENGINEERING

The first role of software support is simply that with abstract syntax specifications becoming larger, type checking becomes useful. Although we have not been using the ASF system as described in [BHK 89], the type checking facilities of such a system will be needed to avoid mistakes in the specifications. We will discuss which further use a specification in a language such as ASF might have.

Secondly, one might specify a normalization algorithm for structured specifications in BMASF using equations. If the resulting rewrite system is complete it can be prototyped by means of the ASF system. This implies that a non-trivial transformation has to be transformed into an executable (complete) TRS. Given a vertical syntax, one may specify decision algorithms that decide whether a specification satisfies certain design rules (e.g. not containing type-incorrect subexpressions).

Harder but still conceivable is that a syntax directed editor for a vertical syntax for BMASF is specified in ASF and that the ASF system is used to prototype such an environment. Further, one can imagine a specification in ASF of a system that realizes an operational interpretation of a BMASF specification.


Also one may specify in ASF what exactly has to be done if separate type checking of a

modular BMASF specification is to be realized.

5. PRELIMINARIES FOR THE BMASF SPECIFICATION

We need conventions for the use of names. These conventions are not given in the system

ASF but should be workable when translating the specification to ASF.

5.1 SORTS AND LISTS

Primitive sorts are named with identifiers made from one or more capital letters and from

digits always beginning with a letter. Some constructed sort names may involve brackets.

We will use only one sort constructor for lists. The underlying view is, that the algebra of

sorts is a partial one and sort expressions are only defined if the axioms imply this. So the

axiom schemes to be defined in FPN0, FPN1, FPN2 should be instantiated for every list or

cartesian product used further on in the specification.

The letter L always means that the sort denotes finite lists of another sort. So L(AB1)
denotes the sort of lists of AB1. If such lists are needed, the sort must be explicitly specified,
and then the constructor operations and the empty list will come by default. We are using the
notation of [BI 87] and [M 86] for lists. When needed, extensions of the 'automatic signature
introduction mechanism' can be defined in order to allow more of the operators of [BI 87]
and [M 86] to be used without declaration. A useful subset of these notations is selected in
[J 89]: prefix, drop, map, reduce, right reduce, and transpose. Because none of the concrete
notations proposed by Bird and Meertens overlaps in an unpleasant way with notations that
we intend to use or to import from previous work on module algebra (or process algebra), we
will make sure that these notations will not be overloaded in our proposals with quite
different meanings as well, so that some notational consistency can be achieved in the end. Our
subset is collected in the module FPN2; further extensions can be coded in extension
modules when needed. Notice that further extensions require that standard names are introduced
for function types just as for list types. So it is plausible to denote the type of functions from
X to Y by F(X, Y). We get elements of this function space by an operator ^ applied to a
function from X to Y.

The predefined lists of X are always structured by means of the following operators:

[ ]     the empty list of X-objects;
[_]     embedding X into L(X);
_++_    associative concatenation of lists;
_:_     prefixing a list with an object.

In addition, lists of fixed length can be denoted with the following notation:

[a, b, c, d] denotes a list with elements a, b, c, and d.


Notice that the list notation provides no type information. This implies a substantial over-

loading of the notation that must be resolved by making sure that the type of listed objects is

clear without context information.

The reduce operator / must be applied to a function h from pairs of objects to objects (i.e.
h: X × X → X). The value of reduce of a function on the empty list, /h([ ]), must be an
element of X. This element must be supplied, in each case, as the third argument of the reduce
function, so reduce will have three arguments, viz. a function (element of F(CP(X, X), X)), a
list (element of L(X)) and an element (of X).

In every application, h is commutative, associative, and has a unit element equal to /h([ ])
(i.e. h(x, /h([ ])) = x). The intuition is that reduce applies h to the list 'consecutively', so e.g.
/(+, [3,4,6], 0) = 3 + 4 + 6.
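The three-argument reduce can be illustrated with a small Python sketch. This is only an illustration of the intuition given above, not part of the BMASF notation: the name `reduce_right` and the use of Python lists for L(X) are assumptions made for the example.

```python
from typing import Callable, List, TypeVar

X = TypeVar("X")


def reduce_right(h: Callable[[X, X], X], xs: List[X], unit: X) -> X:
    """Three-argument reduce: a function h, a list xs and the value for the empty list."""
    result = unit
    # Work from the right end of the list, so that the result has the shape
    # h(x1, h(x2, ... h(xn, unit))).
    for x in reversed(xs):
        result = h(x, result)
    return result


# The example from the text: /(+, [3,4,6], 0) = 3 + 4 + 6.
total = reduce_right(lambda a, b: a + b, [3, 4, 6], 0)
```

Because h is assumed to be commutative and associative with the third argument as its unit, folding from the left would give the same value; the right fold is chosen here only because it follows the list structure built by prefixing.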

5.2 FUNCTION SPACES

By convention, the type of functions from X to Y is denoted by F(X,Y). We get elements of

this function space by the operator _^ applied to a function from X to Y.

5.3 NATURAL NUMBERS AND BOOLEANS

There is a fixed sort NAT with function succ and constant 0. The cardinality function
#: L(X) → NAT is automatically introduced with every list sort. We also have a fixed sort BOOL
with functions &, ∨, ¬ and constants T, F.

5.4 CONSTANT NAMES AND FUNCTION NAMES

Constant and function names can be systematically disambiguated by subscripting them with

the sorts of their arity. These subscripts may be skipped in the presentation of a specification

as long as disambiguation is possible in an unambiguous way (or: even more liberal but also

less clear in its consequences, as long as all correct disambiguations that can be imagined can

be proven equivalent by means of the axioms).

module FPN0                                      list manipulation operators
begin
  begin signature
    L(X)                                         sort of lists of X-objects (X is a parameter)
    [ ]: L(X)                                    empty list
    _:_ : X × L(X) → L(X)                        prefixing a list by an object
    _++_ : L(X) × L(X) → L(X)                    concatenation of lists
    [_]: X → L(X)                                embedding of X into L(X)
    [_,...,_]                                    constructor notation scheme for finite name lists
                                                 with flexible arity
    #: L(X) → NAT                                length of list (automatically generated with
                                                 L(X) and NAT)
    _∈_: X × L(X) → BOOL                         element of a list
    eq_{L(X)×L(X)→BOOL}: L(X) × L(X) → BOOL      equality on lists
  end signature
  begin equations
    variables x, y, y1, ..., yn ∈ X, l, m ∈ L(X)
    1     x:[ ] = [x]
    2n    x:[y1, ..., yn] = [x, y1, ..., yn]     (n ∈ N)
    3     [ ] ++ l = l
    4     (x:l) ++ m = x:(l ++ m)
    5     #([ ]) = 0
    6     #(x:l) = succ(#(l))
    7     x ∈ [ ] = F
    8     x ∈ y:l = eq(x,y) ∨ x ∈ l
    9     eq([ ], [ ]) = T
    10    eq([ ], x:l) = F
    11    eq(x:l, [ ]) = F
    12    eq(x:l, y:m) = eq(x,y) & eq(l,m)
  end equations
end module FPN0

module FPN1                                      cartesian products
begin
  begin signature
    CP(S1, ..., Sn)                              Cartesian product of sorts S1 to Sn
    (_,...,_): S1 × ... × Sn → CP(S1, ..., Sn)   construction of n-tuple
    πk: CP(S1, ..., Sn) → Sk                     k-th component (for all k with 1 ≤ k ≤ n)
  end signature
  begin equations
    variables si ∈ Si (i = 1, ..., n)
    13    πk((s1, ..., sk, ..., sn)) = sk        k = 1, ..., n
  end equations
end module FPN1

module FPN2                                      functional programming constructions
begin
  begin signature
    F(X, Y)                                      functions from X to Y (parameters X, Y)
    U^: F(X, Y)                                  embedding in function space, for U: X → Y
    _⦃_⦄: F(X, Y) × X → Y                        function application
    _o_: F(X, Y) × F(Y, Z) → F(X, Z)             function composition
    *_: F(X, Y) × L(X) → L(Y)                    map
    /: F(CP(X,X), X) × L(X) × X → X              reduce
  end signature
  begin equations
    variables x, x' ∈ X, y ∈ Y, f ∈ F(X,Y), g ∈ F(Y,Z), h ∈ F(CP(X,X), X), l ∈ L(X), U: X → Y
    14    U^⦃x⦄ = U(x)
    15    fog⦃x⦄ = f⦃g⦃x⦄⦄
    16    *f([ ]) = [ ]
    17    *f(x:l) = f⦃x⦄:*f(l)
    18    /(h, [ ], x') = x'
    19    /(h, x:l, x') = h⦃x, /(h, l, x')⦄
  end equations
end module FPN2

module BOOLEANS
begin
  begin signature
    BOOL                                         sort of booleans
    T: BOOL                                      true
    F: BOOL                                      false
    _&_: BOOL × BOOL → BOOL                      conjunction
    _∨_: BOOL × BOOL → BOOL                      disjunction
    ¬_: BOOL → BOOL                              negation
  end signature
  begin equations
    variables b, c ∈ BOOL
    20    T ∨ b = T
    21    b ∨ T = T
    22    F ∨ F = F
    23    ¬T = F
    24    ¬F = T
    25    b & c = ¬((¬b) ∨ (¬c))
  end equations
end module BOOLEANS

module NATURALS
begin
  begin signature
    NAT                                          natural numbers
    0: NAT                                       zero
    succ: NAT → NAT                              successor
    eq: NAT × NAT → BOOL                         equality
  end signature
  begin equations
    variables n, m ∈ NAT
    26    eq(n,n) = T
    27    eq(0, succ(n)) = F
    28    eq(succ(n), 0) = F
    29    eq(succ(n), succ(m)) = eq(n,m)
  end equations
end module NATURALS

Note that when one writes an equation it must be ensured that this equation has a proper
typing. Therefore an equation such as

X = [succ(0)] ++ [succ(succ(0)), X]

is unreadable because there is no type assignment for X.

6. SPECIFICATION OF BMASF

Now we will start the design of the specification language BMASF. BMASF combines Basic
Module Algebra (see [BHK 88]) with ASF. Figure 1 shows part of the signature of the first
module, the module ELEMENTS (renamings, and the signatures of NAT and BOOL, are not
shown).

In the equations of this module, we will use meta-variables. These meta-variables range
over a finite set of (regular) variables, and are used to cut down on the number of equations.
It is straightforward to expand the equations in which meta-variables occur, in order to
eliminate them.

ELEMENTS defines the basic elements of the language: general names, sort declarations,
constant declarations, relation declarations, function declarations and variable
declarations. Furthermore, equalities and renamings are defined on these basic elements. The
renaming functions are defined with the help of a transposition function σ with three
arguments. The result of the transposition function is the value of its third argument after
the first two arguments have been swapped (transposed).
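The behaviour of the transposition function can be sketched in Python as follows. The function names `transpose` and `transpose_list`, and the use of strings for general names, are assumptions made for this illustration only.

```python
from typing import List


def transpose(n: str, m: str, x: str) -> str:
    """Swap the names n and m; every other name is left unchanged."""
    if x == n:
        return m
    if x == m:
        return n
    return x


def transpose_list(n: str, m: str, names: List[str]) -> List[str]:
    """Apply the transposition elementwise to a list of names."""
    return [transpose(n, m, x) for x in names]


renamed = transpose_list("a", "b", ["a", "b", "c"])
# Applying the same transposition twice restores the original list.
restored = transpose_list("a", "b", renamed)
```

Since a transposition is its own inverse, renamings built on it can be undone by applying them a second time.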

FIGURE 1. Part of the signature of ELEMENTS

module ELEMENTS
begin
  begin signature
    GN                                             sort of (general) names
    eq_{GN×GN→BOOL}: GN × GN → BOOL                equality on names
    i_{NAT→GN}: NAT → GN                           embedding of numbers in names
    σ_{GN×GN×GN→GN}: GN × GN × GN → GN             name transposition
    σ_{GN×GN×L(GN)→L(GN)}: GN × GN × L(GN) → L(GN) name transposition on a list
    SD                                             sort of sort declarations
    S:_ : GN → SD                                  sort names
    eq_{SD×SD→BOOL}: SD × SD → BOOL                equality on sorts
    CD                                             sort of constant declarations
    C:_:_ : GN × SD → CD                           constant names
    eq_{CD×CD→BOOL}: CD × CD → BOOL                equality on constants
    RD                                             sort of relation declarations
    R:_:_ : GN × L(SD) → RD                        relation names
    eq_{RD×RD→BOOL}: RD × RD → BOOL                equality on relations
    FD                                             sort of function declarations
    F:_:_→_ : GN × L(SD) × SD → FD                 function names
    eq_{FD×FD→BOOL}: FD × FD → BOOL                equality on functions
    VD                                             sort of variable declarations
    V:_:_ : GN × SD → VD                           variable names
    eq_{VD×VD→BOOL}: VD × VD → BOOL                equality on variables
    ATREN                                          sort of atomic renamings
    id: ATREN                                      identity renaming
    r_{SD×GN→ATREN}: SD × GN → ATREN               transposition of sort names
    r_{CD×GN→ATREN}: CD × GN → ATREN               transposition of constant names
    r_{FD×GN→ATREN}: FD × GN → ATREN               transposition of function names
    r_{RD×GN→ATREN}: RD × GN → ATREN               transposition of relation names
    r_{VD×GN→ATREN}: VD × GN → ATREN               transposition of variable names
    _._{ATREN×SD→SD}: ATREN × SD → SD              application of atomic renaming
    _._{ATREN×CD→CD}: ATREN × CD → CD              application of atomic renaming
    _._{ATREN×FD→FD}: ATREN × FD → FD              application of atomic renaming
    _._{ATREN×RD→RD}: ATREN × RD → RD              application of atomic renaming
    _._{ATREN×VD→VD}: ATREN × VD → VD              application of atomic renaming
  end signature
  begin equations
    variables k, k' ∈ NAT, n, m, n', m' ∈ GN, e ∈ L(GN), s, s', t, t' ∈ SD, l, j ∈ L(SD), c ∈ CD,
              f ∈ FD, q ∈ RD, v ∈ VD, rn ∈ ATREN
    meta-variables φ, ψ
    30    eq(i(k), i(k')) = eq(k,k')
    31    σ(n,m,n) = m
    32    σ(n,m,m) = n
    33    eq(n,n') = F & eq(m,n') = F ⇒ σ(n,m,n') = n'
    34    σ(n,m,[ ]) = [ ]
    35    σ(n,m,n':e) = σ(n,m,n'):σ(n,m,e)
    36    eq(S:n, S:m) = eq(n,m)
    37    eq(C:n:s, C:m:t) = eq(n,m) & eq(s,t)
    38    eq(R:n:l, R:m:j) = eq(n,m) & eq(l,j)
    39    eq(F:n:l→s, F:m:j→t) = eq(n,m) & eq(s:l, t:j)
    40    eq(V:n:s, V:m:t) = eq(n,m) & eq(s,t)
    41    r(S:n, n) = id
    42    r(C:n:s, n) = id
    43    r(R:n:l, n) = id
    44    r(F:n:l→s, n) = id
    45    r(V:n:s, n) = id
    46    r(S:n, m) = r(S:m, n)
    47    r(C:n:s, m) = r(C:m:s, n)
    48    r(R:n:l, m) = r(R:m:l, n)
    49    r(F:n:l→s, m) = r(F:m:l→s, n)
    50    r(V:n:s, m) = r(V:m:s, n)
    51    id.φ = φ                                  for φ ∈ {s,c,f,q,v}
    52    r(S:n, m).(S:n') = S:σ(n,m,n')
    53    r(S:n, m).(C:n':(S:m')) = C:n':(S:σ(n,m,m'))
    54    r(S:n, m).(F:n':(*S^:e) → S:m') = F:n':(*S^:σ(n,m,e)) → S:σ(n,m,m')
    55    r(S:n, m).(R:n':(*S^:e)) = R:n':(*S^:σ(n,m,e))
    56    r(S:n, m).(V:n':(S:m')) = V:n':(S:σ(n,m,m'))
    57    r(C:n:s, m).(C:n':s) = C:σ(n,m,n'):s
    58    eq(s,t) = F ⇒ r(C:n:s, m).(C:n':t) = C:n':t
    59    r(F:n:l→s, m).(F:n':l→s) = F:σ(n,m,n'):l→s
    60    eq(s:l, t:j) = F ⇒ r(F:n:l→s, m).(F:n':j→t) = F:n':j→t
    61    r(R:n:l, m).(R:n':l) = R:σ(n,m,n'):l
    62    eq(l,j) = F ⇒ r(R:n:l, m).(R:n':j) = R:n':j
    63    r(V:n:s, m).(V:n':s) = V:σ(n,m,n'):s
    64    eq(s,t) = F ⇒ r(V:n:s, m).(V:n':t) = V:n':t
    65    r(φ, n).ψ = ψ                             for φ ∈ {c,f,q,v}, ψ ∈ {s,c,f,q,v}, φ ≠ ψ
    66    rn.(rn.φ) = φ                             for φ ∈ {s,c,f,q,v}
  end equations
end ELEMENTS

Next, we will describe signatures. First, we list the module ATOMICSIGNATURES, in

which we embed the elements of the previous module. In turn, we embed the module

ATOMICSIGNATURES into SIGNATURES. A picture of their signature is fig. 2, be-

tween the two modules.

module ATOMICSIGNATURES
begin
  begin signature
    ATSIG                                            sort of atomic signatures
    i_{SD→ATSIG}: SD → ATSIG                         embed sort declaration as atomic signature
    i_{CD→ATSIG}: CD → ATSIG                         embed constant declaration as atomic signature
    i_{RD→ATSIG}: RD → ATSIG                         embed relation declaration as atomic signature
    i_{FD→ATSIG}: FD → ATSIG                         embed function declaration as atomic signature
    i_{VD→ATSIG}: VD → ATSIG                         embed variable declaration as atomic signature
    eq_{ATSIG×ATSIG→BOOL}: ATSIG × ATSIG → BOOL      equality on atomic signatures
    _._{ATREN×ATSIG→ATSIG}: ATREN × ATSIG → ATSIG    application of at. renaming
  end signature
  begin equations
    variables s, t ∈ SD, c, d ∈ CD, r, q ∈ RD, f, g ∈ FD, v, w ∈ VD, n ∈ GN, rn ∈ ATREN,
              a ∈ ATSIG
    meta-variables φ, ψ
    67    eq(i(s), i(t)) = eq(s,t)
    68    eq(i(c), i(d)) = eq(c,d)
    69    eq(i(r), i(q)) = eq(r,q)
    70    eq(i(f), i(g)) = eq(f,g)
    71    eq(i(v), i(w)) = eq(v,w)
    72    eq(i(φ), i(ψ)) = F                        for φ, ψ ∈ {s,c,q,f,v}, φ ≠ ψ
    73    rn.i(φ) = i(rn.φ)                         for φ ∈ {s,c,f,q,v}
    74    rn.(rn.a) = a
  end equations
end module ATOMICSIGNATURES

FIGURE 2. Signature of ATOMICSIGNATURES, SIGNATURES

module SIGNATURES
begin
  begin signature
    SIG                                          sort of signatures
    i_{ATSIG→SIG}: ATSIG → SIG                   conversion of atomic signatures into signatures
    _+_{SIG×SIG→SIG}: SIG × SIG → SIG            signature combination
    ∅_{SIG}: SIG                                 empty signature
    eq_{SIG×SIG→BOOL}: SIG × SIG → BOOL          equality on signatures
    _∩_: SIG × SIG → SIG                         intersection
    _∈_{ATSIG×SIG→BOOL}: ATSIG × SIG → BOOL      element of a signature
    _Δ_: ATSIG × SIG → SIG                       deletion of an element of a signature
    _⊇_: SIG × SIG → BOOL                        signature inclusion
    _._{ATREN×SIG→SIG}: ATREN × SIG → SIG        application of atomic renaming
    Σ_{ATREN→SIG}: ATREN → SIG                   signature of an atomic renaming
    invΣ: ATREN → SIG                            sorts mentioned in, but invariant under renaming
  end signature
  begin equations
    variables x, y, z ∈ SIG, n, m ∈ GN, s, t ∈ SD, l ∈ L(SD), c, d ∈ CD, r, q ∈ RD, f, g ∈ FD,
              v, w ∈ VD, a ∈ ATSIG, rn ∈ ATREN
    meta-variables φ, ψ
    75    x + ∅ = x
    76    x + x = x
    77    x + y = y + x
    78    (x + y) + z = x + (y + z)
    79    i(i(C:n:s)) = i(i(C:n:s)) + i(i(s))
    80    i(i(R:n:l)) = i(i(R:n:l)) + /(+^, *(i^oi^)(l), ∅)
    81    i(i(F:n:l→s)) = i(i(F:n:l→s)) + /(+^, *(i^oi^)(s:l), ∅)
    82    i(i(V:n:s)) = i(i(V:n:s)) + i(i(s))
    83    a ∈ ∅ = F
    84    i(s) ∈ i(i(t)) = eq(s,t)
    85    i(s) ∈ i(i(C:n:t)) = eq(s,t)
    86    i(s) ∈ i(i(R:n:l)) = s ∈ l
    87    i(s) ∈ i(i(F:n:l→t)) = s ∈ t:l
    88    i(s) ∈ i(i(V:n:t)) = eq(s,t)
    89    i(φ) ∈ i(i(ψ)) = eq(i(φ), i(ψ))           for φ ∈ {c,r,f,v}, ψ ∈ {s,c,r,f,v}
    90    a ∈ (x + y) = a ∈ x ∨ a ∈ y
    91    x ∩ ∅ = ∅
    92    x ∩ x = x
    93    x ∩ y = y ∩ x
    94    (x ∩ y) ∩ z = x ∩ (y ∩ z)
    95    i(s) ∈ x = F ⇒ i(i(s)) ∩ x = ∅
    96    i(C:n:s) ∈ x = F ⇒ i(i(C:n:s)) ∩ x = i(i(s)) ∩ x
    97    i(R:n:l) ∈ x = F ⇒ i(i(R:n:l)) ∩ x = /(+^, *(i^oi^)(l), ∅) ∩ x
    98    i(F:n:l→s) ∈ x = F ⇒ i(i(F:n:l→s)) ∩ x = /(+^, *(i^oi^)(s:l), ∅) ∩ x
    99    i(V:n:s) ∈ x = F ⇒ i(i(V:n:s)) ∩ x = i(i(s)) ∩ x
    100   a ∈ x = T ⇒ i(a) ∩ x = i(a)
    101   (x + y) ∩ z = (x ∩ z) + (y ∩ z)
    102   a ∈ x = F ⇒ a Δ x = x
    103   i(s) Δ i(i(s)) = ∅
    104   i(s) Δ i(i(C:n:s)) = ∅
    105   s ∈ l = T ⇒ i(s) Δ i(i(R:n:l)) = i(s) Δ /(+^, *(i^oi^)(l), ∅)
    106   s ∈ t:l = T ⇒ i(s) Δ i(i(F:n:l→t)) = i(s) Δ /(+^, *(i^oi^)(t:l), ∅)
    107   i(s) Δ i(i(V:n:s)) = ∅
    108   i(C:n:s) Δ i(i(C:n:s)) = i(i(s))
    109   i(R:n:l) Δ i(i(R:n:l)) = /(+^, *(i^oi^)(l), ∅)
    110   i(F:n:l→s) Δ i(i(F:n:l→s)) = /(+^, *(i^oi^)(s:l), ∅)
    111   i(V:n:s) Δ i(i(V:n:s)) = i(i(s))
    112   a Δ (x + y) = a Δ x + a Δ y
    113   x = y + z ⇒ x ⊇ y = T
    114   a ∈ y = T & a ∈ x = F ⇒ x ⊇ y = F
    115   eq(x,y) = x ⊇ y & y ⊇ x
    116   rn.∅ = ∅
    117   rn.i(a) = i(rn.a)
    118   rn.(x + y) = rn.x + rn.y
    119   rn.(rn.x) = x
    120   Σ(id) = ∅
    121   eq(n,m) = F ⇒ Σ(r(S:n, m)) = i(i(S:n)) + i(i(S:m))
    122   eq(n,m) = F ⇒ Σ(r(C:n:s, m)) = i(i(C:n:s)) + i(i(C:m:s))
    123   eq(n,m) = F ⇒ Σ(r(F:n:l→s, m)) = i(i(F:n:l→s)) + i(i(F:m:l→s))
    124   eq(n,m) = F ⇒ Σ(r(R:n:l, m)) = i(i(R:n:l)) + i(i(R:m:l))
    125   eq(n,m) = F ⇒ Σ(r(V:n:s, m)) = i(i(V:n:s)) + i(i(V:m:s))
    126   invΣ(r(s, n)) = ∅
    127   eq(n,m) = F ⇒ invΣ(r(C:n:s, m)) = i(i(s))
    128   eq(n,m) = F ⇒ invΣ(r(F:n:l→s, m)) = i(i(s)) + /(+^, *(i^oi^)(l), ∅)
    129   eq(n,m) = F ⇒ invΣ(r(R:n:l, m)) = /(+^, *(i^oi^)(l), ∅)
    130   eq(n,m) = F ⇒ invΣ(r(V:n:s, m)) = i(i(s))
  end equations
end module SIGNATURES

The equations for SIGNATURES express the fact that combination of signatures behaves
like set union with ∅ as the empty set. Moreover there are axioms that imply that signatures
are closed, i.e. that sorts occurring in the arity of a function or relation also occur in the
signature (79 - 82). The axioms here are modeled after the set in [BHK 88].
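The set-like reading of signatures, including closure under the sorts of a declaration, can be sketched in Python. The tuple encoding of declarations and the helper names `sorts_of`, `embed` and `combine` are assumptions made for this illustration; they are not part of the specification.

```python
from typing import FrozenSet, Tuple

Decl = Tuple  # ("S", name), ("C", name, sort), ("V", name, sort),
              # ("R", name, arg_sorts) or ("F", name, arg_sorts, result_sort)


def sorts_of(decl: Decl) -> FrozenSet[Decl]:
    """Sort declarations occurring in the arity of a declaration."""
    kind = decl[0]
    if kind == "S":
        return frozenset()
    if kind in ("C", "V"):
        return frozenset({decl[2]})
    if kind == "R":
        return frozenset(decl[2])
    if kind == "F":
        return frozenset(decl[2]) | {decl[3]}
    raise ValueError(f"unknown declaration kind: {kind}")


def embed(decl: Decl) -> FrozenSet[Decl]:
    """Closed signature generated by one declaration: the declaration plus its sorts."""
    return frozenset({decl}) | sorts_of(decl)


def combine(x: FrozenSet[Decl], y: FrozenSet[Decl]) -> FrozenSet[Decl]:
    """Signature combination, read as set union."""
    return x | y


NAT = ("S", "NAT")
succ = ("F", "succ", (NAT,), NAT)
sig = combine(embed(succ), embed(("C", "0", NAT)))
```

With this encoding, idempotence, commutativity and the unit law for the empty signature hold simply because they hold for set union.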

In the next module, we define expressions over a certain signature.

FIGURE 3. Signature of EXPRESSIONS

module EXPRESSIONS
begin
  begin signature
    EXP                                              sort of expressions
    default: SIG × SD → EXP                          default value of an expression
                                                     (wrongly typed expression with base signature and sort)
    i_{VD→EXP}: VD → EXP                             conversion of variable to expression
    i_{CD→EXP}: CD → EXP                             conversion of constants to expressions
    apf: FD × L(EXP) → EXP                           application of a function to an expression list
    Σ_{EXP→SIG}: EXP → SIG                           signature of expression
    S: EXP → SD                                      sort of an expression
    [_/_]_{VD×EXP→F(EXP,EXP)}: VD × EXP → F(EXP,EXP) substitution of an expression for a variable
    _._{ATREN×EXP→EXP}: ATREN × EXP → EXP            application of atomic renaming
    _._{ATREN×L(EXP)→L(EXP)}: ATREN × L(EXP) → L(EXP) application of renaming
  end signature
  begin equations
    variables n ∈ GN, s, t ∈ SD, f ∈ FD, x ∈ SIG, e ∈ EXP, k ∈ L(EXP), l ∈ L(SD), v, w ∈ VD,
              c ∈ CD, rn ∈ ATREN
    131   S(i(V:n:s)) = s
    132   S(i(C:n:s)) = s
    133   S(default(x,s)) = s
    134   eq(l, *S^(k)) = F ⇒ apf(F:n:l→s, k) = default(i(i(F:n:l→s)) + /(+^, *Σ^(k), ∅), s)
    135   S(apf(F:n:l→s, k)) = s
    136   Σ(default(x,s)) = x
    137   Σ(i(v)) = i(i(v))
    138   Σ(i(c)) = i(i(c))
    139   Σ(apf(f, k)) = i(i(f)) + /(+^, *Σ^(k), ∅)
    140   i(v) ∈ x = T ⇒ [v/e]⦃default(x,s)⦄ = default((i(v) Δ x) + Σ(e), s)
    141   i(v) ∈ x = F ⇒ [v/e]⦃default(x,s)⦄ = default(x,s)
    142   [v/e]⦃i(v)⦄ = e
    143   eq(v,w) = F ⇒ [v/e]⦃i(w)⦄ = i(w)
    144   [v/e]⦃i(c)⦄ = i(c)
    145   [v/e]⦃apf(f, k)⦄ = apf(f, *[v/e](k))
    146   rn.default(x,s) = default(rn.x, rn.s)
    147   rn.i(v) = i(rn.v)
    148   rn.i(c) = i(rn.c)
    149   rn.[ ] = [ ]
    150   rn.(e:k) = (rn.e):(rn.k)
    151   rn.apf(f, k) = apf(rn.f, rn.k)
    152   rn.(rn.e) = e
  end equations
end module EXPRESSIONS

The axioms for expressions must define the visible signature of all expressions, taking into

account the role that the signature of an expression is just the collection of all constants, vari-

ables, relations and function symbols that occur in it. An incorrectly typed expression should be equated with the default expression of the corresponding signature and sort (134). One

needs that for all closed expressions of type EXPRESSION the visible signature can be cal-

culated.
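The treatment of incorrectly typed applications can be sketched in Python. The classes `Fun`, `Atom`, `App` and `Default` and the functions `sort_of`, `sig_of` and `apf` are hypothetical names chosen for this illustration, and sorts are simplified to strings; the sketch only mirrors the idea that a wrongly typed application collapses to a default expression that still carries the full visible signature.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple, Union


@dataclass(frozen=True)
class Fun:
    name: str
    arg_sorts: Tuple[str, ...]
    result: str


@dataclass(frozen=True)
class Atom:
    kind: str  # "V" for a variable, "C" for a constant
    name: str
    sort: str


@dataclass(frozen=True)
class App:
    fun: Fun
    args: Tuple["Exp", ...]


@dataclass(frozen=True)
class Default:
    sig: FrozenSet[object]
    sort: str


Exp = Union[Atom, App, Default]


def sort_of(e: Exp) -> str:
    """Sort of an expression; an application has the result sort of its function."""
    return e.fun.result if isinstance(e, App) else e.sort


def sig_of(e: Exp) -> FrozenSet[object]:
    """Visible signature: all symbols occurring in e, together with their sorts."""
    if isinstance(e, Default):
        return e.sig
    if isinstance(e, Atom):
        return frozenset({e, ("S", e.sort)})
    sig = frozenset({e.fun}) | {("S", s) for s in e.fun.arg_sorts + (e.fun.result,)}
    for a in e.args:
        sig |= sig_of(a)
    return sig


def apf(f: Fun, args: Sequence[Exp]) -> Exp:
    """Apply f; a sort mismatch yields the default of the result sort."""
    candidate = App(f, tuple(args))
    if tuple(sort_of(a) for a in args) != f.arg_sorts:
        return Default(sig_of(candidate), f.result)
    return candidate


succ = Fun("succ", ("NAT",), "NAT")
zero = Atom("C", "0", "NAT")
true = Atom("C", "T", "BOOL")
good = apf(succ, [zero])
bad = apf(succ, [true])
```

Here `bad` is a default expression of sort NAT whose signature still mentions `succ` and the constant `T`, so the visible signature can be calculated for ill-typed terms as well.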

Next, we look at formulas over these expressions.

FIGURE 4. Signature of FORMULAS

module FORMULAS
begin
  begin signature
    FOR                                              sort of formulas
    T_{SIG→FOR}: SIG → FOR                           constant formula true with signature
    F_{SIG→FOR}: SIG → FOR                           constant formula false with signature
    apr: RD × L(EXP) → FOR                           application of a relation
    eqfor: EXP × EXP → FOR                           atomic formula equating two terms
    forall: VD × FOR → FOR                           universal quantification
    exists: VD × FOR → FOR                           existential quantification
    _and_: FOR × FOR → FOR                           conjunction
    _or_: FOR × FOR → FOR                            disjunction
    non: FOR → FOR                                   negation
    implies: FOR × FOR → FOR                         implication
    Σ_{FOR→SIG}: FOR → SIG                           signature of formula
    [_/_]_{VD×EXP→F(FOR,FOR)}: VD × EXP → F(FOR,FOR) substitution of an expression for a variable
    FREE: VD × FOR → BOOL                            variable is free in formula
    _._{ATREN×FOR→FOR}: ATREN × FOR → FOR            application of atomic renaming
  end signature
  begin equations
    variables x ∈ SIG, k ∈ L(EXP), s ∈ SD, r ∈ RD, l ∈ L(SD), n ∈ GN, p, q, e ∈ EXP,
              f, g, h ∈ FOR, v, w ∈ VD, rn ∈ ATREN
    153   eq(l, *S^(k)) = F ⇒ apr(R:n:l, k) = F(i(i(R:n:l)) + /(+^, *Σ^(k), ∅))
    154   Σ(T(x)) = x
    155   Σ(F(x)) = x
    156   Σ(apr(r, k)) = i(i(r)) + /(+^, *Σ^(k), ∅)
    157   Σ(eqfor(p,q)) = Σ(p) + Σ(q)
    158   Σ(forall(v, f)) = i(i(v)) Δ Σ(f)
    159   Σ(exists(v, f)) = i(i(v)) Δ Σ(f)
    160   Σ(f and g) = Σ(f) + Σ(g)
    161   Σ(f or g) = Σ(f) + Σ(g)
    162   Σ(non(f)) = Σ(f)
    163   Σ(implies(f,g)) = Σ(f) + Σ(g)
    164   non(T(x)) = F(x)
    165   non(F(x)) = T(x)
    166   non(f) or f = T(Σ(f))
    167   non(f) and f = F(Σ(f))
    168   implies(f, g or f) = T(Σ(f) + Σ(g))
    169   f or f = f
    170   f or g = g or f
    171   implies((f or g) and (non(f) or g), g) = T(Σ(f) + Σ(g))
    172   f and g = non(non(f) or non(g))
    173   implies(f,g) = non(f) or g
    174   f and implies(f,g) = (f and implies(f,g)) and g
    175   i(v) ∈ x = T ⇒ [v/e]⦃T(x)⦄ = T((i(v) Δ x) + Σ(e))
    176   i(v) ∈ x = F ⇒ [v/e]⦃T(x)⦄ = T(x)
    177   i(v) ∈ x = T ⇒ [v/e]⦃F(x)⦄ = F((i(v) Δ x) + Σ(e))
    178   i(v) ∈ x = F ⇒ [v/e]⦃F(x)⦄ = F(x)
    179   [v/e]⦃apr(r, k)⦄ = apr(r, *[v/e](k))
    180   [v/e]⦃eqfor(p,q)⦄ = eqfor([v/e]⦃p⦄, [v/e]⦃q⦄)
    181   eq(v,w) = T ⇒ [v/e]⦃forall(w, f)⦄ = forall(w, f)
    182   eq(v,w) = F & FREE(w,e) = F ⇒ [v/e]⦃forall(w, f)⦄ = forall(w, [v/e]⦃f⦄)
    183   eq(v,w) = T ⇒ [v/e]⦃exists(w, f)⦄ = exists(w, f)
    184   eq(v,w) = F & FREE(w,e) = F ⇒ [v/e]⦃exists(w, f)⦄ = exists(w, [v/e]⦃f⦄)
    185   [v/e]⦃f or g⦄ = [v/e]⦃f⦄ or [v/e]⦃g⦄
    186   [v/e]⦃non(f)⦄ = non([v/e]⦃f⦄)
    187   FREE(v, f) = i(v) ∈ Σ(f)
    188   FREE(w, f) = F & eq(S(i(v)), S(i(w))) = T ⇒ forall(v, f) = forall(w, [v/i(w)]⦃f⦄)
    189   FREE(w, f) = F & eq(S(i(v)), S(i(w))) = T ⇒ exists(v, f) = exists(w, [v/i(w)]⦃f⦄)
    190   eqfor(f,f) = T(Σ(f))
    191   implies(eqfor(p,q) and [v/p]⦃f⦄, [v/q]⦃f⦄) = T((i(i(v)) Δ Σ(f)) + (Σ(p) + Σ(q)))
    192   forall(v, f) = non(exists(v, non(f)))
    193   FREE(v, e) = F ⇒ implies([v/e]⦃f⦄, exists(v, f)) = T((i(i(v)) Δ Σ(f)) + Σ(e))
    194   FREE(v, g) = F ⇒ implies(implies(f,g), implies(exists(v,f), g)) = T(Σ(f) + Σ(g))
    195   rn.T(x) = T(rn.x)
    196   rn.F(x) = F(rn.x)
    197   rn.apr(q, k) = apr(rn.q, rn.k)
    198   rn.eqfor(p,q) = eqfor(rn.p, rn.q)
    199   rn.(f or g) = rn.f or rn.g
    200   rn.(non(f)) = non(rn.f)
    201   rn.(exists(v, f)) = exists(rn.v, rn.f)
    202   rn.(rn.f) = f
  end equations
end module FORMULAS

The situation with formulas is comparable with that of expressions, be it that the role of the

default formula is now played by F(x) for the right signature (153). Moreover, the visible

signature must be defined for all formulas. Then we need axioms that allow α-conversion of

variables bound by existential and universal quantification (188, 189). Further, all axioms of

predicate logic can be coded in the format of module algebra by writing them as equivalences

between conjunctions of axioms (164 - 174, 190 - 194). In the following section on mod-

ules, one will find an axiom that allows us to split an atomic module consisting of the con-

junction of two formulas, in a combination of atomic modules (203). Thus it will be possible

to remove the long conjunctions that are generated by the coding of predicate logic inference

rules in this section.

module MODULES
begin
  begin signature
    ASM                                          sort of algebraic specification modules
    ⟨_⟩: FOR → ASM                               atomic module
    T_{SIG→ASM}: SIG → ASM                       embedding of signatures in ASM
    _+_{ASM×ASM→ASM}: ASM × ASM → ASM            combination of modules
    _□_: SIG × ASM → ASM                         export operator
    Σ_{ASM→SIG}: ASM → SIG                       visible signature
    _._{ATREN×ASM→ASM}: ATREN × ASM → ASM        application of atomic renaming
  end signature
  begin equations
    variables f, g ∈ FOR, u, v ∈ SIG, X, Y, Z ∈ ASM, rn ∈ ATREN
    203   ⟨f and g⟩ = ⟨f⟩ + ⟨g⟩
    204   Σ(⟨f⟩) = Σ(f)
    205   Σ(T(u)) = u
    206   Σ(X + Y) = Σ(X) + Σ(Y)
    207   Σ(u □ X) = u ∩ Σ(X)
    208   X + Y = Y + X
    209   (X + Y) + Z = X + (Y + Z)
    210   T(u + v) = T(u) + T(v)
    211   X + T(Σ(X)) = X
    212   X + (u □ X) = X
    213   Σ(X) □ X = X
    214   u □ (v □ X) = (u ∩ v) □ X
    215   u □ (T(v) + X) = T(u ∩ v) + (u □ X)
    216   u ⊇ Σ(X) ∩ Σ(Y) ⇒ u □ (X + Y) = (u □ X) + (u □ Y)
    217   Σ(rn.X) = rn.Σ(X)
    218   rn.⟨g⟩ = ⟨rn.g⟩
    219   rn.T(u) = T(rn.u)
    220   rn.(X + Y) = rn.X + rn.Y
    221   rn.(u □ X) = (rn.u) □ (rn.X)
    222   rn.(rn.X) = X
    223   Σ(rn) ∩ Σ(X) = invΣ(rn) ⇒ rn.X = X
  end equations
end module MODULES

FIGURE 5. Signature of MODULES.

These equations for algebraic specification modules require just the axioms of module algebra
for the part of the signature written above (204 - 223). For comments on these axioms see
[BHK 88].

Next, we look at declarations. When we combine declarations with algebraic specification
modules, we extend the signature of such modules, and we will have to re-examine some
axioms. A picture for the signature of the modules DECLARATIONS and ENVIRONMENTS
can be found between the two.

module DECLARATIONS
begin
  begin signature
    DE                                           sort of declaration environments
    (asm: _ = _): GN × ASM → DE                  asm declaration
    (sig: _ = _): GN × SIG → DE                  signature declaration
    (bool: _ = _): GN × BOOL → DE                boolean declaration
    imp_{GN→SIG}: GN → SIG                       import signature expression
    imp_{GN→ASM}: GN → ASM                       import module expression
    _+_{DE×DE→DE}: DE × DE → DE                  combination of declaration environments
    ∅_{DE}: DE                                   empty declaration environment
    _._{ATREN×DE→DE}: ATREN × DE → DE            application of atomic renaming
    _._{ATREN×BOOL→BOOL}: ATREN × BOOL → BOOL    application of at. renaming
  end signature
  begin equations
    variables p, q, q' ∈ DE, X ∈ ASM, u ∈ SIG, b ∈ BOOL, n ∈ GN, rn ∈ ATREN
    224   p + ∅ = p
    225   p + q = q + p
    226   (p + q) + q' = p + (q + q')
    227   p + p = p
    228   rn.(asm: n = X) = (asm: n = rn.X)
    229   rn.(sig: n = u) = (sig: n = rn.u)
    230   rn.(bool: n = b) = (bool: n = rn.b)
    231   rn.(D + E) = rn.D + rn.E
    232   rn.∅ = ∅
    233   rn.T = T
    234   rn.F = F
  end equations
end module DECLARATIONS

FIGURE 6. Signature of DECLARATIONS, ENVIRONMENTS

The main equation for the abbreviation mechanism is a body replacement axiom that allows one to
replace a name with the module that it stands for (260, 261). Renamings have to be specified
as well for the abbreviation declaration. Notice that since we allow declarations with module
expressions, we will also get them with each target sort of ASM. In this case, this is only
SIG, which in turn has BOOL as a target sort. The axioms are phrased so that the construct
DASM (and DSIG, DBOOL) can be moved to the outside, so that each module expression
can be written in the form DASM(p, X), where X contains no declarations.
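The effect of body replacement can be illustrated with a small Python sketch. The encoding of module expressions as nested tuples, the tag `"imp"`, the dictionary standing in for a declaration environment and the function name `expand` are all assumptions made for this example; the sketch also assumes that declarations are not cyclic.

```python
from typing import Dict, Tuple, Union

Expr = Union[str, Tuple]  # nested tuples such as ("+", left, right) or ("imp", name)


def expand(env: Dict[str, Expr], expr: Expr) -> Expr:
    """Replace every import of a declared name by the body it stands for."""
    if isinstance(expr, tuple) and len(expr) == 2 and expr[0] == "imp":
        name = expr[1]
        # Body replacement only applies to names declared in the environment.
        if name in env:
            return expand(env, env[name])
        return expr
    if isinstance(expr, tuple):
        return tuple(expand(env, part) for part in expr)
    return expr


env = {
    "NATS": ("atom", "nat-axioms"),
    "LISTS": ("+", ("imp", "NATS"), ("atom", "list-axioms")),
}
body = expand(env, ("imp", "LISTS"))
```

After expansion the module expression contains no imports of declared names, which corresponds to the normal form in which all declarations sit on the outside.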

module ENVIRONMENTS
begin
  begin signature
    DASM: DE × ASM → ASM                         a.s. module with declarations
    DSIG: DE × SIG → SIG                         signature with declarations
    DBOOL: DE × BOOL → BOOL                      boolean with declarations
  end signature
  begin equations
    variables X, Y ∈ ASM, u, v, x ∈ SIG, b, c ∈ BOOL, p, q ∈ DE, s ∈ SD, v ∈ VD, n ∈ GN,
              l ∈ L(SD), a ∈ ATSIG, e ∈ EXP, rn ∈ ATREN
    235   DASM(∅, X) = X
    236   DSIG(∅, u) = u
    237   DBOOL(∅, b) = b
    238   DASM(p, DASM(q, X)) = DASM(p + q, X)
    239   DSIG(p, DSIG(q, u)) = DSIG(p + q, u)
    240   DBOOL(p, DBOOL(q, b)) = DBOOL(p + q, b)
    241   DASM(p, X) + DASM(q, Y) = DASM(p + q, X + Y)
    242   DSIG(p, u) + DSIG(q, v) = DSIG(p + q, u + v)
    243   DSIG(p, u) ∩ DSIG(q, v) = DSIG(p + q, u ∩ v)
    244   DBOOL(p, b) & DBOOL(q, c) = DBOOL(p + q, b & c)
    245   DBOOL(p, b) ∨ DBOOL(q, c) = DBOOL(p + q, b ∨ c)
    246   ¬DBOOL(p, b) = DBOOL(p, ¬b)
    247   Σ(DASM(p, X)) = DSIG(p, Σ(X))
    248   T(DSIG(p, u)) = DASM(p, T(u))
    249   eq(DSIG(p, u), DSIG(q, v)) = DBOOL(p + q, eq(u,v))
    250   a ∈ DSIG(p, u) = DBOOL(p, a ∈ u)
    251   a Δ DSIG(p, u) = DSIG(p, a Δ u)
    252   DSIG(p, u) ⊇ DSIG(q, v) = DBOOL(p + q, u ⊇ v)
    253   DSIG(p, u) □ DASM(q, X) = DASM(p + q, u □ X)
    254   (asm: n = DASM(p, X)) = p + (asm: n = X)
    255   (sig: n = DSIG(p, u)) = p + (sig: n = u)
    256   (bool: n = DBOOL(p, b)) = p + (bool: n = b)
    257   rn.DASM(p, X) = DASM(rn.p, rn.X)
    258   rn.DSIG(p, x) = DSIG(rn.p, rn.x)
    259   rn.DBOOL(p, b) = DBOOL(rn.p, rn.b)
    260   DASM((asm: n = Y), imp(n)) = DASM((asm: n = Y), Y)
    261   DSIG((sig: n = u), imp(n)) = DSIG((sig: n = u), u)
    262   i(s) ∈ x = DBOOL(p, F) ⇒ i(i(s)) ∩ x = ∅
    263   i(C:n:s) ∈ x = DBOOL(p, F) ⇒ i(i(C:n:s)) ∩ x = i(i(s)) ∩ x
    264   i(R:n:l) ∈ x = DBOOL(p, F) ⇒ i(i(R:n:l)) ∩ x = /(+^, *(i^oi^)(l), ∅) ∩ x
    265   i(F:n:l→s) ∈ x = DBOOL(p, F) ⇒ i(i(F:n:l→s)) ∩ x = /(+^, *(i^oi^)(s:l), ∅) ∩ x
    266   i(V:n:s) ∈ x = DBOOL(p, F) ⇒ i(i(V:n:s)) ∩ x = i(i(s)) ∩ x
    267   a ∈ x = DBOOL(p, T) ⇒ i(a) ∩ x = i(a)
    268   i(v) ∈ x = DBOOL(p, T) ⇒ [v/e]⦃default(x,s)⦄ = default((i(v) Δ x) + Σ(e), s)
    269   i(v) ∈ x = DBOOL(p, F) ⇒ [v/e]⦃default(x,s)⦄ = default(x,s)
    270   i(v) ∈ x = DBOOL(p, T) ⇒ [v/e]⦃T(x)⦄ = T((i(v) Δ x) + Σ(e))
    271   i(v) ∈ x = DBOOL(p, F) ⇒ [v/e]⦃T(x)⦄ = T(x)
    272   i(v) ∈ x = DBOOL(p, T) ⇒ [v/e]⦃F(x)⦄ = F((i(v) Δ x) + Σ(e))
    273   i(v) ∈ x = DBOOL(p, F) ⇒ [v/e]⦃F(x)⦄ = F(x)
    274   DSIG(p, u) ⊇ Σ(X) ∩ Σ(Y) ⇒ u □ (X + Y) = (u □ X) + (u □ Y)
    275   Σ(rn) ∩ Σ(X) = DSIG(p, invΣ(rn) + v) ⇒ rn.X = X
  end equations
end module ENVIRONMENTS

7. MODELS OF BMASF

In this section, we give some heuristics on how to construct models for BMASF.

BMASF is a large specification and it is almost impossible to provide an elaborate ac-

count of its consistency and a survey of its various models in a reasonable and theoretically

meaningful way. In particular, proving that it is a complete TRS or an attempt towards com-

pletion seems pointless to us. Semantically, one may say that the specification has an initial

algebra and one needs some information about its structure and its homomorphic images.

The first step in the right direction is to view BMASF as an increasing sequence of
specifications S1, S2, ..., Sk. Here, we discuss the following division:

S1 = modules ELEMENTS, ATOMICSIGNATURES and SIGNATURES;
S2 = S1 + module EXPRESSIONS;
S3 = S2 + module FORMULAS;
S4 = S3 + module MODULES;
S5 = S4 + module DECLARATIONS;
S6 = S5 + module ENVIRONMENTS.

7.1 SIGNATURES

The initial algebra I(S1) is unproblematic; it can be described in terms of simple set theoretic
constructions given the point of view that a signature is more or less a set of atomic
signatures. Thus, the general names are to be interpreted by the natural numbers (one can choose
to have character strings, instead), for the various names one takes the appropriate kinds of
pairs and tuples, and for the atomic signatures one takes the finite sets of the declarations of
ingredients.

There is no reason to investigate any of its homomorphic images because of this, and one
would be satisfied if I(S1) is the final algebra of S1 as well (which is probably true and
otherwise can be made true by means of a harmless extension of the axioms).


7.2 EXPRESSIONS

The initial algebra of S2 is an expansion of I(S1). To validate this, one must prove that
every closed identity between terms over the signature of S1 provable from S2 is already
provable from S1 (no confusion) and that every closed term over the signature S2 of a sort
that occurs in the signature of S1 is provably equal to a closed term over the signature of S1
(no junk). One may prove the first fact in a model theoretic way by finding a model of S2
that is an expansion of the initial algebra of S1. The second fact requires inspection of the
equations. By viewing them as rewrite rules one can prove with induction on the structure of
closed expressions (of sort EXP) that S and Σ yield known values.

The model that proves 'no confusion' equates two expressions if they have the same sig-

nature and if in all algebras of that signature equipped with a default value for each sort, they

compute the same polynomial functions with the understanding that minimal incorrectly typed

expressions are replaced by the default corresponding to the target sort of their outermost

function symbol. Notice that the defaults are never declared in the signature of an expression.

These defaults are introduced in an external (meta) way. Thus, although the expressions seem

to be over algebras of their signature, they must be interpreted in algebras with some addi-

tional structure, viz. the defaults for each sort.

As all expressions have a signature and not just a type, their interpretation in the models

will take the form of a pair of a signature and a well-typed term over that signature (this term

may involve the value default and may use just a part of the signature).

Thus, if M is a many-sorted algebra with an interpretation of the constant, relation and

function symbols of our language, and if SIG is the class of signatures, then we are considering

models of the form M × SIG, with as elements pairs of the form (a, σ) (a an element of

a sort, σ a signature). Function application looks as follows:

f((a1, σ1), ..., (an, σn)) = (f(a1, ..., an), σ1 + ... + σn + f: t1 × ... × tn → t)

Notice that for all operations except substitution, the signature will increase.
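The function application rule above can be illustrated with a small sketch. It is only an illustration under stated assumptions: a signature is represented as a frozen set of declaration strings, the sum of signatures is taken to be set union, and the names Tagged and apply_fn are hypothetical (they do not occur in BMASF).

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

# Assumption: a signature is a finite set of declaration strings,
# and "+" on signatures is modelled as set union.
Signature = FrozenSet[str]


@dataclass(frozen=True)
class Tagged:
    """An element of M x SIG: a value paired with a signature."""
    value: object
    sig: Signature


def apply_fn(name: str, arg_sorts: Tuple[str, ...], result_sort: str,
             fn: Callable[..., object], *args: Tagged) -> Tagged:
    """Apply fn to the values; the result signature is the sum of the
    argument signatures plus the declaration of fn itself."""
    decl = f"{name}: {' x '.join(arg_sorts)} -> {result_sort}"
    sig = frozenset({decl}).union(*(a.sig for a in args))
    return Tagged(fn(*(a.value for a in args)), sig)


# Hypothetical example over natural numbers.
zero = Tagged(0, frozenset({"zero: -> NAT"}))
one = apply_fn("succ", ("NAT",), "NAT", lambda n: n + 1, zero)
two = apply_fn("plus", ("NAT", "NAT"), "NAT", lambda a, b: a + b, one, one)
```

In this encoding the signature component can only grow under application, which matches the remark that the signature increases for all operations except substitution.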

7.3 FORMULAS

The initial algebra of S3 is an expansion of that of S2. This is proven in a similar way as in

the case of S2. The model construction will equate two formulas if their signatures coincide

and they have the same meaning in all algebras of that signature, where again we assume that

these algebras have been equipped with default values for each sort and the application of a

relation on a type incorrect parameter list leads to the default formula F for the appropriate

signature.

The meaning of a formula is determined as in ordinary many sorted logic. One may or

may not allow empty sorts, that will just lead to different variations of the theory. All sorts

that have an expression in the language are non-empty, as the default element has to be

identified.


7.4 MODULES

Models for S4 can be constructed as expansions of models for S3 similar to the construction

of models for BMA in [BHK 88]. The objects of type FOR play the role of first order atomic

modules and for each module a signature and a meaning is known. Models of module algebra

that can be considered are I(BMA[fol]), M(fol), Mc(fol), T(fol) as defined in [BHK 88].

For S4 one can establish a normalization theorem that allows one to rewrite every closed

module expression into a form that features only a single occurrence of the export operator.

7.5 DECLARATIONS

Models of S5 are again expansions of selected natural models of S4. Given a model of S4,

one may view a declaration environment as a system of definitions for signatures and modules.

Two DE's can be identified if they define the same variety over the given models of

S4. Stated differently, a DE defines a relation with attribute names in GN and with as domains

for each attribute either the signatures or the modules of the chosen model. The relation

contains exactly those pairs that satisfy the identities of the declarations in the chosen model

of module algebra.

7.6 ENVIRONMENTS

The step towards S6 involves an extension of previous algebras rather than an expansion.

Indeed, the transition to S6 introduces for three sorts (BOOL, SIG, ASM) new expressions

involving declaration environments. It is claimed that models for S6 can be found by extending

the appropriate models for S5 at these sorts with objects consisting of a pair of a declaration

environment and an original expression of that sort.

Then one may consider the closed signature and module expressions involving declara-

tion environments as mappings from the collections determined by their declarations to sig-

natures resp. modules. This construction is entirely general in the sense that it works on a

model of any theory in the place of module algebra.

7.7 AXIOMS

All axioms reflect valid assertions about the above interpretation of the language of BMASF.

In fact BMASF is a small extension of the system BMA[fol] of [BHK 88]. It includes an explicit

algebraic coding of first order logic. The only complication of that coding is the

application of defaults to deal with incorrectly typed expressions. Moreover it includes a

mechanism for the introduction of declarations and the expansion of declared names.

The axioms are to be judged on several criteria:

They must express facts about the operators that hold in the intended interpretation. They

must allow all transformations that are involved in normalization procedures that allow

rewriting expressions to the various normal forms that one may have in mind. Now it is


hardly possible to survey the normal forms and normalization procedures that are relevant.

Furthermore, as different tools may use different normalization mechanisms it is inappropri-

ate to design the axioms on their ability to allow just one normalization or just a few key nor-

malization mechanisms. The normalizations that are possible using the transformations de-

scribed in our axioms include:

(1) the disjunctive and conjunctive normal form of propositional calculus;

(2) the prenex normal form of predicate logic;

(3) the normal form of module algebra for closed expressions without unexpanded im-

ports;

(4) collecting all declarations in a declaration environment (where declared modules have

no internal declarations);

(5) expanding all imports (for which a declaration is available) in a target expression;

(6) the flattening of module expressions to a normal form in the sense of module algebra;

this may be done just partially because of inexpandable imports, otherwise we are in case (3);

(7) the decomposition of complex predicate calculus formulae into a sum of atomic module

expressions using (P and Q) = (P) + (Q); this is exactly what happens if one writes a

long algebraic specification, though it may be inappropriate to call it normalization.

ACKNOWLEDGEMENT

The authors thank Hans Mulder (University of Nijmegen), Martin Kooij (PTT-RNL) and

Max Michel (CNET) for many helpful comments.

REFERENCES

[BB 89] J.C.M. BAETEN & J.A. BERGSTRA, Design of a specification language by abstract

syntax engineering, report CS-R8934, Centre for Math. & Comp. Sci., Amsterdam 1989.

[BHK 88] J.A. BERGSTRA, J. HEERING & P. KLINT, Module algebra, report CS-R8844,

Centre for Math. & Comp. Sci., Amsterdam 1988 (revised version of report CS-R8617). To

appear in JACM.

[BHK 89] J.A. BERGSTRA, J. HEERING & P. KLINT, Algebraic specification, ACM Press

Frontier Series, Addison Wesley 1989.

[BR 89] J.A. BERGSTRA & G.R. RENARDEL DE LAVALETTE, De plaats van formele

specificaties in de software-technologie, Informatie 31 (6), 1989, pp. 480-494.

[BI 87] R.S. BIRD, An introduction to the theory of lists, in: Logic of Programming and Calculi

of discrete design (ed. M. Broy), Springer 1987, pp. 5-42.

[FJKR 87] L.M.G. FEIJS, H.B.M. JONKERS, C.P.J. KOYMANS & G.R. RENARDEL DE

LAVALETTE, Formal definition of the design language COLD-K, METEOR/t7/PRLE/7,

1987.


[J 89] S.M.M. JOOSTEN, The use of functional programming in software development,

Ph.D. thesis, Universiteit Twente, 1989.

[MV 88] S. MAUW & G.J. VELTINK, A process specification formalism, report P8814, Pro-

gramming Research Group, University of Amsterdam 1988. To appear in Fund. Inf.

[M 86] L.G.L.T. MEERTENS, Algorithmics - towards programming as a mathematical activ-

ity, in: Math. & Comp. Sci. (eds. J.W. de Bakker e.a.), CWI Monograph 1, North Holland

1986, pp. 289-334.

[SPECS 89] SPECS Consortium, Definition of MR and CRL version 2.0, Deliverable

D.WP5.4, RACE project SPECS, 1989.

From an ERAE Requirements Specification

to a PLUSS Algebraic Specification:

A Case Study

A. Mauboussin 1 H. Perdrix

Laboratoires de Marcoussis, CR-CGE Route de Nozay, F-91460 Marcoussis, France

M. Bidoit M.-C. Gaudel

LRI, CNRS UA "Al Khowarizmi", Bâtiment 490, Université de Paris-Sud

F-91405 Orsay Cedex, France

J. Hagelstein

Philips Research Laboratory Brussels Avenue van Becelaere 2 Box 8

B-1170 Brussels, Belgium

Abstract

Formal specification languages and methods for refining specifications into programs have, up

to now, received more attention than methods for obtaining the initial formal specification.

This situation is corrected in the ESPRIT project METEOR, which distinguishes the two

activities of requirements engineering (RE) - obtaining the right specification - and design

engineering (DE) - using that specification properly. Because of their difference of nature,

these two activities gain from using different languages : RE languages should be closer to

natural language constructs, whereas DE languages should easily describe computer artifacts.

In particular, the RE language ERAE is based on temporal logic, whereas the DE language

PLUSS uses algebraic specifications, with emphasis on modularity and structuring concepts.

This paper investigates the transition between these two formalisms, which takes place when

the requirements specification is found satisfactory. As an example, we use the specification

of a transit node in a telephonic network.

1 Consultant from LIF, Université Pierre et Marie Curie, 4 place Jussieu, F-75252 Paris CEDEX 05.


1. INTRODUCTION

ERAE and PLUSS are two languages which were developed as part of the METEOR project. The former

is intended for the initial phase of problem clarification, whereas the latter addresses the expression of

specifications and their refinement. The work reported here uses these two languages in combination

on a common case study, thereby simulating a large part of normal system development, including the

transition between the two main phases.

ERAE is a method for requirements engineering, i.e. for the identification of those aspects of a computer

system which are relevant to the user. The method is based on the following main assumptions:

• requirements should not first focus on the computer system, but rather on a more general system

formed by the computer and its environment;

• requirements should be unambiguous;

• the validation of requirements may benefit from powerful analysis techniques, such as the identifi-

cation of consequences of a set of requirements.

The ERAE language supports these assumptions by providing concepts suited for modelling real-world

situations, rather than just computer system behaviours. It is also a formally defined language, hence

unambiguous, and supports rigorous deduction rules.

PLUSS (Proposition of a Language Usable for Structured Specifications) is a language for the development

of algebraic specifications, with emphasis on modularity and structuring concepts. PLUSS is the result

of a broad range of experiments in writing large algebraic specifications: part of a telephone switching

system, a subset of the Unix file system [BGM 87], etc. The main characteristics of the PLUSS language

are the following:

The PLUSS language provides a way of structuring algebraic specifications, i.e. any kind of speci-

fications for which a formal semantics can be given by means of a signature and a class of algebras.

To some extent, PLUSS is a meta-specification language, since the structuring features are not (or

little) dependent on the kind of algebras under consideration.

The PLUSS specification language follows the loose semantics approach, which assigns a whole

class of models to a given specification. This reflects the fact that a specification language aims at

describing classes of possible implementations.

The formal semantics of the PLUSS language is defined in [Bid 89]. Another important feature of PLUSS

is the distinction between completed specification components (those that are ready to be implemented)

and specification components under development.

The paper is organised as follows. Section 2 provides the informal description of the Transit Node case

study. Sections 3 and 4 present an overview of the ERAE language and its application to the specification

of the Transit Node. Sections 5 to 7 provide an overview of the PLUSS language, discuss the transition

between ERAE and PLUSS, and present the PLUSS specification of the Transit Node.

2. THE TRANSIT NODE CASE STUDY

This case study was defined in the RACE project 2039 (SPECS: Specification Environment for Commu-

nication Software). It consists of a simple transit node where messages arrive, are routed, and leave the

node.

The informal specification reads as follows :

"The system to be specified consists of a transit node with:

• 1 Control Port-In

• 1 Control Port-Out

• N Data Ports-In

• N Data Ports-Out

• M Routes Through

(The limits of N and M are not specified.)

Each port is serialized. All ports are concurrent to all others. The ports should be specified as separate,

concurrent entities. Messages arrive from the environment only when a Port-In is able to treat them.

The node is "fair". All messages are equally likely to be treated, when a selection must be made, and all

messages will eventually transit the node, or be placed in the collection of faulty messages.

Initial State: 1 Control Port-In, 1 Control Port-Out.

The Control Port-In accepts and treats the following three messages:

• Add-Data-Port-In-&-Out(n)

gives the node knowledge of a new port-in(n) and a new port-out(n). The node commences to accept

and treat messages sent to the port-in, as indicated below on Data Port-In.

• Add-Route((m),(n(i),n(j)...))

gives the node knowledge of a route associating route m with Data Port-Out(n(i),n(j),...).

• Send-Faults

routes all saved faulty messages, if any, to Control Port-Out. The order in which the faulty messages

are transmitted is not specified.


A Data Port-In accepts and treats only messages of the type :

• Route(m).Data

The Port-in routes the message, unchanged, to any one (non-determinate) of the Data Ports-Out

associated with route m. (Note that a Data Port-Out is serialized - the message has to be buffered

until the Data Port-Out can process it). The message becomes a faulty message if its transit time

through the node (from initial receipt by a Data Port-In to transmission by a Data Port-Out) is

greater than a constant time T.

Data Ports-Out and Control Port-Out accept messages of any type and will transmit the message out of

the node. Messages may leave the node in any order.

All faulty messages are saved until a Send-Faults command message causes them to be routed to Control

Port-Out. Faulty messages are messages on the Control Port-In that are not one of the three commands

listed, messages on a Data Port-in that indicate an unknown route, or messages whose transit time through

the node is greater than T. Messages that exceed the transit time of T become faulty as soon as the time

T is exceeded. It is permissible for a faulty message to not be routed to Control Port-Out (because, for

example, it has just become faulty, but has not yet been placed in a faulty message collection), but all

faulty messages must eventually be sent to Control Port-Out with a succession of Send-Faults commands.

It may be assumed that a source of time (time-of-day or a signal each time interval) is available in the

environment and need not be modeled with the specification."

3. AN OVERVIEW OF THE ERAE REQUIREMENTS LANGUAGE

We give here a very brief introduction to the ERAE language. A more detailed informal presentation can

be found in [DHR 88] and the formal definition of the language is given in [Hag 89].

ERAE is a variant of temporal logic [RU 71]. It describes a system as a sequence of states, indexed by

an increasing time value. A state is an algebra, i.e. a (fixed) collection of sorts as well as predicates

and partial functions defined on them. The contents of sorts and the value of predicates and functions

may vary from state to state, but, in any state, a predicate only holds and a function is only defined if

their arguments are present in some sort. The value of a defined function must belong to some sort in

the state.

An ERAE specification contains two parts : declarations, which identify the sorts, functions, and predi-

cates forming the states, and statements, which further constrain the possible sequences of states.

3.1. Declarations

Declarations may be expressed graphically in ERAE, as done in Figure 1 for the Transit Node case study.

(See next section.) In such a graph, the nodes denote sorts and the lines denote predicates and functions.


The declared sorts fall in three categories : event sorts are denoted by ovals (e.g. Data-msgs) and model

instantaneous phenomena; entity sorts are denoted by rectangles (e.g. Data-ports-in) and model objects

with some life time; value sorts are denoted by a plain name at the end of a line (e.g. Data) and are meant

to denote abstract, time-independent objects, like integers.

The three kinds of sorts have different properties. For event sorts, there is a predefined predicate 'occurs'

which holds in exactly one state for each event. For entities, the predicate 'exists' holds while its

argument is in some sort. Value sorts have constant contents and most of their functions and predicates

are constant. The most common of these sorts are predefined, in particular, '[1..M]' which is the subsort

of Natural including the values between 1 and M.

ERAE also distinguishes between usual sorts and those with just one object inside, called

'individuals'. This paper only uses individual entities, denoted by a rectangle with a double outline

(e.g. Control-port-in). A constant, with the same name as the sort, is also available to denote the

object.

The declared predicates correspond to the lines in Figure 1, and their type is derived from the connected

sorts. For instance, the predicate 'port' is declared of type 'Data-msgs × Data-ports-in'. The arrowed

lines relate entity or event sorts and are called 'relations'. The other lines relate an event or entity sort

to a value sort (in that order) and are called 'attributes'. If a line is connected to more than one sort at

an end (e.g. 'port' and 'sending'), it applies to the union of these sorts.

Figure 1 conveys additional information in the form of dashed or plain lines, and numbers at the end

of lines. Dashed lines are used to denote predicates/functions which may vary freely from state to

state. If a plain line is used instead, the predicate/function may not change value during all states where

its arguments are present. Dashed and plain lines may also be used for sorts: dashed boxes have a

state-dependent membership, and plain boxes have an invariant one.

The numbers at the end of the associations are cardinality constraints. A symbol 'i' at the end of a line

means that, in every state, each object at that end participates in exactly i occurrences of the association.

When a range of values needs to be specified, the notation 'i:j' is used. It means that the objects at that

end participate in at least i and at most j occurrences. The absence of information is equivalent to

'0:∞'. A line with '1' or '0:1' at the origin denotes both a predicate and a function with the same name.
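A cardinality constraint of this kind can be checked state by state. The following is a minimal sketch, assuming that an association in one state is given as a set of (origin, target) pairs; the function name check_cardinality and this encoding are hypothetical and not part of ERAE.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Set, Tuple


def check_cardinality(pairs: Set[Tuple[Hashable, Hashable]],
                      objects: Iterable[Hashable],
                      end: int,
                      low: int = 0,
                      high: float = math.inf) -> bool:
    """Return True if every object at the given end of the association
    (0 = origin, 1 = target) participates in at least `low` and at most
    `high` pairs. The defaults correspond to the absence of information."""
    counts = Counter(pair[end] for pair in pairs)
    return all(low <= counts[obj] <= high for obj in objects)


# Hypothetical state: two data messages attached to the same input port.
port = {("dm1", "in1"), ("dm2", "in1")}
```

For instance, check_cardinality(port, ["dm1", "dm2"], end=0, low=1, high=1) expresses that each data message has exactly one port in that state.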

3.2. Statements

The statements are expressed in multi-sorted temporal logic. The atomic statements are predicate applications,

equality between terms, membership of a term in a sort (using the infix operator '∈') and definedness

of a term (using the postfix operator '!', the only non-strict operator).

Other statements may be formed using the usual constructs of first-order logic and the following temporal

logic constructs (there are some more in ERAE, but not used here) :


○ φ    φ holds in the next state

● φ    φ holds in the previous state

◇ φ    φ holds either in the current state or in some future one

◆ φ    φ holds in the current state or has held in some previous one

□ φ    φ holds in the current state and in all future ones

■ φ    φ holds in the current state and has held in all previous ones

A statement is verified by a sequence of states if it is true in each of these states. The truth of a statement

in a state is defined as usual for first order logic statements and as stated above for the temporal ones.

It should be noted that the quantification of a variable ranges over all objects that will ever belong to its

sort in any state of the sequence.
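The informal reading of the six temporal operators can be made concrete with a small evaluator over a finite sequence of states. This is only a sketch under simplifying assumptions: the sequence is finite (so "next" is false in the last state and "previous" is false in the first one), a state is reduced to a set of atomic facts, and all function names are hypothetical.

```python
from typing import Callable, List, Set

State = Set[str]                            # assumption: a state is a set of atomic facts
Prop = Callable[[List[State], int], bool]   # a statement evaluated at index i of a trace


def atom(fact: str) -> Prop:
    return lambda tr, i: fact in tr[i]


def next_(p: Prop) -> Prop:                 # holds in the next state
    return lambda tr, i: i + 1 < len(tr) and p(tr, i + 1)


def prev(p: Prop) -> Prop:                  # holds in the previous state
    return lambda tr, i: i >= 1 and p(tr, i - 1)


def eventually(p: Prop) -> Prop:            # now or in some future state
    return lambda tr, i: any(p(tr, j) for j in range(i, len(tr)))


def once(p: Prop) -> Prop:                  # now or in some previous state
    return lambda tr, i: any(p(tr, j) for j in range(0, i + 1))


def always(p: Prop) -> Prop:                # now and in all future states
    return lambda tr, i: all(p(tr, j) for j in range(i, len(tr)))


def historically(p: Prop) -> Prop:          # now and in all previous states
    return lambda tr, i: all(p(tr, j) for j in range(0, i + 1))


def verified(p: Prop, trace: List[State]) -> bool:
    """A statement is verified by a sequence if it is true in each state."""
    return all(p(trace, i) for i in range(len(trace)))
```

A statement is then checked with verified(p, trace), mirroring the rule that it must be true in each state of the sequence.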

The versatility of this logic is amply illustrated in the next section. Here are just a few examples (free

variables are universally quantified) :

"A data arrival da must be followed later by some corresponding data sending ds."

occurs(da) ⇒ ∃ ds : corresponds(da,ds) ∧ ◇ occurs(ds) ;

"When an add port message apm occurs, there is no input data port dpi with the corresponding

number."

occurs(apm) ⇒ ¬∃ dpi : ● in-port-nr(dpi)=port-nr(apm) ;

Notice that 'occurs' holds just after the event occurred, and that '●' is needed to refer to the

state just before.

ERAE further extends temporal logic for the expression of real-time properties. This is done by adding

a limit of duration to the temporal operator, as in the examples below:

◇=3' φ     means that φ will be true in some future state, exactly 3 minutes later than the current one

◆>5'' φ    means that φ was true in some past state, more than five seconds from the current one

■<3h φ     means that φ has been true in all past states, up to 3 hours from the current one

4. THE ERAE SPECIFICATION OF THE TRANSIT NODE

Although the methodology followed during the requirements analysis is not the theme of this paper, we

will briefly comment on the approach we followed. The informal specification of the Transit Node is a

rather typical starting point, as it exhibits usual deficiencies : lack of structure, ambiguity, incompleteness.

Therefore, a key component of the ERAE methodology is a set of guidelines for approaching an obscure


problem. The overall approach used in this paper consists in first investigating the declarations, and then

systematically providing the statements corresponding to :

• the initial state,

• the sufficient and necessary preconditions for any change of interest, namely

- occurrence of events,

- appearance and disappearance of entities,

- change in relations and attributes.

More detailed guidelines are available for each of these steps. It should be noted that this approach is

just one out of several alternatives in the ERAE methodology.

This approach exhibited numerous unclarities in the informal description. We will only mention two

examples :

• several possible faulty messages are not mentioned : adding of a route referring to unknown ports,

adding of empty routes, adding of existing ports, etc. It is unclear whether these may not occur,

or are handled (but then, what is their effect ?).

• even the normal functioning of the node is not quite clear. An incoming data message should

be routed to one of the output ports associated with the specified route, but associated at what

moment ? When the message arrives ? When it leaves ? At some point in between ?

For the first question, we assumed that unforeseen situations are guaranteed never to occur. The reader

is kindly invited to find out, in the ERAE statements, our decision about the second question.

4.1. Declarations

The ERAE declarations for the Transit Node case are given in Figure 1. They are complemented by

textual declarations for value sorts, variables, and auxiliary functions or predicates.

The value sort 'Data' is the only one which is not predefined. It is simply specified as

value sort Data ;

This sort will be disjoint from all other and have no operations, except equality, membership of a term

in the sort, and definedness of a term.

In this application, a few value constants are needed (Time and Natural are predefined value sorts) :


Figure 1. Graphical declaration for the Transit Node

func T: → Time ;

func M,N: → Natural ;

The following variable declarations have the whole specification as scope.

var arm: Add-route-msgs ;

var apm: Add-port-msgs ;

var sfm: Send-faults-msgs ;

var wm: Wrong-msgs ;

var dm: Data-msgs ;

var rnr: [1:M] ;

var dms: Data-msg-sendings ;

var fms: Faulty-msg-sendings ;

var r: Routes ;

var dpi: Data-ports-in ;

var dpo: Data-ports-out ;

var pnr: [1:N] ;

The auxiliary function 'arrival' is defined to be the inverse of 'sending':


func arrival: Data-msg-sendings ∪ Faulty-msg-sendings

→ Wrong-msgs ∪ Data-msgs ;

var dfm : Data-msg-sendings ∪ Faulty-msg-sendings ;

var wdm : Wrong-msgs ∪ Data-msgs ;

arrival(dfm)=wdm ⇔ sending(wdm)=dfm ;

4.2. Initial state

The predicate 'initially' holds at any time before the first event.

pred initially ;

var ev: Event ;

initially ⇔ ¬∃ ev : ◆ occurs(ev) ;

Such an initial state must have occurred somewhere in the past.

◆ initially ;

There is initially no port, no route, no faulty-msg. (The sort 'Event' is

predefined as the union of all event sorts).

initially ⇒ ¬∃ dpi : dpi ∈ Data-ports-in

∧ ¬∃ dpo : dpo ∈ Data-ports-out

∧ ¬∃ r : r ∈ Routes

∧ ¬∃ wm,dm : faulty(wm) ∨ faulty(dm) ;

4.3. Preconditions on events

The necessary and sufficient preconditions usually take one of the two following forms :

phenomenon ⇒ <past operator> necessary precondition

sufficient precondition ⇒ <future operator> phenomenon

The <future operator> will often be absent, thanks to the fact that the occurrence of an event takes place,

by convention, between the state where 'occurs' is true and the previous state. One of these conditions

or both may be lacking, for events which are free to occur.

Add-route-msgs:

Necessary condition: all output ports mentioned in an add-route-message are allocated.

occurs(arm) ∧ out-port-nr(arm,pnr) ⇒ ∃ dpo : ● nr(dpo)=pnr ;

No sufficient condition.


Send-fault-msgs :

Unconstrained occurrence.

Add-port-msgs :

Necessary condition : when an add-port-message occurs, its port-nr is not yet allocated.

occurs(apm) ⇒ ● ¬∃ dpi : nr(dpi)=port-nr(apm) ;

occurs(apm) ⇒ ● ¬∃ dpo : nr(dpo)=port-nr(apm) ;


Wrong-msgs :

Unconstrained occurrence.

The relation "sending" is not constrained here because the wrong-msg is the earliest of the

two events participating in such a relation. The time-dependent attribute 'faulty' is handled

later as a state change.

Data-msgs :

Necessary condition: messages arrive only when an input port is able to treat them.

occurs(dm) ⇒ ● exists(port(dm)) ;


The value of 'route-nr' is not required to correspond to an existing route. 'Faulty' is handled

later.

Data-msg-sendings :

Necessary condition : a Data-msg-sendings only occurs if the corresponding arrival occurred

before and is not currently faulty, and if its output port exists.

occurs(dms) ⇒ ¬ ● faulty(arrival(dms))

∧ ◆ occurs(arrival(dms))

∧ ● exists(port(dms)) ;

Sufficient condition: Data-msgs must become faulty or be sent some time, as Data-msg-

sendings.


occurs(dm) ⇒ ◇ faulty(dm)

∨ ◇ (occurs(sending(dm)) ∧ sending(dm) ∈ Data-msg-sendings) ;

Relations and attributes:

The arrival of a Data-msg-sending may only be a Data-msg.

arrival(dms) ∈ Data-msgs ;

A Data-msg-sendings occurs on one of the ports associated with the route given in the

corresponding Data-msgs (when it arrived !!).

occurs(arrival(dms)) ⇒ ● ∃ r : nr(r)=route-nr(arrival(dms)) ∧ out(r,port(dms)) ;

The route-nr and data attributes have the value of the corresponding input message.

route-nr(dms)=route-nr(arrival(dms)) ∧ data(dms)=data(arrival(dms)) ;

Faulty-msg-sendings :

Necessary condition : a Faulty-msg-sendings occurs only if the corresponding message was

faulty when a previous Send-faults-msgs occurred.

occurs(fms) ⇒ ∃ sfm : ◆>0 [occurs(sfm) ∧ ● faulty(arrival(fms))] ;

Sufficient condition: provided send-faults occur regularly, faulty messages eventually get

sent. (This is a fairness assumption.)

var wdm : Wrong-msgs ∪ Data-msgs ;

faulty(wdm) ∧ (□ ◇ ∃ sfm : occurs(sfm))

⇒ ◇ (occurs(sending(wdm)) ∧ sending(wdm) ∈ Faulty-msg-sendings) ;

Relations and attributes:

The route-nr and data attributes have the value of the corresponding message (undefined if

it was in Wrong-msgs).

arrival(fms)=dm ⇒ route-nr(fms)=route-nr(dm)

∧ data(fms)=data(dm) ;

arrival(fms)=wm ⇒ ¬ route-nr(fms) !

∧ ¬ data(fms) ! ;

Remember that the post-fix operator ' ! ' asserts the definedness of a term.


The characterisation of events' occurrences in terms of prior phenomena does not work when the con-

straints are on simultaneous phenomena. In that case, the situation is symmetric, and there is no reason

for attaching the constraint to one or the other event. Therefore the simultaneity conditions are considered

separately. In this case study, we have the following statement:

There is at most one message at a time on a given port. (The informal description reads

"each port is serialised".)

var msg1,msg2 : Add-port-msg ∪ Add-route-msg ∪ Send-faults-msg

∪ Wrong-msg ∪ Data-msg ∪ Data-msg-sendings

∪ Faulty-msg-sendings ;

¬∃ msg1,msg2 : msg1≠msg2 ∧ occurs(msg1) ∧ occurs(msg2)

∧ ● port(msg1)=port(msg2) ;
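The serialisation statement above can be read as a check on a single state. The sketch below is a simplification under stated assumptions: the messages occurring in the current state are given as a mapping from message to port, which folds together the predicates 'occurs' and 'port' and ignores that 'port' is evaluated in the previous state; the name is_serialised is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable


def is_serialised(occurring: Dict[Hashable, Hashable]) -> bool:
    """`occurring` maps each message that occurs in the current state to
    the port it is attached to (hypothetical encoding).
    Return True if no port carries more than one occurring message."""
    per_port = Counter(occurring.values())
    return all(count <= 1 for count in per_port.values())
```

A trace satisfies the constraint if is_serialised holds for the occurring messages of every state.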

These constraints include those on the occurrence time of events, on their invariant attributes and on their

relations. The latter are described in connection with the event which is the latest of the ones participating

in the relation. Variable properties of events are considered among the state changes. We characterise

those aspects in terms of properties preceding the occurrence of the event.

4.4. Preconditions on state changes

These are constraints on changes in predicates/functions other than 'occurs'. The necessary and sufficient

preconditions take the same form as in the previous section.

In this case study, the predicates '∈' and 'exists' have the same preconditions for all entities, because all

sorts are disjoint. (We have not even explained how you could have overlapping sorts in ERAE.)

Routes:

Nec. and suff. condition for birth: a non-existing route is created and its number is set by

an add-route-msg with the proper nr.

exists(r) ∧ ● ¬ exists(r)

⇒ ∃ arm : [occurs(arm) ∧ route-nr(arm)=nr(r)] ;

occurs(arm) ∧ ● ¬∃ r : nr(r)=route-nr(arm)

⇒ ∃ r : nr(r)=route-nr(arm) ;

Nec. and suff. condition for death: routes never disappear.

exists(r) ⇒ □ exists(r) ;


Data-ports-in:

Nec. and suff. condition for birth: a dpi with the right number is created by the corresponding add-port-msg.

exists(dpi) ∧ ● ¬ exists(dpi)

⇒ ∃ apm : occurs(apm) ∧ nr(dpi)=port-nr(apm) ;

occurs(apm) ⇒ ∃ dpi : ● ¬ exists(dpi) ∧ nr(dpi)=port-nr(apm) ;

Nec. and suff. condition for death: ports never disappear.

exists(dpi) ⇒ □ exists(dpi) ;

Data-ports-out:

Necessary and sufficient condition: an add-port-msg creates the corresponding dpo and dpi

with the right number.

exists(dpo) ∧ ● ¬ exists(dpo)

⇒ ∃ apm : occurs(apm) ∧ nr(dpo)=port-nr(apm) ;

occurs(apm) ⇒ ∃ dpo : ● ¬ exists(dpo) ∧ nr(dpo)=port-nr(apm) ;

Nec. and suff. condition for death: ports never disappear.

exists(dpo) ⇒ □ exists(dpo) ;

faulty(wm) :

Nec. and suff. condition for setting: wrong messages are set faulty when they occur.

(faulty(wm) ∧ ¬ ● faulty(wm)) ⇔ occurs(wm) ;

Nec. and suff. condition for resetting : they are reset when sent out.

(¬ faulty(wm) ∧ ● faulty(wm)) ⇔ occurs(sending(wm)) ;

faulty(dm) :

Nec. and suff. condition for setting: data-msgs become faulty under two circumstances : too

long transmission delay (a data message occurred a time T ago, and its output did not occur

in the meantime) and unknown route.


faulty(dm) ∧ ¬ ● faulty(dm)

⇔ ((◆=T occurs(dm)) ∧ ■≤T ¬ occurs(sending(dm)))

∨ (occurs(dm) ∧ ● ¬∃ r : nr(r)=route-nr(dm)) ;
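The two circumstances under which a data message becomes faulty can be paraphrased operationally. The following is a rough sketch and not the ERAE semantics: it assumes numeric time stamps, represents a message by its arrival time, optional sending time and route number, and takes known_routes to be the route numbers that existed just before the arrival. All names are hypothetical.

```python
from typing import Optional, Set


def becomes_faulty(now: float, arrived_at: float, sent_at: Optional[float],
                   route_nr: int, known_routes: Set[int], T: float) -> bool:
    """Return True if the message turns faulty exactly at time `now`."""
    # Too long transit: the message arrived exactly T ago and has not
    # been sent out at any time up to now.
    not_sent_yet = sent_at is None or sent_at > now
    timed_out = (now - arrived_at == T) and not_sent_yet
    # Unknown route: the message arrives now and no route with its
    # number existed just before.
    unknown_route = (now == arrived_at) and route_nr not in known_routes
    return timed_out or unknown_route
```

Using exact equality on time stamps mirrors the "exactly T ago" reading of the bounded past operator; an implementation working with sampled clocks would need a tolerance instead.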

Nec. and suff. condition for resetting: the 'faulty' attribute of data-msgs is reset when they

are sent out.

(● faulty(dm)) ∧ ¬ faulty(dm) ⇒ occurs(sending(dm)) ;

occurs(sending(dm)) ∧ ● faulty(dm) ⇒ ¬ faulty(dm) ;

out(r,dpo) :

Nec. and suff. condition for setting: out(r,dpo) is set, if not yet true, when an add-route-msgs

requires it.

out(r,dpo) ∧ ¬ ● out(r,dpo)

⇒ ∃ arm : occurs(arm) ∧ route-nr(arm)=nr(r)

∧ out-port-nr(arm,nr(dpo)) ;

For expressing the sufficient condition, we will use the auxiliary predicate ' ou t l ' , which

holds between a route number and a port number if and only if 'out ' holds between the

corresponding route and the corresponding port.

pred outl : [I:M] x [I:N] ;

outl(rnr,pnr) ~ 3 r,dpo: [out(r, dpo) A nr(r)=rnr A nr(dpo)---pnr] ;

occurs(arm) ^ • (out-port-nr(arm,prtr)

A route-nr(arm)--mr A -~ outl(rnr,pnr)) =~ outl(rur,pnr) ;

Nec. and surf. condition for resetting : out(r,dpo) is reset by an add-route-msg not requiring

that connection.

-~out(r,dpo) ^ • out(r,dpo)

3 arm: occurs(arm) ^

• (route-nr(arm)=nr(r) A -~ out-port-nr(arm,nr(dpo))) ;

occurs(arm) A • (route-nr(arm)=nr(r)

A -~ out-port-nr(arm,nr(dpo)) A out(r,dpo)) =~ -~ out(r,dpo) ;

5. AN OVERVIEW OF THE PLUSS SPECIFICATION LANGUAGE

In this section, we only briefly recall the main features of the PLUSS algebraic specification language that

are used in the Transit Node case study (see e.g. [Gau 85, BGM 87, Bid 89] for more comprehensive

descriptions of PLUSS).

5.1. The spec and use constructs

Standard specification components are introduced by the keyword spec. A spec is characterized by the

other used specs and introduces sorts together with their generators, operations and predicates,

preconditions and axioms to describe the required properties for the operations and predicates. Thus, the

use construct is a means for incrementally adding new features (sorts, operations, predicates) to already

existing specifications. It is used to put specifications together and to develop specifications step by step. In

the Transit Node case study, the TRANSIT-NODE specification is a typical example of a spec built on top

of another one, the TRANSIT-NODE-KERNEL spec (which is itself based on other specs, see

Section 7):

spec TRANSIT-NODE

use TRANSIT-NODE-KERNEL

sort Tn

generators

init : -> Tn

cm-arrival : Tn x ControlMsg -> Tn

dm-arrival : Tn x PortNb x DataMsg -> Tn

operations

effect : Tn -> Tnk

preconditions

dm-arrival(tn, pn, dmsg) is defined when pn belongs to data-ports(effect(tn))

axioms

end TRANSIT-NODE

By convention, using a spec must not change the class of its models. This fundamental property will be

referred to as hierarchical constraints, and corresponds to the fact that, in order to be able to write modular

specifications, it is necessary to abstract from the various possible implementations of the used

specifications (hence from their models). For instance, using the TRANSIT-NODE-KERNEL spec in the

TRANSIT-NODE spec should neither introduce new values into a sort that is defined elsewhere (e.g.

the sorts Tnk or PortNb) nor identify existing values. Thus, when working with specs, new values can be

specified only when new sorts are declared. For each new sort some generators must be given. All the

values of the corresponding set are denotable as some composition of the generators. Formally speaking, it

means that the models associated with a spec are finitely generated models with respect to the generators of

the new sorts.
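
To illustrate what "finitely generated" means for the sort Tn, the sketch below encodes its three generators as Python data classes, so that every value of the sort is by construction a finite composition of generators. This is only an illustration under our own assumptions: messages are represented as plain strings, port numbers as integers, and the helper arrivals is hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Init:
    """Generator init : -> Tn."""


@dataclass(frozen=True)
class CmArrival:
    """Generator cm-arrival : Tn x ControlMsg -> Tn."""
    tn: "Tn"
    cmsg: str  # assumption: a control message is represented by a string


@dataclass(frozen=True)
class DmArrival:
    """Generator dm-arrival : Tn x PortNb x DataMsg -> Tn."""
    tn: "Tn"
    pn: int    # assumption: port numbers are integers
    dmsg: str  # assumption: a data message is represented by a string


Tn = Union[Init, CmArrival, DmArrival]


def arrivals(tn):
    """Count the generator applications stacked on top of init."""
    count = 0
    while not isinstance(tn, Init):
        tn = tn.tn
        count += 1
    return count
```

Since no other constructor is available, a value such as DmArrival(CmArrival(Init(), "add-port"), 1, "d") is the only kind of value the sort can contain.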

By default, the use specification-building primitive is transparent: for instance, all the sorts and operations

(e.g. Tnk, PortNb, belongs to, etc.) that are visible from TRANSIT-NODE-KERNEL are visible from

TRANSIT-NODE. Visibility can explicitly be controlled by means of the export and forget primitives of

PLUSS when necessary.

5.2. Parameterization and instantiation

Parameterization allows the use of generic specifications, hence saves writing as many specifications as

instances of a given specification are required. For instance, writing a parameterized specification of

SEQUENCE would save writing various specifications such as SEQ-OF-INT, SEQ-OF-CHAR, etc.

Parameterization involves three different entities: a parameterized specification (introduced by the keyword

proc), some formal parameter specifications (introduced by the keyword param), and an instantiation

mechanism (as construct). Parameterized specifications share most syntactical aspects with ordinary

specifications, e.g. they can use other specifications, they must include a declaration of the generators of

the new sorts introduced, etc.

In the Transit Node case study, we make an extensive use of the SET-OF parameterized specification. This

specification is partially displayed below (see the Appendix for the full text) and will be instantiated in

various ways to obtain specifications such as SET-OF-PORT, SET-OF-ROUTE, etc. (see the Appendix).

proc SET-OF (ITEM)

sort Set

...

end SET-OF

The SET-OF specification is parameterized by the ITEM formal parameter specification. The aim of a

formal parameter specification is to describe the minimal requirements that actual parameter specifications

must fulfill in order to be considered as appropriate parameters for instantiation. In our case, there are no

specific requirements. Thus, the formal parameter specification ITEM just introduces a new sort Item:

param ITEM

sort Item

end ITEM

It should be noted that formal parameter specifications, described separately from the parameterized

specification, are not linked to a specific parameterized specification and can therefore be reused in other

ones.

A specification such as SET-OF-ROUTE can then be obtained by simply instantiating the SET-OF proc

using the ROUTE spec as an actual parameter for the ITEM param. Basically, instantiation is no more

than the substitution of the actual parameter specifications for the formal parameter ones in the parameterized

specification. This "substitution" is specified by a parameter passing mechanism called "fitting morphism".

This fitting morphism explains how the sorts and operations of the formal parameter specification

(here, the sort Item) correspond to the sorts and operations of the actual parameter specification (here, the

sort RouteNb). Most of the time no ambiguity arises and the fitting morphism is left implicit, as is the case

here. Once parameter passing is achieved, it is often convenient to rename sorts and/or operations following

one's own conventions or wishes. For instance the sort Set will be renamed into the sort Set-of-RouteNb:

spec SET-OF-ROUTE as SET-OF (ROUTE)

renaming Set into Set-of-RouteNb

end SET-OF-ROUTE

It should be noted that renaming is a general feature of the PLUSS specification language and its use is not

limited to specifications obtained as an instance of a parameterized specification. Renaming can be

combined with any of the PLUSS constructs.
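
For readers more familiar with programming languages, instantiation followed by renaming is loosely comparable to instantiating a generic type and giving the result a new name. The sketch below uses Python generics; it is an analogy only, and the class SetOf with its methods plus, belongs and is_empty, as well as the integer representation of RouteNb, are our own assumptions rather than the PLUSS SET-OF specification.

```python
from typing import Generic, NewType, TypeVar

Item = TypeVar("Item")  # plays the role of the formal parameter sort Item


class SetOf(Generic[Item]):
    """Loose analogue of the parameterized specification SET-OF(ITEM)."""

    def __init__(self, items=()):
        self._items = frozenset(items)

    def plus(self, x):
        """Return a new set containing x as well."""
        return SetOf(self._items | {x})

    def belongs(self, x):
        return x in self._items

    def is_empty(self):
        return not self._items


# Assumption: route numbers are represented by integers.
RouteNb = NewType("RouteNb", int)

# Instantiation with the actual parameter, then renaming of the result.
SetOfRouteNb = SetOf[RouteNb]
```

The binding of Item to RouteNb is left implicit here, much as the fitting morphism is left implicit in the PLUSS text.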

6. FROM ERAE TO PLUSS

The PLUSS specification has been written from the ERAE specification. We have used the informal

specification, and the remarks given in parts 1 and 3 on some incompleteness detected when writing the

ERAE specification. However, we have considered them as comments coming with the ERAE

specification.

We intended to be as systematic as possible in the derivation of a PLUSS specification from an ERAE

specification. We were rather successful, as can be seen below. Carrying out this derivation was

rather easy, and some general guidelines have been stated. As anticipated, the difficult point came from the

"time-out" requirement: "a message present for a time greater than T in the Transit Node becomes faulty".

ERAE is based on temporal logic; there is no explicit notion of time in PLUSS. It is not a surprise that the

two specifications are slightly different (but consistent from an observational point of view) on this point.

We come back later to this point.

The PLUSS specification modules are organized as follows:

[Figure: "use" relations between the PLUSS specification modules, including SET-OF-PORT, SET-OF-ROUTE and TRANSIT-NODE-KERNEL]

The various SET-OF... specifications are instantiations of the same generic specification SET-OF(ITEM).

The main module is TRANSIT-NODE-KERNEL. Its structure is very close to the ERAE specification. In

this module, eight of the nine generators of the Tnk sort (Tnk is for Transit node kernel) correspond exactly to the

seven event sorts of the ERAE specification, plus the initial state; the ninth, idle, is discussed in Section 7.3. Intuitively speaking, given an event sort

Es in ERAE, it will become in PLUSS a set of closed terms with the generator of name Es as leading

symbol. This set of terms may be restricted by some preconditions, which correspond to the preconditions

on events in ERAE.

In ERAE, the distinction between various kinds of input is sometimes implicit, or, more precisely,

described via distinct event sorts. It is the case here for the control messages. This distinction has been

introduced in PLUSS in the TRANSIT-NODE module, which uses TRANSIT-NODE-KERNEL, and

describes the distinctions between the various kinds of control messages and specifies their effect in terms of

the generators mentioned above. For the data messages the distinction is described in ERAE by means of

the arrival function. The same thing is done in TRANSIT-NODE. Thus the correspondence between ERAE

event sorts and PLUSS sort generators seems rather straightforward, at least in this case study.

Changes in state predicates/functions in the ERAE specification are specified in PLUSS as results of

observer operations on the results of the generators mentioned above. There is no need for an explicit

notion of state in PLUSS: it is easy to characterize a state in an observational way, via some observers

which formalize all the possible observations on the specified system.
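
The observational view of state can be made concrete with a small sketch. Below, a Tnk term is encoded as a nested Python tuple whose first component is the generator name, and two observers are defined by recursion on the leading generator, following the data-ports and routes axioms of Section 7.3. The tuple encoding and the function names are our own assumptions; preconditions are not checked.

```python
def data_ports(tnk):
    """Observer data-ports: the set of existing port numbers."""
    op = tnk[0]
    if op == "initial":
        return frozenset()
    if op == "add-port-msgs":
        _, prev, pn = tnk
        return data_ports(prev) | {pn}
    # Every other generator leaves data-ports unchanged.
    return data_ports(tnk[1])


def routes(tnk):
    """Observer routes: the set of existing route numbers."""
    op = tnk[0]
    if op == "initial":
        return frozenset()
    if op == "add-route-msgs":
        _, prev, rn, _sopn = tnk
        return routes(prev) | {rn}
    # Every other generator leaves routes unchanged.
    return routes(tnk[1])
```

A "state" is then nothing more than what these observers return for a given term, for example for ("idle", ("add-port-msgs", ("initial",), 1)).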

There are additional observers in the PLUSS specification: they describe the effects of the "output events"

of the ERAE specification.

Some of the sufficient conditions of the ERAE specification are useless because of the property of algebraic

specifications: an operation needs its arguments to be computed. Some of the sufficient conditions are

transformed into axioms. One of them is not completely expressible in an algebraic framework: it is the

fairness property. The transcription of this property in the PLUSS specification is related to three points:

- The choose operation in the SET-OF-MSG specification must be fair; this can be ensured by

representing a set by a queue.

- The choice of a "data-port-out" must also be fair; this could be specified as an explicit choice of the

least used port.
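
Both points can be sketched in a few lines of Python. This is only one possible reading, under our own assumptions: the class FairSet, its method names and the function least_used_port are hypothetical, usage counts are kept in a dictionary, and ties between equally used ports are broken in favour of the smallest port number.

```python
from collections import deque


class FairSet:
    """A set represented by a FIFO queue, so that choose is fair:
    the element that has waited longest is always chosen first."""

    def __init__(self):
        self._queue = deque()

    def plus(self, x):
        # Set semantics: an element already present is not added again.
        if x not in self._queue:
            self._queue.append(x)

    def choose(self):
        return self._queue[0]

    def remove_chosen(self):
        return self._queue.popleft()


def least_used_port(usage):
    """Pick the port with the lowest usage count (smallest number on ties)."""
    return min(sorted(usage), key=lambda pn: usage[pn])
```

With this representation no element can be overtaken indefinitely, which is the intuition behind fairness on finite terms.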

Moreover, fairness is a property related to infinite processes. In algebraic specification languages, such as

PLUSS, only finite terms are considered. It could be said, following [Dij 88] (see also [SL 88]) that

fairness is not a relevant problem in this framework: to any term, i.e. sequence of generators, one of the

operations send-fault-msgs, faulty-msg-sending or data-msg-sending can always be applied. We consider

that this claim is not completely "fair". Another possibility is to consider in the semantics of PLUSS

continuous algebras where infinite terms are introduced as limits of series of finite terms [TW 86]; fairness then

requires considering only those limits where send-fault-msgs, faulty-msg-sending and data-msg-sending

occur "sufficiently often". It is clear that this last point needs to be explored and discussed further.

7. THE PLUSS SPECIFICATION OF THE TRANSIT NODE

7.1. The overall organization of the specification

As shown on the previous figure, the Transit Node specification is structured into two main levels that are

detailed in the next sections:

- the specification of the TRANSIT-NODE,

- the specification of the TRANSIT-NODE-KERNEL.

The TRANSIT-NODE module specifies the input interface of the Transit Node, i.e. the acceptance of the

incoming correct or incorrect messages (at this level incorrect messages are control-messages that have a

name different from send-fault, add-route or add-port). To each message arrival corresponds a new

TRANSIT-NODE-KERNEL (denoted by a Tnk term), which is the result of a dm-arrival operation or of

the cm-arrival operation.

The TRANSIT-NODE-KERNEL module specifies the different states of the Transit Node, i.e. the

evolution of the Transit Node.

The ERAE specification does not give any details on basic objects such as messages, names, time. So, in

the PLUSS specification, we say nothing more and we assume that some basic specifications are

predefined. For these specifications (see e.g. TIME, MSG-NAME, DATA-MSG) only the sort names are

given.

The ERAE specification introduces constants (the bounded time T, the maximal number of ports N and the

maximal number of routes M), the values of which are not given. In the PLUSS specification, the constants

are introduced in the same way. We make use of the hierarchical constraints associated to the use construct

of PLUSS to write "incomplete" specifications.

Other basic specifications are: PORT, ROUTE, TIME, MSG, DATA-MSG, CONTROL-MSG, MSG-

NAME. The specification MSG uses DATA-MSG and CONTROL-MSG and has only coercions as

operations, meaning that a message is either a control message or a data message. Only some operations of

the CONTROL-MSG specification are given: especially the constants defining the correct control messages

(add-route, add-port and send-fault) and the observers giving the various components of a correct control

message (name-of, route-nb, port-nb and set-of-port-nb). This specification is used only in order to take

into account all the kinds of messages that can occur.

The rest of the specification modules are completely defined: the SET-OF... modules are instantiations of

the SET-OF(ITEM) module.

All these specifications are given in the Appendix.

[Data-flow diagram; legible labels: data-port-out, data-msg-sending, faulty-msg-sending]

Figure 2. Data Flow Presentation of the Transit Node

7.2. The TRANSIT-NODE module

The TRANSIT-NODE module specifies the treatment of all the incoming messages. To each message

arrival corresponds a new TRANSIT-NODE-KERNEL, given by the operator effect. The effect of the

arrival of a correct control message (send-fault, add-route or add-port) is the activation of the corresponding

generator (or event in the ERAE terminology) of the TRANSIT-NODE-KERNEL. The effect of the arrival

of an incorrect control message (its name is not one of the three above) is the activation of the wrong-msgs

generator. The effect of the arrival of a data message is the activation of the data-msgs generator.

spec TRANSIT-NODE

use TRANSIT-NODE-KERNEL

sort Tn

generators

init : -> Tn

cm-arrival : Tn x ControlMsg -> Tn

dm-arrival : Tn x PortNb x DataMsg -> Tn

operations

effect : Tn -> Tnk

preconditions

dm-arrival(tn, pn, dmsg) is defined when pn belongs to data-ports(effect(tn))

axioms

effect(init) = initial

"correct control messages"

name of cmsg = add-route => effect(cm-arrival(tn,cmsg)) =

add-route-msgs(effect(tn), route-nb(cmsg), set-of-port-nb(cmsg))

name of cmsg = send-fault => effect(cm-arrival(tn,cmsg)) = send-fault-msgs(effect(tn))

name of cmsg = add-port => effect(cm-arrival(tn,cmsg)) = add-port-msgs(effect(tn), port-nb(cmsg))

"incorrect control messages"

name of cmsg = add-route is false &

name of cmsg = send-fault is false &

name of cmsg = add-port is false => effect(cm-arrival(tn,cmsg)) = wrong-msgs(effect(tn), cmsg)

"data messages"

effect(dm-arrival(tn, pn, dmsg)) = data-msgs(effect(tn), pn, dmsg)

where

tn : Tn, pn : PortNb, cmsg : ControlMsg, dmsg : DataMsg

end TRANSIT-NODE

The port number appears explicitly as an operand of the dm-arrival operation. We will see in Section 8 that

this will be useful when the specification is extended to take into account the fact that the data ports work in

parallel.
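
The axioms of effect amount to a dispatch on the leading generator of a Tn term and, for control messages, on the message name. A minimal sketch in Python follows, under our own assumptions: terms are nested tuples, a control message is a dictionary with the keys "name", "route-nb", "port-nb" and "set-of-port-nb", and the precondition on dm-arrival is not checked.

```python
def effect(tn):
    """Map a Tn term to the corresponding Tnk term."""
    op = tn[0]
    if op == "init":
        return ("initial",)
    if op == "dm-arrival":
        _, prev, pn, dmsg = tn
        return ("data-msgs", effect(prev), pn, dmsg)
    # Remaining case: cm-arrival.
    _, prev, cmsg = tn
    name = cmsg["name"]
    if name == "add-route":
        return ("add-route-msgs", effect(prev),
                cmsg["route-nb"], cmsg["set-of-port-nb"])
    if name == "send-fault":
        return ("send-fault-msgs", effect(prev))
    if name == "add-port":
        return ("add-port-msgs", effect(prev), cmsg["port-nb"])
    # Any other name: incorrect control message.
    return ("wrong-msgs", effect(prev), cmsg)
```

For example, effect(("cm-arrival", ("init",), {"name": "add-port", "port-nb": 1})) yields the kernel term ("add-port-msgs", ("initial",), 1).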

7.3. The TRANSIT-NODE-KERNEL module

The TRANSIT-NODE-KERNEL module specifies the different states of the Transit Node. Most of the

generators correspond to an event; the exceptions are initial, which defines the initial state of the Transit

Node, and idle, which expresses that time is running and messages are becoming older even when nothing

happens. The defined operations describe the effect of the various events on the Transit Node, mainly on

the sets of existing ports and routes and on the sets of messages: faulty or waiting to be sent.

The data-messages are buffered in the message set attached to one of the data-port-out, associated with the

route they must follow, until the data-port-out can process them. The faulty messages are also buffered,

either in the faulty-message buffer (wrong messages and timed-out messages) or in the control-port-out

buffer (after the occurrence of the send-fault-msgs event), waiting to be sent out of the Transit Node.

Therefore the TRANSIT-NODE-KERNEL spec uses the specs SET-OF-PORT, SET-OF-ROUTE,

SET-OF-MSG, SET-OF-DATA-MSG (all are instantiations of the proc SET-OF(ITEM)) and the spec

BOUNDED-TIME.

We give below the commented signature of the TRANSIT-NODE-KERNEL spec followed by a

presentation of the axioms.

Commented signature of TRANSIT-NODE-KERNEL

spec TRANSIT-NODE-KERNEL

use SET-OF-PORT, SET-OF-ROUTE, SET-OF-MSG, SET-OF-DATA-MSG, BOUNDED-TIME

sort Tnk

generators

"initial defines the initial state of the Transit Node."

initial : -> Tnk

"add-route-msgs creates a route number and associates it with a set of data-port-out numbers."

add-route-msgs : Tnk x RouteNb x Set-of-PortNb -> Tnk

"send-fault-msgs adds faulty messages to the control-port-out message set."

send-fault-msgs : Tnk -> Tnk

"add-port-msgs creates a new port-in and the corresponding port-out."

add-port-msgs : Tnk x PortNb -> Tnk

"wrong-msgs corresponds to the reception of an incorrect control message on the control-port-in."

wrong-msgs : Tnk x ControlMsg -> Tnk

"data-msgs corresponds to the reception of a data-message on a data port-in; this message must follow the given

route (a part of the data-message), that is, it must be sent via one of the data-port-outs associated with this route."

data-msgs : Tnk x PortNb x DataMsg -> Tnk

"data-msg-sending sends out of the Transit Node one of the waiting messages if it is not timed out."

data-msg-sending : Tnk x PortNb -> Tnk

"faulty-msg-sending sends out of the Transit Node one of the faulty messages waiting in the control-port-out."

faulty-msg-sending : Tnk -> Tnk

"idle is a void action, however the time progresses."

idle : Tnk -> Tnk

operations

"data-ports gives the set of existing ports (in and out)."

data-ports : Tnk -> Set-of-PortNb

"routes gives the set of existing routes."

routes : Tnk -> Set-of-RouteNb

"faulty gives the set of faulty messages buffered in the Transit Node."

faulty : Tnk -> Set-of-Msg

"out-ports gives the set of port-out numbers associated with a route number."

out-ports : Tnk x RouteNb -> Set-of-PortNb

"control-port-out gives the set of faulty messages waiting to be sent out."

control-port-out : Tnk -> Set-of-Msg

"data-port-out gives the set of data messages waiting to be sent out."

data-port-out : Tnk x PortNb -> Set-of-DataMsg

"all-msgs-in-data-ports-out builds the set of all the messages waiting in the data-port-out sets."

all-msgs-in-data-ports-out : Tnk x Set-of-PortNb -> Set-of-DataMsg

"all-timed-out-msgs builds the set of all the timed-out-data-messages that are not yet in the faulty-set."

all-timed-out-msgs : Tnk x Set-of-PortNb -> Set-of-DataMsg

"timed-out-msgs builds the set of timed-out-messages for a given data-port-out."

timed-out-msgs : Tnk x PortNb -> Set-of-DataMsg

"timed-out-subset builds the set of timed-out-messages from a set of data-messages."

timed-out-subset : Tnk x Set-of-DataMsg -> Set-of-DataMsg

"age gives the age of a message."

age : Tnk x DataMsg -> Time

Axiomatic part of TRANSIT-NODE-KERNEL

This part presents first the preconditions restricting the definition domain of some generators or defined

operations (their legibility does not require any comment). Then the axioms are given. They are grouped

"by generators", in order to see the effect of each event on the Transit Node. A second version of this

specification, where the axioms are grouped "by defined operations", is given in the Appendix together

with the other specification modules.

Most of the generators describe a change in the system, that is reflected in the specification by the

modification of the result of one among the observers. The other observers are invariant. The axioms

describing an actual change of an observer are marked with an asterisk.

preconditions

add-route-msgs(tn,rn,sopn) is defined when rn ≤ M is true

& sopn is included in data-ports(tn) is true

add-port-msgs(tn,pn) is defined when pn belongs to data-ports(tn) is false

& pn ≤ N is true

data-msg-sending(tn,pn) is defined when pn belongs to data-ports(tn)

out-ports(tn,rn) is defined when rn belongs to routes(tn)

data-port-out(tn,pn) is defined when pn belongs to data-ports(tn)

age(tn,dmsg) is defined when dmsg belongs to all-msgs-in-data-ports-out(tn,data-ports(tn))

axioms

"initial initializes the permanent components of the Transit Node, which are sets."

* data-ports(initial) = the empty set

* routes(initial) = the empty set

* faulty(initial) = the empty set

* control-port-out(initial) = the empty set

"add-route-msgs creates a route, therefore it modifies the set of route numbers and defines the set of

out-port numbers associated with this route."

data-ports(add-route-msgs(tn,rn,sopn)) = data-ports(tn)

* routes(add-route-msgs(tn,rn,sopn)) = routes(tn) plus rn

faulty(add-route-msgs(tn,rn,sopn)) = faulty(tn)

* rn1 = rn2 => out-ports(add-route-msgs(tn,rn1,sopn),rn2) = sopn

rn1 ≠ rn2 => out-ports(add-route-msgs(tn,rn1,sopn),rn2) = out-ports(tn,rn2)

control-port-out(add-route-msgs(tn,rn,sopn)) = control-port-out(tn)

data-port-out(add-route-msgs(tn,rn,sopn),pn) = data-port-out(tn,pn)

* age(add-route-msgs(tn,rn,sopn),dmsg) > age(tn,dmsg)

"send-fault-msgs transfers all the faulty messages (those from the faulty buffer and the timed-out

data-messages) to the control-port-out buffer."

data-ports(send-fault-msgs(tn)) = data-ports(tn)

routes(send-fault-msgs(tn)) = routes(tn)

* faulty(send-fault-msgs(tn)) = the empty set

out-ports(send-fault-msgs(tn),rn) = out-ports(tn,rn)

* control-port-out(send-fault-msgs(tn)) = control-port-out(tn) union faulty(tn) union all-timed-out-msgs(tn,data-ports(tn))

* data-port-out(send-fault-msgs(tn),pn) = data-port-out(tn,pn) minus timed-out-msgs(tn,pn)

* age(send-fault-msgs(tn),dmsg) > age(tn,dmsg)

"add-port-msgs creates a new data-port in and out, thus it modifies the set of existing data-ports and it

initializes the set associated with this data-port-out."

* data-ports(add-port-msgs(tn,pn)) = data-ports(tn) plus pn

routes(add-port-msgs(tn,pn)) = routes(tn)

faulty(add-port-msgs(tn,pn)) = faulty(tn)

out-ports(add-port-msgs(tn,pn),rn) = out-ports(tn,rn)

control-port-out(add-port-msgs(tn,pn)) = control-port-out(tn)

* pn1 = pn2 => data-port-out(add-port-msgs(tn,pn1),pn2) = the empty set

pn1 ≠ pn2 => data-port-out(add-port-msgs(tn,pn1),pn2) = data-port-out(tn,pn2)

* age(add-port-msgs(tn,pn),dmsg) > age(tn,dmsg)

"wrong-msgs adds an incorrect control message to the buffer of faulty messages"

data-ports(wrong-msgs(tn,cmsg)) = data-ports(tn)

routes(wrong-msgs(tn,cmsg)) = routes(tn)

faulty(wrong-msgs(tn,cmsg)) = faulty(tn) plus cmsg

out-ports(wrong-msgs(tn,cmsg),rn) = out-ports(tn,rn)

control-port-out(wrong-msgs(tn,cmsg)) = control-port-out(tn)

data-port-out(wrong-msgs(tn,cmsg),pn) = data-port-out(tn,pn)

age(wrong-msgs(tn,cmsg),dmsg) > age(tn,dmsg)

"data-msgs routes a data-message to the buffer of one of the data-port-outs associated with the given

route, if this route exists, otherwise the message becomes faulty."

data-ports(data-msgs(tn,pn,dmsg)) = data-ports(tn)

routes(data-msgs(tn,pn,dmsg)) = routes(tn)

route-nb(dmsg) belongs to routes(tn) is true => faulty(data-msgs(tn,pn,dmsg)) = faulty(tn)

* route-nb(dmsg) belongs to routes(tn) is false => faulty(data-msgs(tn,pn,dmsg)) = faulty(tn) plus dmsg

out-ports(data-msgs(tn,pn,dmsg),rn) = out-ports(tn,rn)

control-port-out(data-msgs(tn,pn,dmsg)) = control-port-out(tn)

* choose(out-ports(tn,route-nb(dmsg))) = pn1 => data-port-out(data-msgs(tn,pn,dmsg),pn1) = data-port-out(tn,pn1) plus dmsg

choose(out-ports(tn,route-nb(dmsg))) ≠ pn1 => data-port-out(data-msgs(tn,pn,dmsg),pn1) = data-port-out(tn,pn1)

* dmsg ≠ dmsg1 => age(data-msgs(tn,pn,dmsg1),dmsg) > age(tn,dmsg)

* dmsg ≠ dmsg1 => age(data-msgs(tn,pn,dmsg1),dmsg) > age(data-msgs(tn,pn,dmsg1),dmsg1)

"data-msg-sending picks up one of the waiting (non-timed-out) messages from a given data-port-out

buffer, and sends the message out."

data-ports(data-msg-sending(tn,pn)) = data-ports(tn)

routes(data-msg-sending(tn,pn)) = routes(tn)

* faulty(data-msg-sending(tn,pn)) = faulty(tn) union timed-out-msgs(tn,pn)

out-ports(data-msg-sending(tn,pn),rn) = out-ports(tn,rn)

control-port-out(data-msg-sending(tn,pn)) = control-port-out(tn)

* pn1 = pn2 & data-port-out(tn,pn2) is-empty is false =>

data-port-out(data-msg-sending(tn,pn1),pn2)

= remove chosen(data-port-out(tn,pn2) minus timed-out-msgs(tn,pn2))

* pn1 = pn2 & data-port-out(tn,pn2) is-empty is true =>

data-port-out(data-msg-sending(tn,pn1),pn2) = data-port-out(tn,pn2)

pn1 ≠ pn2 => data-port-out(data-msg-sending(tn,pn1),pn2) = data-port-out(tn,pn2)

* age(data-msg-sending(tn,pn),dmsg) > age(tn,dmsg)

"faulty-msg-sending picks up one of the faulty messages from the control-port-out buffer, and sends it out of the transit-node."

data-ports(faulty-msg-sending(tn)) = data-ports(tn)

routes(faulty-msg-sending(tn)) = routes(tn)

faulty(faulty-msg-sending(tn)) = faulty(tn)

out-ports(faulty-msg-sending(tn),rn) = out-ports(tn,rn)

control-port-out(tn) is-empty is false =>

control-port-out(faulty-msg-sending(tn)) = remove chosen(control-port-out(tn))

control-port-out(tn) is-empty is true =>

control-port-out(faulty-msg-sending(tn)) = control-port-out(tn)

data-port-out(faulty-msg-sending(tn),pn) = data-port-out(tn,pn)

age(faulty-msg-sending(tn),dmsg) > age(tn,dmsg)

"idle has no effect on the composition of the various sets, it only affects the age of each message; without

such an operation in the specification, it is impossible to reflect the fact that messages are becoming older

even when no external events occur."

data-ports(idle(tn)) = data-ports(tn)

routes(idle(tn)) = routes(tn)

faulty(idle(tn)) = faulty(tn)

out-ports(idle(tn),rn) = out-ports(tn,rn)

control-port-out(idle(tn)) = control-port-out(tn)

data-port-out(idle(tn),pn) = data-port-out(tn,pn)

* age(idle(tn),dmsg) > age(tn,dmsg)

"Other defined operations"

"all-msgs-in-data-ports-out"

* all-msgs-in-data-ports-out(tn,the empty set) = the empty set

* all-msgs-in-data-ports-out(tn,sopn plus pn) = all-msgs-in-data-ports-out(tn,sopn) union data-port-out(tn,pn)

"all-timed-out-msgs"

* all-timed-out-msgs(tn,the empty set) = the empty set

* all-timed-out-msgs(tn,sopn plus pn) =

all-timed-out-msgs(tn,sopn) union timed-out-msgs(tn,pn)

"timed-out-msgs"

* timed-out-msgs(tn,pn) = timed-out-subset(tn,data-port-out(tn,pn))

"timed-out-subset"

* timed-out-subset(tn,the empty set) = the empty set

* age(tn,dmsg) > T => timed-out-subset(tn,sodm plus dmsg) =

timed-out-subset(tn,sodm) plus dmsg

age(tn,dmsg) ≤ T => timed-out-subset(tn,sodm plus dmsg) = timed-out-subset(tn,sodm)

where

tn : Tnk, pn, pn1, pn2, N : PortNb, rn, rn1, rn2 : RouteNb,

cmsg : ControlMsg, dmsg : DataMsg, sopn : Set-of-PortNb, sodm : Set-of-DataMsg

end TRANSIT-NODE-KERNEL
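
As an executable reading of the time-out axioms, the sketch below evaluates timed-out-subset, all-timed-out-msgs and the three starred send-fault-msgs axioms on an explicit state. It is an illustration under our own assumptions, not the PLUSS semantics: the state is a Python dictionary holding the values of the observers, ages are given by a separate dictionary, and all names are hypothetical.

```python
def timed_out_subset(age, msgs, T):
    """Messages of msgs whose age is strictly greater than T."""
    return frozenset(m for m in msgs if age[m] > T)


def all_timed_out_msgs(age, data_port_out, T):
    """Union of the timed-out messages over every data-port-out buffer."""
    result = frozenset()
    for msgs in data_port_out.values():
        result |= timed_out_subset(age, msgs, T)
    return result


def send_fault_msgs(state, age, T):
    """Observer values after send-fault-msgs, following the starred axioms."""
    timed_out = all_timed_out_msgs(age, state["data-port-out"], T)
    return {
        # The faulty buffer is emptied.
        "faulty": frozenset(),
        # Faulty and timed-out messages move to the control-port-out buffer.
        "control-port-out": state["control-port-out"] | state["faulty"] | timed_out,
        # Timed-out messages leave their data-port-out buffer.
        "data-port-out": {pn: msgs - timed_out_subset(age, msgs, T)
                          for pn, msgs in state["data-port-out"].items()},
    }
```

With T = 3, a buffer holding a message of age 5 and one of age 1 keeps only the latter, and the former joins the control-port-out buffer together with the contents of the faulty buffer.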


8. CONCLUSION: Comparisons between the two specifications

It is interesting to point out the differences between the PLUSS and ERAE specifications, especially those

induced by the different formalisms. As said previously, these differences concern, more or less directly,

the way the time is taken into account.

A first difference is the way the messages are considered. In the ERAE specification, there is no explicit

notion of a set of messages in the Transit Node: some predicates on the occurrences of events make it

possible to determine when a message has not yet arrived in the Transit Node, or has already been sent out. In the

PLUSS specification, there are several sets of messages that contain only messages present in the Transit

Node. This difference is not surprising: ERAE is based on events description and temporal logic; PLUSS is

based on data types definition.

In the ERAE specification, the use of temporal logic makes it possible to specify ordering constraints on

events (see for instance the way the serialization of each port is described). These constraints do not imply

(generally) a total order: in the case study the possibility that some events occur in parallel is open (it is the

case, for instance, for events concerning different data ports). Roughly speaking, one could say that, in this

framework, serialisation must be explicit and parallelism is implicit.

In the PLUSS specification, such constraints are not systematically expressed (they are expressed in some

case via preconditions). The specification as it is presented here describes the properties of each operation

(i.e. elementary action in the terminology of [KP 88]) of the Transit Node. The only ordering constraints

are that the operands of an operation must be computed before the operation is performed. In some way, a

partial serialization is implicit in this framework.

Parallelism should be explicitly expressed. A first attempt to extend the PLUSS specification, using

Process Specifications à la Kaplan-Pnueli, is given in [Kap 89]. The main change consists in modifications

of the TRANSIT-NODE module in order to associate a process to each data port. However, the resulting

specification is more precise than the ERAE specification, which only states that events on different data

ports can be simultaneous.

REFERENCES

[Bid 89] Bidoit M., "PLUSS a Language for the Development of Modular Algebraic Specifications", Thèse d'État, LRI, May 1989.

[BGM 87] Bidoit M., Gaudel M.-C. and Mauboussin A., "How to Make Algebraic Specifications more Understandable? An Experiment with the PLUSS Specification Language", Proceedings of the METEOR Workshop on Algebraic Specifications, Passau, June 1987.

[Dij 88] Dijkstra E.W., "Position Paper on "Fairness"", Software Engineering Notes, Vol. 13, n°2, April 1988, pp. 18-20.

[DHR 88] Dubois E., Hagelstein J. and Rifaut A., "Formal Requirements Engineering with ERAE", Philips Journal of Research, Vol. 43, 3/4, 1988, pp. 393-414. (A revised version is available from the authors.)

[Gau 85] Gaudel M.-C., "Towards Structured Algebraic Specifications", Esprit Technical Week, Bruxelles, September 1985, Proceedings of Esprit'85 Status Report, North-Holland, pp. 493-510.

[Hag 89] Hagelstein J., "The ERAE Language Definition", Philips Research Laboratory Brussels, June 1989.

[KP 88] Kaplan S. and Pnueli A., "Specification and Implementation of Concurrently Accessed Data Structures: An Abstract Data Type Approach", Proceedings of the STACS 87 Conference, LNCS 247, Springer Verlag.

[Kap 89] Kaplan S., "The Transit Node via Process Specifications", Draft, July 1989.

[RU 71] Rescher N. and Urquhart A., "Temporal logic", Springer Verlag, 1971.

[SL 88] Schneider F.B. and Lamport L., "Another Position Paper on "Fairness"", Software Engineering Notes, Vol. 13, n°3, July 1988, pp. 18-19.

[TW 86] Tarlecki A. and Wirsing M., "Continuous Abstract Data Types", Fundamenta Informaticae 9, 1986, pp. 95-125.

APPENDIX

spec TRANSIT-NODE

sort Tn

generators
  init : -> Tn
  cm-arrival : Tn x ControlMsg -> Tn
  dm-arrival : Tn x PortNb x DataMsg -> Tn

operations
  effect : Tn -> Tnk

preconditions
  dm-arrival(tn, pn, dmsg) is defined when pn belongs to data-ports(effect(tn))

axioms
  effect(init) = initial
  name of cmsg = add-route =>
    effect(cm-arrival(tn,cmsg)) = add-route-msgs(effect(tn), route-nb(cmsg), set-of-port-nb(cmsg))
  name of cmsg = send-fault =>
    effect(cm-arrival(tn,cmsg)) = send-fault-msgs(effect(tn))
  name of cmsg = add-port =>
    effect(cm-arrival(tn,cmsg)) = add-port-msgs(effect(tn), port-nb(cmsg))
  name of cmsg = add-route is false &
  name of cmsg = send-fault is false &
  name of cmsg = add-port is false =>
    effect(cm-arrival(tn,cmsg)) = wrong-msgs(effect(tn), cmsg)
  effect(dm-arrival(tn, pn, dmsg)) = data-msgs(effect(tn), pn, dmsg)

where
  tn : Tn, pn : PortNb, cmsg : ControlMsg, dmsg : DataMsg

end TRANSIT-NODE
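The axioms of TRANSIT-NODE define effect by structural recursion on the history of arrivals: each cm-arrival is dispatched on the name of its control message to the corresponding kernel generator. The following Python sketch mirrors that dispatch under stated assumptions. The tuple encoding of terms and the dictionary encoding of control messages are illustrative choices that do not appear in the PLUSS text, and the precondition on dm-arrival is not checked.

```python
# Assumed encoding (not part of the specification):
#   terms of sort Tn are nested tuples
#     ("init",), ("cm-arrival", tn, cmsg), ("dm-arrival", tn, pn, dmsg)
#   control messages are dicts with a "name" key and, depending on the name,
#   "route-nb", "port-nb" and "set-of-port-nb" keys.

def effect(tn):
    """Translate a Tn history term into a Tnk term, one axiom per branch."""
    tag = tn[0]
    if tag == "init":
        return ("initial",)
    if tag == "cm-arrival":
        _, prev, cmsg = tn
        kernel = effect(prev)
        name = cmsg["name"]
        if name == "add-route":
            return ("add-route-msgs", kernel, cmsg["route-nb"],
                    frozenset(cmsg["set-of-port-nb"]))
        if name == "send-fault":
            return ("send-fault-msgs", kernel)
        if name == "add-port":
            return ("add-port-msgs", kernel, cmsg["port-nb"])
        # none of the three known names: the message is recorded as wrong
        return ("wrong-msgs", kernel, cmsg)
    if tag == "dm-arrival":
        _, prev, pn, dmsg = tn
        return ("data-msgs", effect(prev), pn, dmsg)
    raise ValueError(f"not a term of sort Tn: {tag!r}")


# Usage: one add-port control message arriving at the initial node.
history = ("cm-arrival", ("init",), {"name": "add-port", "port-nb": 1})
print(effect(history))  # ('add-port-msgs', ('initial',), 1)
```

The last branch of the cm-arrival case corresponds to the axiom whose premise negates all three names, so an explicit test of the negations is unnecessary in the sketch.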


spec TRANSIT-NODE-KERNEL

use SET-OF-PORT, SET-OF-ROUTE, SET-OF-MSG, SET-OF-DATA-MSG, BOUNDED-TIME

sort Tnk

generators
  initial : -> Tnk
  add-route-msgs : Tnk x RouteNb x Set-of-PortNb -> Tnk
  send-fault-msgs : Tnk -> Tnk
  add-port-msgs : Tnk x PortNb -> Tnk
  wrong-msgs : Tnk x ControlMsg -> Tnk
  data-msgs : Tnk x PortNb x DataMsg -> Tnk
  data-msg-sending : Tnk x PortNb -> Tnk
  faulty-msg-sending : Tnk -> Tnk
  idle : Tnk -> Tnk

operations
  data-ports : Tnk -> Set-of-PortNb
  routes : Tnk -> Set-of-RouteNb
  faulty : Tnk -> Set-of-Msg
  out-ports : Tnk x RouteNb -> Set-of-PortNb
  control-port-out : Tnk -> Set-of-Msg
  data-port-out : Tnk x PortNb -> Set-of-DataMsg
  all-msgs-in-data-ports-out : Tnk x Set-of-PortNb -> Set-of-DataMsg
  all-timed-out-msgs : Tnk x Set-of-PortNb -> Set-of-DataMsg
  timed-out-msgs : Tnk x PortNb -> Set-of-DataMsg
  timed-out-subset : Tnk x Set-of-DataMsg -> Set-of-DataMsg
  age : Tnk x DataMsg -> Time

preconditions
  add-route-msgs(tn,rn,sopn) is defined when rn ≤ M is true
    & sopn is included in data-ports(tn) is true
  add-port-msgs(tn,pn) is defined when pn belongs to data-ports(tn) is false
    & pn ≤ N is true
  data-msg-sending(tn,pn) is defined when pn belongs to data-ports(tn)
  out-ports(tn,rn) is defined when rn belongs to routes(tn)
  data-port-out(tn,pn) is defined when pn belongs to data-ports(tn)
  age(tn,dmsg) is defined when dmsg belongs to all-msgs-in-data-ports-out(tn,data-ports(tn))


axioms

"data-ports"
* data-ports(initial) = the empty set
  data-ports(add-route-msgs(tn,rn,sopn)) = data-ports(tn)
  data-ports(send-fault-msgs(tn)) = data-ports(tn)
* data-ports(add-port-msgs(tn,pn)) = data-ports(tn) plus pn
  data-ports(wrong-msgs(tn,cmsg)) = data-ports(tn)
  data-ports(data-msgs(tn,pn,dmsg)) = data-ports(tn)
  data-ports(data-msg-sending(tn,pn)) = data-ports(tn)
  data-ports(faulty-msg-sending(tn)) = data-ports(tn)
  data-ports(idle(tn)) = data-ports(tn)

"routes"
  routes(initial) = the empty set
  routes(add-route-msgs(tn,rn,sopn)) = routes(tn) plus rn
  routes(send-fault-msgs(tn)) = routes(tn)
  routes(add-port-msgs(tn,pn)) = routes(tn)
  routes(wrong-msgs(tn,cmsg)) = routes(tn)
  routes(data-msgs(tn,pn,dmsg)) = routes(tn)
  routes(data-msg-sending(tn,pn)) = routes(tn)
  routes(faulty-msg-sending(tn)) = routes(tn)
  routes(idle(tn)) = routes(tn)

"faulty"
  faulty(initial) = the empty set
  faulty(add-route-msgs(tn,rn,sopn)) = faulty(tn)
  faulty(send-fault-msgs(tn)) = the empty set
  faulty(add-port-msgs(tn,pn)) = faulty(tn)
  faulty(wrong-msgs(tn,cmsg)) = faulty(tn) plus cmsg
  route-nb(dmsg) belongs to routes(tn) is true =>
    faulty(data-msgs(tn,pn,dmsg)) = faulty(tn)
  route-nb(dmsg) belongs to routes(tn) is false =>
    faulty(data-msgs(tn,pn,dmsg)) = faulty(tn) plus dmsg
  faulty(data-msg-sending(tn,pn)) = faulty(tn) union timed-out-msgs(tn,pn)
  faulty(faulty-msg-sending(tn)) = faulty(tn)
  faulty(idle(tn)) = faulty(tn)

"out-ports"
  rn1 = rn2 => out-ports(add-route-msgs(tn,rn1,sopn),rn2) = sopn
  rn1 ≠ rn2 => out-ports(add-route-msgs(tn,rn1,sopn),rn2) = out-ports(tn,rn2)
  out-ports(send-fault-msgs(tn),rn) = out-ports(tn,rn)
  out-ports(add-port-msgs(tn,pn),rn) = out-ports(tn,rn)
  out-ports(wrong-msgs(tn,cmsg),rn) = out-ports(tn,rn)
  out-ports(data-msgs(tn,pn,dmsg),rn) = out-ports(tn,rn)
  out-ports(data-msg-sending(tn,pn),rn) = out-ports(tn,rn)
  out-ports(faulty-msg-sending(tn),rn) = out-ports(tn,rn)

  out-ports(idle(tn),rn) = out-ports(tn,rn)

"control-port-out"
* control-port-out(initial) = the empty set
  control-port-out(add-route-msgs(tn,rn,sopn)) = control-port-out(tn)
* control-port-out(send-fault-msgs(tn)) = control-port-out(tn) union faulty(tn)
    union all-timed-out-msgs(tn,data-ports(tn))
  control-port-out(add-port-msgs(tn,pn)) = control-port-out(tn)
  control-port-out(wrong-msgs(tn,cmsg)) = control-port-out(tn)
  control-port-out(data-msgs(tn,pn,dmsg)) = control-port-out(tn)
  control-port-out(data-msg-sending(tn,pn)) = control-port-out(tn)
* control-port-out(tn) is empty is false =>
    control-port-out(faulty-msg-sending(tn)) = remove chosen(control-port-out(tn))
* control-port-out(tn) is empty is true =>
    control-port-out(faulty-msg-sending(tn)) = control-port-out(tn)

  control-port-out(idle(tn)) = control-port-out(tn)

"data-port-out"
  data-port-out(add-route-msgs(tn,rn,sopn),pn) = data-port-out(tn,pn)
  data-port-out(send-fault-msgs(tn),pn) = data-port-out(tn,pn) minus timed-out-msgs(tn,pn)
* pn1 = pn2 => data-port-out(add-port-msgs(tn,pn1),pn2) = the empty set
  pn1 ≠ pn2 => data-port-out(add-port-msgs(tn,pn1),pn2) = data-port-out(tn,pn2)
  data-port-out(wrong-msgs(tn,cmsg),pn) = data-port-out(tn,pn)
* choose(out-ports(tn,route-nb(dmsg))) = pn1 =>
    data-port-out(data-msgs(tn,pn,dmsg),pn1) = data-port-out(tn,pn1) plus dmsg
  choose(out-ports(tn,route-nb(dmsg))) ≠ pn1 =>
    data-port-out(data-msgs(tn,pn,dmsg),pn1) = data-port-out(tn,pn1)
* pn1 = pn2 & data-port-out(tn,pn2) is empty is false =>
    data-port-out(data-msg-sending(tn,pn1),pn2) = remove chosen(data-port-out(tn,pn2) minus timed-out-msgs(tn,pn2))
* pn1 = pn2 & data-port-out(tn,pn2) is empty is true =>
    data-port-out(data-msg-sending(tn,pn1),pn2) = data-port-out(tn,pn2)
  pn1 ≠ pn2 => data-port-out(data-msg-sending(tn,pn1),pn2) = data-port-out(tn,pn2)
  data-port-out(faulty-msg-sending(tn),pn) = data-port-out(tn,pn)

  data-port-out(idle(tn),pn) = data-port-out(tn,pn)

"age"
  age(add-route-msgs(tn,rn,sopn),dmsg) > age(tn,dmsg)
  age(send-fault-msgs(tn),dmsg) > age(tn,dmsg)
  age(add-port-msgs(tn,pn),dmsg) > age(tn,dmsg)
  age(wrong-msgs(tn,cmsg),dmsg) > age(tn,dmsg)
  dmsg ≠ dmsg1 => age(data-msgs(tn,pn,dmsg1),dmsg) > age(tn,dmsg)
  dmsg ≠ dmsg1 => age(data-msgs(tn,pn,dmsg1),dmsg) > age(data-msgs(tn,pn,dmsg1),dmsg1)
  age(data-msg-sending(tn,pn),dmsg) > age(tn,dmsg)
  age(faulty-msg-sending(tn),dmsg) > age(tn,dmsg)
  age(idle(tn),dmsg) > age(tn,dmsg)

"Other defined operations"

"all-msgs-in-data-ports-out"
* all-msgs-in-data-ports-out(tn,the empty set) = the empty set
* all-msgs-in-data-ports-out(tn,sopn plus pn) = all-msgs-in-data-ports-out(tn,sopn)
    union data-port-out(tn,pn)

"all-timed-out-msgs"
* all-timed-out-msgs(tn,the empty set) = the empty set
* all-timed-out-msgs(tn,sopn plus pn) = all-timed-out-msgs(tn,sopn)
    union timed-out-msgs(tn,pn)

"timed-out-msgs"
* timed-out-msgs(tn,pn) = timed-out-subset(tn,data-port-out(tn,pn))

"timed-out-subset"
* timed-out-subset(tn,the empty set) = the empty set
* age(tn,dmsg) > T => timed-out-subset(tn,sodm plus dmsg)
    = timed-out-subset(tn,sodm) plus dmsg
  age(tn,dmsg) ≤ T => timed-out-subset(tn,sodm plus dmsg) = timed-out-subset(tn,sodm)

where
  tn : Tnk, pn, pn1, pn2, N : PortNb, rn, rn1, rn2 : RouteNb,
  cmsg : ControlMsg, dmsg : DataMsg, sopn : Set-of-PortNb, sodm : Set-of-DataMsg

end TRANSIT-NODE-KERNEL
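Most kernel observers are defined by recursion on the generators of Tnk: one or two generators change the observed value and every other generator passes it through unchanged. The Python sketch below illustrates this pattern for data-ports, routes, out-ports and timed-out-subset. It rests on assumptions that are not in the specification: Tnk terms are encoded as nested tuples whose second component is the previous state, and age is supplied by the caller as a function, since the axioms only constrain it by inequalities.

```python
# Assumed encoding: ("initial",), ("add-port-msgs", tn, pn),
# ("add-route-msgs", tn, rn, sopn), ("idle", tn), ... with tn at index 1.

def data_ports(tn):
    """Only add-port-msgs extends the set of data ports."""
    if tn[0] == "initial":
        return frozenset()
    if tn[0] == "add-port-msgs":
        _, prev, pn = tn
        return data_ports(prev) | {pn}
    return data_ports(tn[1])        # all other generators: unchanged


def routes(tn):
    """Only add-route-msgs extends the set of routes."""
    if tn[0] == "initial":
        return frozenset()
    if tn[0] == "add-route-msgs":
        _, prev, rn, _sopn = tn
        return routes(prev) | {rn}
    return routes(tn[1])


def out_ports(tn, rn):
    """Ports attached to route rn; the most recent add-route for rn wins."""
    if tn[0] == "initial":
        # precondition: rn belongs to routes(tn)
        raise ValueError(f"route {rn!r} is not defined")
    if tn[0] == "add-route-msgs":
        _, prev, rn1, sopn = tn
        return frozenset(sopn) if rn1 == rn else out_ports(prev, rn)
    return out_ports(tn[1], rn)


def timed_out_subset(age, limit, sodm):
    """Messages of sodm whose age exceeds the bound T (here `limit`).

    `age` is a caller-supplied function DataMsg -> Time (an assumption).
    """
    return frozenset(d for d in sodm if age(d) > limit)


# Usage: two ports, then a route 7 over both, then an idle step.
state = ("initial",)
state = ("add-port-msgs", state, 1)
state = ("add-port-msgs", state, 2)
state = ("add-route-msgs", state, 7, frozenset({1, 2}))
state = ("idle", state)
print(sorted(data_ports(state)), sorted(out_ports(state, 7)))
```

The sketch does not check the preconditions of add-route-msgs and add-port-msgs; a fuller version would validate them when a term is built.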

spec SET-OF-PORT as SET-OF (PORT)
  renaming Set into Set-of-PortNb
end SET-OF-PORT

spec SET-OF-ROUTE as SET-OF (ROUTE)
  renaming Set into Set-of-RouteNb
end SET-OF-ROUTE

spec SET-OF-MSG as SET-OF (MSG)
  renaming Set into Set-of-Msg
end SET-OF-MSG

spec SET-OF-DATA-MSG as SET-OF (DATA-MSG)
  renaming Set into Set-of-DataMsg
end SET-OF-DATA-MSG

spec PORT as INTERVAL (N)
  renaming Inter into PortNb
end PORT

spec ROUTE as INTERVAL (M)
  renaming Inter into RouteNb
end ROUTE

spec BOUNDED-TIME

use TIME

operation
  T : -> Time

end BOUNDED-TIME

spec TIME

sort Time

predicates
  _ ≤ _ : Time x Time
  _ > _ : Time x Time

axioms
  t ≤ t' iff not t > t'

end TIME

spec MSG

use DATA-MSG, CONTROL-MSG

sort Msg

operations
  _ : DataMsg -> Msg
  _ : ControlMsg -> Msg

end MSG


spec DATA-MSG

use ROUTE

sort DataMsg

operations
  route-nb : DataMsg -> RouteNb
  ...

end DATA-MSG

spec CONTROL-MSG

use MSG-NAME, ROUTE, SET-OF-PORT

sort ControlMsg

operations
  add-route : -> MsgN
  send-fault : -> MsgN
  add-port : -> MsgN
  name of : ControlMsg -> MsgN
  route-nb : ControlMsg -> RouteNb
  port-nb : ControlMsg -> PortNb
  set-of-port-nb : ControlMsg -> Set-of-PortNb
  ...

axioms

end CONTROL-MSG

spec MSG-NAME

sort MsgN

end MSG-NAME


proc SET-OF (ITEM)

sort Set

generators
  the empty set : -> Set
  _ plus _ : Set x Item -> Set

operations
  _ union _ : Set x Set -> Set
  _ minus _ : Set x Set -> Set
  _ less _ : Set x Item -> Set
  choose : Set -> Item
  remove chosen : Set -> Set

predicates
  _ is empty : Set
  _ belongs to _ : Item x Set
  _ is included in _ : Set x Set

preconditions
  choose(S) is defined when S is empty is false
  remove chosen(S) is defined when S is empty is false

axioms

"union"
  S union the empty set = S
  S union (S' plus i) = (S plus i) union S'

"minus"
  S minus the empty set = S
  S minus (S' plus i) = (S less i) minus S'

"less"
  S is empty => S less i = S
  i1 = i2 => (S plus i1) less i2 = S less i2
  i1 ≠ i2 => (S plus i1) less i2 = (S less i2) plus i1

"choose"
  choose(S) belongs to S is true

"remove chosen"
  remove chosen(S) = S less choose(S)

"is empty"
  the empty set is empty is true
  (S plus i) is empty is false

"belongs to"
  i belongs to the empty set is false
  i1 = i2 => i1 belongs to (S plus i2) is true
  i1 ≠ i2 => i1 belongs to (S plus i2) = i1 belongs to S

"is included in"
  the empty set is included in S is true
  i belongs to S1 is true => (S2 plus i) is included in S1 = S2 is included in S1
  i belongs to S1 is false => (S2 plus i) is included in S1 is false

where
  S, S1, S2 : Set, i, i1, i2 : Item

end SET-OF
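The SET-OF axioms can be read as recursive definitions over the two generators. The Python sketch below follows them clause by clause, under the assumption (made only for illustration) that a Set term is a tuple listing its items in insertion order, possibly with repetitions. The specification only requires that choose(S) belongs to S; picking the most recently added item is one admissible choice among many.

```python
EMPTY = ()                               # the empty set


def plus(s, i):                          # S plus i
    return s + (i,)


def is_empty(s):
    return len(s) == 0


def belongs_to(i, s):
    """i belongs to S, by recursion on the last 'plus'."""
    if is_empty(s):
        return False
    return s[-1] == i or belongs_to(i, s[:-1])


def less(s, i):
    """S less i: drop every occurrence of i."""
    if is_empty(s):
        return s
    rest, last = s[:-1], s[-1]
    if last == i:
        return less(rest, i)
    return plus(less(rest, i), last)


def union(s, t):
    """S union (S' plus i) = (S plus i) union S'."""
    while not is_empty(t):
        s, t = plus(s, t[-1]), t[:-1]
    return s


def minus(s, t):
    """S minus (S' plus i) = (S less i) minus S'."""
    while not is_empty(t):
        s, t = less(s, t[-1]), t[:-1]
    return s


def is_included_in(s2, s1):
    if is_empty(s2):
        return True
    return belongs_to(s2[-1], s1) and is_included_in(s2[:-1], s1)


def choose(s):
    if is_empty(s):
        raise ValueError("choose(S) is defined only when S is empty is false")
    return s[-1]                         # an arbitrary but admissible choice


def remove_chosen(s):
    return less(s, choose(s))            # remove chosen(S) = S less choose(S)


# Usage
s = plus(plus(plus(EMPTY, 1), 2), 1)
print(less(s, 1), minus((1, 2, 3), (2,)))  # (2,) (1, 3)
```

Two tuples may denote the same set while differing as terms, so equality of sets should be tested through belongs_to or is_included_in rather than by comparing tuples directly.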

Subject Index

abstract syntax 105, 241, 365

ACP 203, 303, 314, 333, 341

algebraic specification 147, 300, 339, 341, 363, 395

algebraic specification language 303, 339

ALGRES 339

ASF 105, 107, 203, 303, 363

AUTOMATH 170

bisimulation semantics 341

black-box correctness 171

BMASF 363

case studies and examples
  alternating bit-protocol 324
  BLISS 133
  chess tournaments 8
  gas station 143
  landing control system 321
  library desk 49
  Norman's data base 205
  naturals and booleans 107, 345, 373
  parallel zero search 62
  PCTE 99
  POLAR 275
  satellite tuner 132
  simple language 108
  SPESI 133
  studio booking system 132
  Swiss system 8
  television 134
  Timbuktu airport 322
  TIPTOP 132
  transit node 342, 397
  typechecker 108
  vending machine 320
  74283 four bit adder 173

CCS 334

Church-Rosser 17

class description 210, 278

class 278

COLD 99, 167, 277, 303, 304

COLD-K 205, 236, 304

COLD-Static 303

communication 165, 314, 352

communication protocol 324

concrete syntax 105, 241

data base 205

design 167, 156, 343

design process 55, 143, 172

distributed systems 153

environment generation 105

ERAE 15, 127, 129, 165, 341, 395

examples: see case studies

fairness 55

FOREST 30

formalization process 16, 129, 143

glass-box correctness 171

hacking 30

implementation freedom 179

import relation 233, 359

inheritance 277

initial semantics 333

knowledge transfer 24

lambda calculus 168, 237

language definition formalism 105

LOTOS 334

MAL 5, 30

Meta IV 333

meta-environment 105

methodology of language design 83, 105, 233, 303, 366

MPLω 92

modular implementation techniques 105

modularisation 218, 161, 168, 233

module algebra 233, 374

multiple inheritance 289

NYCE 46

parallel programs 55

parameterization 237

PLUSS 15, 395, 409

POLAR 233

process algebra 203, 303, 314, 333, 341

programming language semantics 105

PSF/C 303

PSFd 341

RAP 15, 339

requirements engineering 7, 129

SDF 105, 310

SMoLCS 334

state-based specifications 44, 57, 86, 207, 278

streams of actions 143


Structured Common Sense 31

temporal logic 15, 89, 360, 399

tool support 24, 36, 105, 140, 198, 246, 300, 303

traditional object-oriented languages 279

transformations 57, 90, 167, 411

user-definable syntax 105

VDM 15, 83, 99, 167, 205

VVSL 83

Z 85, 167
