
KATHOLIEKE UNIVERSITEIT LEUVEN

FACULTEIT INGENIEURSWETENSCHAPPEN
DEPARTEMENT COMPUTERWETENSCHAPPEN
AFDELING INFORMATICA
Celestijnenlaan 200A, 3001 Leuven

Abstractions for Improving, Creating, and Reusing Object-Oriented Programming Languages

Jury:
Prof. Dr. ir. G. Van der Perre, chair
Prof. Dr. ir. E. Steegmans, supervisor
Prof. Dr. T. Holvoet
Prof. Dr. ir. W. Joosen
Prof. Dr. ir. F. Piessens
Prof. Dr. T. D’Hondt (Vrije Universiteit Brussel, Belgium)
Prof. Dr. E. Ernst (University of Aarhus, Denmark)

Dissertation presented to obtain the degree of Doctor of Engineering Science

by

Marko VAN DOOREN

U.D.C. 681.3∗D15, 681.3∗D2, 681.3∗D33

June 2007

© Katholieke Universiteit Leuven – Faculteit Ingenieurswetenschappen, Arenbergkasteel, B-3001 Heverlee (Belgium)

All rights reserved. No part of this publication may be reproduced or made public in any form by print, photocopy, microfilm, electronic or any other means without prior written permission from the publisher.

Abstract

As computer programs become more and more complex, there is an increasing need for technology that increases the reusability of software. Reusing software has a number of beneficial effects on the development process. First, development will speed up and costs will go down because the code need not be implemented again. Second, the resulting software will generally be more stable because no bugs can be introduced in the functionality that is being reused.

In the first part of this thesis, we introduce two new language constructs to increase the reusability of software. The constructs address problems with exception handling and composition of software.

The first language construct increases the flexibility of the exception handling facilities in a programming language. Ever since their invention 30 years ago, checked exceptions have been a point of much discussion. On the one hand, they increase the robustness of software by preventing the manifestation of unanticipated checked exceptions at run-time. On the other hand, they decrease the adaptability of software because they must be propagated explicitly, and must often be handled even if they cannot be signalled. We show that the problems with checked exceptions are caused by a lack of expressiveness of the exceptional return type of a method, which currently dictates a copy & paste style. We add the required expressiveness by introducing anchored exception declarations, which allow the exceptional behavior of a method to be declared relative to that of others. We present the formal semantics of anchored exception declarations, along with the necessary rules for ensuring compile-time safety, and give a proof of soundness. We show that anchored exception declarations do not violate the principle of information hiding when used properly, and provide a guideline for when to use them.

The second language construct enables classes to be composed by using classes as building blocks. Although classes are a fundamental concept in object-oriented programming, a class itself cannot be built using general purpose classes as building blocks in a practical manner. High-level concepts, ranging from simple ones like associations, bounded values, and infrastructure for event mechanisms to complex ones like graph structures and arbitrary collaborations, cannot be reused conveniently as components for classes. As a result, they are implemented over and over again. We raise the abstraction level of the language with a code inheritance relation for reusing general purpose classes as components for other classes. Features like mass renaming, first-class relations, high-level dependencies, component parameters, and indirect inheritance ensure that maximal reuse can be achieved with minimal effort. A case study shows a reduction of the code size between 21% and 36%, while the closest competitor only reduces the size between 3% and 12%.

The development of these language constructs, however, revealed problems with the process of creating programming languages. A first problem is that to successfully create a new programming language, it no longer suffices to create only a compiler. Because programmers are now used to working with powerful programming environments, the new language will rarely be used if it does not provide similar programming tools. With current technology, these tools must be implemented from scratch, which is an enormous task. A second problem is that programming tools must currently incorporate a part of the language semantics in order to perform their job. Current tools transform the source code of a program to an abstract syntax tree, which contains only the data of the elements in the program, but not their semantics. As a result, the language semantics are duplicated in the tools, causing bugs, and tying the tools to a single programming language.

To address these problems, we created a new architecture for programming languages and programming tools. Central in this architecture is Chameleon, our framework for metamodels of programming languages. The different layers of the Chameleon framework offer an abstract view on different families of programming languages. The layers hide the details of concrete languages, which are implemented in language modules. A programming tool can then use the appropriate layer of Chameleon to perform its job independent of any specific language, and leave the language dependent work to the language modules. In case the layers do not offer sufficient functionality, an abstract tool extension must be created to offer this functionality. This extension is then implemented for every language for which the programming tool is to be used.

Acknowledgements


Contents

I Introduction

1 Introduction
  1.1 The Evolution of Programming Languages
  1.2 The Evolution of Programming Language Development
  1.3 Goal
  1.4 Overview
    1.4.1 Exception handling
    1.4.2 Composition of Abstract Data Types
    1.4.3 Programming Language Development

II Programming Language Constructs

2 Anchored Exception Declarations
  2.1 Introduction
  2.2 Copy & Paste for Exceptions
    2.2.1 Reduced Adaptability
    2.2.2 Loss of Context Information
  2.3 Anchored Exception Declarations
    2.3.1 Informal Semantics and Rules
    2.3.2 Example
    2.3.3 Type Parameters
  2.4 Case Study
  2.5 Formal Semantics and Rules
    2.5.1 Formal notation
    2.5.2 Semantics
    2.5.3 Exploiting Context Information
    2.5.4 Restrictions on Anchored Declarations
    2.5.5 Proof of Compile-time Safety
  2.6 Methodological Discussion
    2.6.1 Information Hiding
    2.6.2 Usefulness of Source Code Modifications
    2.6.3 Nominal or Structural Typing?
  2.7 Comparison with Type Anchors
  2.8 Translating Cappuccino to Java
  2.9 Related Work
  2.10 Conclusion

3 Composition of Abstract Data Types
  3.1 Introduction
  3.2 Requirements Analysis
    3.2.1 Requirements
    3.2.2 Existing Reuse Mechanisms
  3.3 The Component Relation
    3.3.1 Syntax
    3.3.2 General Semantics
    3.3.3 Renaming Parameters
    3.3.4 First-Class Component Relations
  3.4 The Subtyping Relation
    3.4.1 Syntax
    3.4.2 General Semantics
    3.4.3 Overriding and Merging Components
    3.4.4 Reducing Hierarchy Dependencies
  3.5 Evaluation
    3.5.1 Methodological Discussion
    3.5.2 Example
    3.5.3 Case Study
  3.6 Formal Semantics
    3.6.1 Method Lookup
    3.6.2 Expression Typing
    3.6.3 Reduction Rules
    3.6.4 Proof Of Type Soundness
  3.7 Related Work
  3.8 Conclusion

III Programming Language Development

4 Programming Language Development
  4.1 Introduction
  4.2 The Architecture of Chameleon
  4.3 Programming Tools
    4.3.1 Code Editor
    4.3.2 CASE Tool
    4.3.3 Other Tools
  4.4 Language Modules
    4.4.1 Input Module
    4.4.2 Language Constructs
    4.4.3 Language Semantics
    4.4.4 Output Module
  4.5 The Chameleon Framework
    4.5.1 Abstractions
    4.5.2 Metadata
    4.5.3 Language Semantics
  4.6 Related Work
    4.6.1 Extensible Compilers
    4.6.2 Meta Object Facility
    4.6.3 Extensible IDEs
    4.6.4 Mirrors
  4.7 Conclusion

IV Conclusion

5 Conclusions and Future Work
  5.1 Contributions
  5.2 Future Work
  5.3 Our Vision on Programming Language Development
    5.3.1 Conceptual Development Phase
    5.3.2 Theoretical Development Phase
    5.3.3 Technical Development Phase

Bibliography

List of Publications

V Appendices

A Proof of Compile-time Safety
  A.1 Notation
  A.2 Extension to the relation
  A.3 Sets of types
  A.4 Properties of Φ and Ω
  A.5 Properties of the ω Function
    A.5.1 Exception Declarations
    A.5.2 Exception Clauses
  A.6 Properties of the relation
  A.7 Overview of Dependencies
  A.8 The relation is transitive
    A.8.1 Absolute Exception Declarations
    A.8.2 Method expressions
    A.8.3 Anchored Exception Declarations
    A.8.4 Exception Clauses
  A.9 Φ is monotone
    A.9.1 Absolute Exception Declarations
    A.9.2 Anchored Exception Declarations
    A.9.3 Exception Clauses
  A.10 Ω is monotone
    A.10.1 Absolute Exception Declarations
    A.10.2 Method Expressions
    A.10.3 Anchored Exception Declarations
    A.10.4 Exception Clauses
  A.11 The Implementation Exception Clause is an Upper Bound
  A.12 Method Invocations Maintain Compatibility
  A.13 The relation implies the ω relation
    A.13.1 Absolute Exception Declarations
    A.13.2 Anchored Exception Declarations
    A.13.3 Exception Clauses
  A.14 Expansion Does Not Allow More Than the Exception Clause
  A.15 Compile-time safety

B Type System for the Component Relation
  B.1 Syntax
  B.2 Type Elaboration
  B.3 Subtyping and Subclassing
  B.4 Class Well-formedness
  B.5 Components
    B.5.1 Component Well-formedness
  B.6 Fields
    B.6.1 Field Lookup
    B.6.2 Field Well-formedness
  B.7 Methods
    B.7.1 Method Lookup
    B.7.2 Method Well-formedness
  B.8 Auxiliary functions
    B.8.1 Abstract
  B.9 Expression Typing
  B.10 Reduction Rules
  B.11 Proof of Type Soundness
    B.11.1 Subject Reduction
    B.11.2 Progress
    B.11.3 Type Soundness

List of Figures

1.1 Diamond inheritance.

2.1 Copy & paste for checked exceptions.
2.2 The lack of expressiveness of the exceptional interface of a method.
2.3 Problems with checked exceptions.
2.4 A grammar for anchored exception declarations.
2.5 Anchored exception declarations read like a sentence.
2.6 The example using anchored exception declarations.
2.7 Operations on sets of types.
2.8 List of symbols.
2.9 Definition of the expansion function.
2.10 The conformance relation.
2.11 Calculating the useful remainder of anchored exception declarations.
2.12 Compression of anchored exception declarations.
2.13 Calculation of the implementation exception clause.
2.14 Calculating the implementation exception clause.
2.15 Schematic proof of soundness.
2.16 Adding a new checked exception.
2.17 Generated Java code.

3.1 High-level design of an application.
3.2 The Java version of BankAccount.
3.3 Feature matrix for different code reuse mechanisms.
3.4 Feature matrix for different code reuse mechanisms, continued.
3.5 Grammar for component relations.
3.6 The component relations of BankAccount.
3.7 Using renaming parameters.
3.8 Priority of renaming.
3.9 First-class component relations.
3.10 Indirect Inheritance.
3.11 Using indirectly inherited features.
3.12 Selecting directly inherited features.
3.13 Using component references.
3.14 Low-level dependencies.
3.15 High-level dependencies.
3.16 Component parameters.
3.17 Using high-level dependency resolution.
3.18 Grammar for the subtyping relation.
3.19 Subtyping relations.
3.20 Overriding components.
3.21 Overriding components.
3.22 Implementation of the banking application of Figure 3.1.
3.23 Lines of Code.
3.24 Reduction Compared to Java.
3.25 Syntax.
3.26 Methods of a class.
3.27 Method lookup.
3.28 Method overriding.
3.29 Method equivalence.
3.30 Expression typing.
3.31 Computation rules.
3.32 Congruence rules.
3.33 The ComponentJ version of CTodoModular.
3.34 Our version of CTodoModular.
3.35 Full-blown relationships.

4.1 Traditional architecture for programming tools.
4.2 The architecture of Chameleon.
4.3 Overview of the Eclipse editor plugin.
4.4 Design of connection between the editor and the metamodel.
4.5 Example connection between the editor and the metamodel.
4.6 Code completion in our code editor.
4.7 Overview of an input module for Java.
4.8 Extension of an input module.
4.9 A support layer for reusing language constructs.
4.10 Layers of abstraction.
4.11 The top-level class of model elements in Chameleon.
4.12 Top-level structure of Eiffel, Java, and C#.
4.13 Strictly following the Java specification.
4.14 Mapping expressions to methods.
4.15 Attaching arbitrary metadata using tags.
4.16 Placement of the rules.
4.17 Trading semantical strength for flexibility.
4.18 Reification of crossreferences.
4.19 Placement of the lexical context objects.
4.20 Placement of the target context objects.
4.21 Performing a lookup.
4.22 Small-step evaluation.
4.23 Big-step evaluation.
4.24 The four-layered architecture of MOF.

A.1 Dependency graph for Theorems A.8.8, A.9.5, and A.10.6.
A.2 Definition of τ.
A.3 Schema for final compile-time safety proof.

B.1 Syntax.
B.2 Subtyping.
B.3 Subclassing.
B.4 Class Well-formedness.
B.5 Lookup of components.
B.6 Component parameters.
B.7 Component overriding.
B.8 Related component relations.
B.9 Component well-formedness.
B.10 Fields of a class.
B.11 Field lookup.
B.12 Field type lookup.
B.13 Field overriding.
B.14 Field equivalence.
B.15 Field well-formedness.
B.16 Methods of a class.
B.17 Method lookup.
B.18 Method type lookup.
B.19 Method body lookup.
B.20 Method overriding.
B.21 Method equivalence.
B.22 Method relations.
B.23 Method well-formedness.
B.24 The abstract judgement.
B.25 Expression typing.
B.26 Computation Rules.
B.27 Congruence Rules.

Part I

Introduction


Chapter 1

Introduction

Basic research is like shooting an arrow into the air and, where it lands, painting a target.

Homer Burton Adkins

1.1 The Evolution of Programming Languages

Ever since the early days of computing, the increasing need for software has triggered advances in programming languages. The more software is needed, the more programmers are needed, so the easier a programming language must be to use. In addition, the increasing size of software systems makes programs so complex that they become unmanageable. This means that there is still a need for better programming languages, even though a new language can never expand the range of possible programs beyond that of existing Turing complete languages. Better programming languages are needed to enable the creation of software that would otherwise be too complex to manage.

Many of these advances are made by creating programming constructs that make programs easier to understand, reduce the effort to create and maintain a program, or preferably both. Machine code, which was used in the early days of computer science, was replaced by assembly language, and subroutines were used to reuse code. A major breakthrough was achieved by Grace Hopper [Sam92], who was convinced that programming would be a lot easier if the programming language more closely resembled a natural language. With the rise of structured programming started by Dijkstra [Dij68], certain constructs, such as the famous goto statement, were removed from languages, and methodological rules were developed. While a minority of programmers could deal with the unlimited freedom programming languages gave them, the resulting spaghetti code proved disastrous for big projects with many developers. In the same spirit, Parnas introduced the notions of information hiding and modularity [Par02] to keep software systems manageable as they grow larger. Goodenough introduced the exception handling mechanism [Goo75] to prevent abuse of the normal return value to signal errors, and to prevent hardcoding of exception handlers. Dahl and Nygaard created the first object-oriented programming language by introducing classes of objects that inherit properties from other classes [DN02]. Abstract data types were invented by Liskov [LZ74], Guttag [Gut76], and others, to further increase modularity by imposing a separation between the specification and the implementation of a data type, and to focus the software development process on the data of the program instead of its functions. Abstract data types were more tightly integrated with object-oriented programming through behavioral subtyping by Liskov and Wing [LW93a], and Design by Contract by Meyer [Mey97].

It is important to realize that the usability of a programming language cannot be determined by just looking at the complexity of the language constructs themselves. If that were the case, untyped lambda calculus [Bar88] would be the ultimate programming language. A language must be judged by its influence on the entire programming process. A programmer must not only know the programming language itself, but also a part of its standard library, design patterns, methodological rules, and workarounds for the limitations of the language.

Strongly related to this is the methodology behind the language constructs. A construct that maps well to the natural way of thinking of a programmer will make that programmer more productive than a construct that can technically achieve the same, but maps badly to his way of thinking. The major metaphors of the programming paradigm are especially important. It is crucial that a programmer can easily reason in terms of these metaphors in order to work efficiently.

In object-oriented programming, objects, messages, classification hierarchies, and contracts are important metaphors. The software system is seen as a population of objects that can exchange messages. Each object belongs to a class, along with other objects that have similar properties, which are described by the type of that class. These classes are then placed in a specialization hierarchy. To hide the implementation details of a method, a contract is provided, which states what the method requires and what it promises to deliver.

For us, these metaphors play a major role in the success of object-oriented programming, not because of technical reasons, but because classification of objects is very natural for people [Ros88]. We have been doing it all our lives, as evidenced by my little niece Kato, who flawlessly detects and classifies anything that even remotely resembles a chair or a table, to climb on it in the most dangerous way theoretically possible. As such, the metaphor is a good tool for reasoning about software systems that can be used by many people.

1.2 The Evolution of Programming Language Development

The development process for programming languages has also evolved. As computers were initially programmed using machine code, the programming language was defined by the processing unit of the machine. Grace Hopper invented the compiler in the early 1950s, changing the way programming languages were created. They were no longer defined by the hardware, but by the software. John Backus created the Backus Normal Form notation (BNF) [BBG+60], first used for ALGOL, to define context-free grammars. Most grammars today are defined using a variant of BNF. Brooker and Morris created the first compiler-compiler [BM62], which is a program that generates source code for a compiler. Today, such programs are used mainly in the form of parser generators. A parser generator takes a grammar as input, and produces the source code for a parser for the language described by the input grammar. The abstract syntax tree created by the parser is used by the compiler to create machine code for a program. Efforts like Polyglot [NCM03], JastAdd [HM03], and JaCo [ZO01a] have simplified the construction of such compilers by structuring the compiler algorithms in such a way that they can be extended easily. These efforts have significantly reduced the amount of work required to develop a new programming language.

Successfully developing a programming language, however, has not become easier. Today, the development of a programming language is not yet done when the language specification and a compiler are written. Most programmers are used to integrated development environments that offer advanced features such as refactorings, hyperlinks, class browsers, outline views, error reports, and quick fixes. In addition, programmers are used to having a large standard library at their disposal. This means that the developer of a new programming language must make an enormous effort to attract users.

1.3 Goal

A first goal of this thesis is to further increase the reusability of object-oriented programs. A line of code that can be reused need not be written, cannot introduce new bugs, and does not make the system harder to understand.

The methodological principles that have been developed over the years, many of which are mentioned above, play an important role in the language constructs we have created. Every language construct in this thesis was carefully designed to respect these principles. The language constructs must preferably represent a metaphor that is easily understood by regular programmers. A language that can only be used by a few people is of little use to the mainstream software industry.

But methodological soundness alone is not enough. A language construct must also be effective. Given the focus on reuse, this means that the constructs developed in this thesis must allow as much reuse as possible with as little effort as possible. It is not acceptable if a programmer must resort to writing low-level workarounds if there is no good reason why a language construct should not do the job for him. In addition, the constructs must be able to deal with unanticipated extensions and reuse. History has shown time and time again that software is reused in unanticipated ways, so the programming constructs must be able to cope with almost any kind of reuse that makes sense.

A second goal of this thesis is to increase the reusability of programming languages and programming tools. It must be possible to write programming tools that are reusable across programming languages without requiring any modifications.

1.4 Overview

This section gives an overview of the topics presented in the remainder of this thesis, and motivates our approach.

1.4.1 Exception handling

The first part of this thesis concerns exception handling. The exception handling mechanism was invented by Goodenough in 1975 [Goo75]. A method can signal the occurrence of an error by throwing an exception to the caller. The exception proceeds up the call chain until it is handled. One reason for introducing exception handling is to prevent the abuse of the normal return value to signal errors, which is confusing and not always possible. Another reason is to increase the reusability of the software. By removing the specific exception handlers from a piece of code, it can be reused when the error must be dealt with in a different way.

Exceptions can be divided into two categories: checked exceptions and unchecked exceptions. A checked exception must always be listed in the signature of the method if it can be signalled; an unchecked exception need not be listed. The advantage of checked exceptions is the increase in the robustness of the program. Because the programmer is forced to explicitly handle or propagate the exception, it is unlikely that unanticipated exceptions occur at run-time. With unchecked exceptions, it is more likely that an exception handler is missing, and that the faulty software gets through the testing phase. The disadvantage of checked exceptions is that they require a lot of work to write the exceptional specification of all methods. Here, unchecked exceptions have the upper hand, as they do not require any additional work.
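The distinction can be made concrete with a minimal Java sketch. The class and method names (CheckedVsUnchecked, readConfig, parsePort, describe) are hypothetical and serve only as illustration; the exception types are standard Java ones.

```java
import java.io.IOException;

class CheckedVsUnchecked {
    // Checked: IOException must be listed in the throws clause or handled in the body.
    static String readConfig(boolean available) throws IOException {
        if (!available) {
            throw new IOException("config not available");
        }
        return "config";
    }

    // Unchecked: IllegalArgumentException is a RuntimeException and need not be listed.
    static int parsePort(String text) {
        int port = Integer.parseInt(text);
        if (port < 0) {
            throw new IllegalArgumentException("negative port: " + port);
        }
        return port;
    }

    // The compiler forces this caller to handle (or propagate) the checked exception.
    static String describe(boolean available) {
        try {
            return readConfig(available);
        } catch (IOException e) {
            return "fallback";
        }
    }
}
```

Removing the try block from describe makes the class fail to compile, whereas a caller of parsePort compiles without any handler and may only discover the missing handler at run-time.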

So far, it seems like a reasonable trade-off: more robustness requires more work. But the balance in this trade-off is skewed in favor of unchecked exceptions, because checked exceptions also require a lot of work that is basically useless. If the exceptional specification of a method is modified, it is necessary to modify other methods along the call chain, even if these methods have no influence on the handling of the exceptions. In addition, it is often necessary to handle an exception even if it cannot be thrown.

Goodenough originally did not make this distinction, and only used checked exceptions. The lack of flexibility of checked exceptions, however, gave rise to the introduction of unchecked exceptions. The issue has been heavily debated ever since, but in those 30 years, nothing has been done to actually improve checked exceptions.

In this thesis, we show that the copy & paste mechanism for specifying the exceptional behavior of a method does not fit in modern programming languages. Exceptions can only be declared in an absolute manner, while the implementation and the normal specification can be written relative to other methods. We solve this problem by introducing anchored exception declarations, which allow the exceptional behavior of a method to be specified relative to other methods. As such, they can automatically propagate changes in the exceptional behavior of a program across methods that are not involved in the exception handling. In addition, the relative nature allows call-site information to be exploited to reduce the set of exceptions that can be signalled by a method call. In summary, a method is allowed to blame another method if something goes wrong, which is definitely a natural metaphor for humans.

This work is published in a paper at the OOPSLA 2005 conference under the title “Combining the robustness of checked exceptions with the flexibility of unchecked exceptions using anchored exception declarations” [vDS05a].

1.4.2 Composition of Abstract Data Types

The second part of this thesis introduces composition of abstract data types. The inheritance mechanism is the most central part of an object-oriented programming language. It has a profound influence on almost every other feature of the language. It has a major influence on which designs can be written using the constructs of the language as they were intended, and which designs must be written using low-level workarounds. Currently, the three major kinds of inheritance mechanisms are single inheritance, linearized inheritance, and multiple inheritance¹.

With single inheritance, each class has a single superclass, except for the top-level class. As a result, a class cannot reuse code from two unrelated classes, even if that makes perfect sense. In case of a statically typed language, interfaces are usually introduced to allow multiple inheritance in the type hierarchy. They can declare method signatures, but cannot provide implementations. It is often claimed that single inheritance is simpler than multiple inheritance, but we strongly disagree. In both cases, the type hierarchy is equally simple or complex, since the type hierarchy of a single inheritance language also uses multiple inheritance. In both cases, name conflicts can occur. The major difference is that with multiple inheritance, you can write all methods in the class where they belong, while you must sometimes duplicate them if you can only use single inheritance. In addition, it can be quite a struggle to determine which class to inherit from, to minimize the amount of code duplication.

¹ The term “multiple inheritance” refers to a multiple inheritance mechanism where there is no order among the super classes.

Linearized inheritance mechanisms allow multiple inheritance, but they linearize the inheritance hierarchy. As such, repeated inheritance is impossible. The purpose of this approach is to solve name conflicts automatically. By linearizing the hierarchy, a total order is created among methods, which allows conflicts to be resolved automatically if the signatures of the participating methods are compatible. If they are not compatible, a compile error is generated. The approach has a number of serious methodological problems. First, it is very difficult for a programmer to determine the order of the classes in the linearization, which is required to understand which methods will be invoked. Some of these linearizations are simply mind-boggling. Second, with some mechanisms, the relative order of two classes in the linearization can change throughout the hierarchy. Such mechanisms are not monotone. Third, the automatic conflict resolution allows a method to override a completely unrelated method. As a result, the clients of the latter will invoke the wrong method. The lack of a sensible conflict resolution mechanism and the difficulty of understanding linearized hierarchies make linearized inheritance mechanisms unsuitable for the creation of large software systems by regular development teams.

With multiple inheritance, a class can inherit from multiple classes and there is no order among the superclasses. As a result, name conflicts are either forbidden, or must be resolved manually by renaming the involved methods and fields. Duplication of methods within the class hierarchy is not needed because they can always be reused.

Repeated inheritance occurs when a class inherits from a class via multiple paths in the inheritance hierarchy. Two examples are shown in Figure 1.1. In the left example, class BankAccount inherits twice from class Association via two different paths to incorporate associations with a bank and a bank card. Inheritance is used here for code reuse. In the right example, class AmphibiousCar inherits twice from class Vehicle. In this case, inheritance is used for subtyping. The most important decision in case of possible repeated inheritance is whether the inherited methods and variables are inherited once, or multiple times. In Cecil [Cha92] and Diesel [Cha98], they can only be inherited once, while in Eiffel [oTC05] they can optionally be inherited more than once.

It is often argued that repeated inheritance is confusing and causes problems if the inheritance hierarchy is modified [Sny86], [Bra92]. The problem is that in these approaches, “the” inheritance relation of a language is used for pure code inheritance. But that same inheritance mechanism is also used for building classification hierarchies, which in some situations conflicts with code reuse. For example, it is argued that it should be allowed to change class B not to inherit from A, but from E if it provides the required methods and fields. That is reasonable if inheritance is used purely for code reuse. But it is not reasonable to expect a program still to compile and work correctly if for example a Car is no longer a Vehicle. Similarly, duplication of features for code reuse is not a problem, but it is strange for subtyping, as only one version can be used through polymorphism. The arguments are based on the assumption that there is only one inheritance relation; an assumption that no longer holds.

Figure 1.1: Diamond inheritance.

It has already been argued that a single inheritance relation is not sufficient. In existing approaches such as traits [SDNB03] and multiple inheritance in Cecil, Eiffel, and SmartEiffel [CRA+05], there are separate inheritance relations for inheritance with subtyping, and pure code inheritance. But aside from the lack of a subtyping relation in case of code inheritance, both relations in these approaches are nearly identical. The code reuse that is envisioned in these approaches is the reuse of application specific methods that can be shared between classes that are not related through a type relation.

But there is a more important kind of reuse that can be achieved with a separate code inheritance relation. Classes often contain general purpose characteristics like associations, constrained values, lockable values, event listener infrastructure, and so on. They are part of the abstract data type formed by the class, and their methods are given names that are meaningful in the context of that abstract data type. All these characteristics are high-level concepts that are being coded over and over again. If these concepts can be captured in a class, and used conveniently as components to build other classes, a lot more code can be reused. And while the inheritance mechanisms of Eiffel and SmartEiffel can technically be used to create and reuse such components, it requires a lot of low-level code because they are not tailored for this kind of reuse.


There are existing approaches that focus on the composition of components, such as Scala, CaesarJ, and ComponentJ, but they cannot be used in general to create a composition that results in an abstract data type. In general, they can only create shallow compositions, where the combination as such is just a container for the components. In Scala and CaesarJ, an abstract data type can be created if no renaming or conflict resolution must be performed. In ComponentJ, the composition of an abstract data type is simply impossible.

A requirements analysis shows that the abstractions offered by existing approaches are not powerful enough to use classes as components for other classes. In this thesis we introduce a multiple inheritance mechanism with two relations, each focused on a different metaphor: one for classification and one for composition. They are each fine-tuned to perform their task as well and as efficiently as possible. The first relation uses the standard combination of subtyping and code inheritance and is used to build a classification hierarchy. The second relation – the component relation – is tailored for building a class using other classes as components. It contains new features such as renaming parameters, first-class inheritance relations, component parameters, and indirect inheritance to deal with the components of a class on a higher level of abstraction. A case study shows that this inheritance mechanism can significantly decrease the size of an application. It reduced the application size by 21% to 36%, whereas other inheritance mechanisms obtained 3% to 12%, and the delegation pattern 11% to 17%.

This work is published in a paper at the ECOOP 2007 conference under the title “A Higher Abstraction Level using First-Class Inheritance Relations”.

1.4.3 Programming Language Development

In the third part of this thesis, we present a framework to improve the development of programming tools and programming languages.

Programming tools, such as compilers, code formatters, and refactoring tools typically operate on an abstract syntax tree (AST) of the program. First, a lexer transforms the source code of the program into token streams. Then, a parser adds structure to these streams by turning them into ASTs. Because ASTs are little more than a structural representation of the source code of a program, the semantics of the programming language is incorporated into the programming tools, causing several problems.
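The pipeline described above can be sketched for a toy language of integer additions. Everything here (TinyFrontEnd, Node, Num, Add, lex, parse, evaluate) is a hypothetical illustration and is not part of any tool discussed in this thesis. Note how evaluate, standing in for a tool, has to encode the meaning of the tree itself.

```java
import java.util.ArrayList;
import java.util.List;

class TinyFrontEnd {
    // Hypothetical AST: a node is either an integer literal or an addition.
    static abstract class Node {}
    static final class Num extends Node {
        final int value;
        Num(int value) { this.value = value; }
    }
    static final class Add extends Node {
        final Node left, right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Lexer: turns the source text into a flat stream of tokens.
    static List<String> lex(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                tokens.add(source.substring(start, i));
            } else if (c == '+') {
                tokens.add("+");
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }

    // Parser: adds structure by building a left-associative tree for "n + n + ...".
    static Node parse(List<String> tokens) {
        if (tokens.isEmpty()) throw new IllegalArgumentException("empty input");
        Node result = new Num(Integer.parseInt(tokens.get(0)));
        int i = 1;
        while (i < tokens.size()) {
            if (!tokens.get(i).equals("+") || i + 1 >= tokens.size()) {
                throw new IllegalArgumentException("expected '+ number' at token " + i);
            }
            result = new Add(result, new Num(Integer.parseInt(tokens.get(i + 1))));
            i += 2;
        }
        return result;
    }

    // A tool working on the bare AST must supply the semantics itself (here: evaluation).
    static int evaluate(Node node) {
        if (node instanceof Num) return ((Num) node).value;
        Add add = (Add) node;
        return evaluate(add.left) + evaluate(add.right);
    }
}
```

A second tool for the same toy language, for example a pretty printer that simplifies constant sums, would have to repeat the knowledge that Add means integer addition, which is the duplication discussed next.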

First, this approach leads to large-scale duplication of the language semantics in the different tools. Every programming tool duplicates at least part of the semantics of a language to perform its task. As a result, these tools often contain bugs because the semantics are implemented from scratch, and it takes a lot of work to create a programming tool.

Second, programming tools are tied to a fixed number of languages – usually one language. Because the language semantics must be duplicated, tools are usually written for a specific programming language, even if their task is not dependent on the specific language. For example, a refactoring that renames a class, method, or variable depends on the used programming language because it must know if the new name is valid, and which cross-references refer to the renamed element in order to update them as well.

A metamodel based approach such as MOF [OMG06] alleviates the duplication problem by incorporating a part of the language semantics in the data model. But since MOF tools generate limited verification code based on the specifications in the language model, the metamodel implementations can only partially validate a model. In addition, the language defined by the M3 level of MOF, which is used to define metamodels, is not practical for creating frameworks of metamodels because its inheritance mechanism does not allow methods to be overridden. As a result, level M3 is the only common denominator, and thus MOF tools can only perform basic operations on programs.

Based on our experience with the language constructs developed in this thesis, we developed the Chameleon framework for metamodels of programming languages. It has a concrete implementation for object-oriented programming languages, but it can be extended to support any family of programming languages. To satisfy the needs of different kinds of tools, the framework provides different layers of abstraction. The top layer contains elements that are present in every programming language, while more specific layers focus on a particular family of languages. As a result, programming tools can be written for any language, such as an advanced code editor, or for a particular family of programming languages, such as a graphical editor for class diagrams.

While Chameleon is still work in progress, and will likely never be complete due to the ever changing variety of programming languages and constructs, it shows that the creation of language independent programming tools is possible. There are concrete language modules for Java 1.2, C# 1.0, and Cappuccino, which extends Java with anchored exception declarations, and a language module for Ruby is under construction. The most mature tool we have developed is a language independent code editor that offers advanced functionality like syntax highlighting, code folding, hyperlinks, an outline view, detection of syntax errors, and code completion.


Part II

Programming Language Constructs


Chapter 2

Anchored Exception Declarations

The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka!” but rather “hmm....that’s funny...”

Isaac Asimov

2.1 Introduction

The common way of dealing with exceptional conditions in object-oriented software is the use of an exception handling mechanism. When an exceptional condition is detected by a component, it raises an exception and signals it to the caller. The caller can then handle the exception in a context dependent manner. This way, the reusability of a component is improved by removing the specific logic for handling the abnormal condition from that component. In addition, exception handling mechanisms force a separation of normal code and exception handling code, resulting in programs that are easier to understand.

Exceptions can be divided into two categories: checked exceptions and unchecked exceptions. Checked exceptions must be propagated explicitly by listing them in the method header, while unchecked exceptions are propagated implicitly. Note that this is different from the categorization caught-uncaught, which denotes whether an exception can exit a method body.

Checked exceptions improve the robustness of software [Goo75, MT97, MR01]. Because every checked exception that can be signalled during the execution of a method must either be listed in the exception clause – the throws clause in Java – of that method or handled in its body, it is impossible to encounter an unanticipated checked exception at run-time. The programmer is forced to make a decision for every checked exception, so he can be reasonably sure that all checked exceptions are handled properly.

As the software evolves, checked exceptions may need to be added to or removed from existing methods [MR01, MT97]. The compiler will reject all methods that can encounter newly added checked exceptions, but do not deal with them. Outdated exception handlers for checked exceptions that cannot be signalled anymore will also be rejected, keeping the source code clean. Consequently, all affected methods must be modified manually by the programmer.

Unfortunately, checked exceptions also cause problems. First of all, they decrease the adaptability of software. Modifying the exception clause of a method can trigger modifications along every call chain encountering that method. Another problem is the lack of context information available to the exception handling mechanism. Often, a programmer knows that a certain checked exception cannot be signalled by a particular method invocation, but the exception handling mechanism does not. This leads to the addition of dummy exception handlers that are inconvenient and dangerous for the evolution of the program.

In this chapter, we track down the problems with checked exceptions to a lack of expressiveness of the exception clause. We then add the required expressiveness by introducing anchored exception declarations to provide a relative means to declare the exceptional behavior of a method besides traditional absolute declarations. Anchored exception declarations will be presented as an extension to ClassicJava [FKF98]. The mechanism itself, however, is not specific to the Java programming language, nor to any particular exception handling mechanism.

Overview

In Section 2.2, we present the root of the problems with checked exceptions. We introduce anchored exception declarations in Section 2.3. In Section 2.6.1, we show that anchored exception declarations do not violate the principle of information hiding when used properly. In Section 2.6.2, we discuss which modifications of source code are good, and which are not. We compare anchored exception declarations with Eiffel type anchors in Section 2.7, and present a case study in Section 2.4. In Section 2.5, we present the formal semantics, along with the necessary rules to ensure compile-time safety, followed by an overview of the proof of soundness. The translation to Java is presented in Section 2.8. We discuss related work in Section 2.9, and we conclude in Section 2.10.

2.2 Copy & Paste for Exceptions

The root of the problems with checked exceptions is the lack of expressiveness of the exception clause of a method. Alexander Romanovsky and Bo Sanden [RS01] argue that “exception handling mechanisms should correspond to the features¹ the language provides”. But this is clearly not the case for the exception clause of a method. A method can delegate normal behavior and the detection and raising of exceptions to another method. If the latter is modified, the former automatically reflects the changes. Most design patterns are based on this principle [GHJV95]. But this is not so for the specification of the exceptional behavior. Current exception clauses dictate a copy & paste programming style for the exceptional interface of a method.

    public interface Strategy {
        public Result compute() throws Exception;
    }

    /*@ \result == ...strategy.compute()... @*/
    public Result template(Strategy strategy) throws Exception {
        ...
        ...strategy.compute()...;
        ...
    }

    public class MyStrategy implements Strategy {
        public Result compute() throws MyException {...}
    }

Figure 2.1: Copy & paste for checked exceptions.

Figure 2.1 shows the interface of a strategy pattern [GHJV95] with a single method that declares that it can signal any exception to maximize reusability. The template method uses a Strategy object to perform a more complex computation. The application contains a specific implementation of the strategy, MyStrategy, that can signal only the checked exception MyException. Both the specification, written in JML [LBR00], and the implementation of template delegate a part of their work to the strategy object. The exception clause of template, however, cannot do this; it must copy the exception clause of compute and thus declare that it can always signal Exception.
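A compilable variant of this example is sketched below in plain Java. It is an illustration only: the result type is simplified to int, and the names CopyPasteDemo, direct, and viaTemplate are hypothetical additions that do not appear in the figure.

```java
class CopyPasteDemo {
    static class MyException extends Exception {}

    interface Strategy {
        int compute() throws Exception;
    }

    static class MyStrategy implements Strategy {
        // An overriding method may narrow the exception clause to MyException.
        @Override
        public int compute() throws MyException {
            return 21;
        }
    }

    // The exception clause cannot refer to strategy.compute(); it has to repeat
    // the clause of Strategy.compute(), that is, Exception.
    static int template(Strategy strategy) throws Exception {
        return 2 * strategy.compute();
    }

    // Calling compute() directly on a MyStrategy: only MyException must be handled.
    static int direct(MyStrategy strategy) {
        try {
            return strategy.compute();
        } catch (MyException e) {
            return -1;
        }
    }

    // Going through template with the same object: the compiler insists on Exception.
    static int viaTemplate(MyStrategy strategy) {
        try {
            return template(strategy);
        } catch (Exception e) {
            return -1;
        }
    }
}
```

Narrowing the handler in viaTemplate to catch only MyException is rejected by the Java compiler, although no other checked exception can reach it.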

Figure 2.2 gives an abstract illustration of the problem. Figure 2.2a illustrates the situation for the normal behavior. The implementation and the specification of the delegator method delegate a part of their work to the implementation and specification of the legate method. The composition is made by a client, who thus knows the specific delegator and legate. The delegations are represented by the solid arrows in the figure. As a result of this delegation, the behavior and specification of the legate method are incorporated into the delegator method. This effect is represented by the dashed arrows. Because the delegation in the specification is visible to the client of the delegator, he can incorporate the specific behavior of the specific legate into the contract of the delegator method. The delegation in the implementation ensures that the client is allowed to do this, although the compiler will not verify it – that requires a program verifier. Figure 2.2b illustrates the same for the exceptional behavior of both methods. The delegator signals E1 directly and propagates E2, which is signalled by the legate. The implementation will also incorporate the specific exceptional behavior of the specific legate method. In this case, however, the delegation is not visible to the client because the delegator specifies that E2 can be signalled at any time. Neither the client nor the compiler can be sure that E2 is signalled only by the legate. Because the compiler does verify the exceptional specification of a method, it forces the delegator to specify E2, and all clients of the delegator to handle it.

Figure 2.2: The lack of expressiveness of the exceptional interface of a method. (a) Behavior and normal specification. (b) Exceptional specification.

¹ This refers to the language concepts, not the features of a class.

2.2.1 Reduced Adaptability

The best-known problem with checked exceptions is a decrease of the adaptability of software [Goo75, MR01, Don90]. Adding new exceptions to and removing exceptions from a program is a natural consequence of software evolution, either caused by the addition of functionality [MR01, Don90], or by the difficulties in predicting all exceptional conditions in advance [RM00]. Such changes, however, cause a ripple effect along every call chain involving the modified method. Every method in such a call chain that propagates the new exception must also be modified.

In his paper introducing exception handling [Goo75], Goodenough already mentions this effect, and argues that it is “not entirely wasted effort”. Indeed, it will reveal all methods requiring modification for dealing with the new exception, thus increasing robustness as argued in the previous section. But while not entirely wasted, the effort mostly is.

Usually, methods that do not handle exceptions, but only propagate them, will also propagate the newly added exception, as illustrated by our case study in Section 2.4. Such methods provide a certain functionality, but are unable to handle exceptions. So their behavior has not really changed by adding a new checked exception to their exception clause; they still do the same work and report all failures. While changes in the implementation and the postconditions of a method are propagated automatically, changes in the exception clause of a method must be propagated manually. In Section 2.6.2, we will discuss which code modifications are good and which are gratuitous.

    public Result client(MyStrategy myStrategy) throws MyException {
        try {
            ...
            ...template(myStrategy)...;
            ...
        } catch(RuntimeException exc) {
            throw exc;
        } catch(MyException exc) {
            throw exc;
        } catch(Exception exc) {
            throw new Error();
        }
    }

Figure 2.3: Problems with checked exceptions.

As a result of the increased cost, programmers often switch to unchecked exceptions [RM03], leaving room for unanticipated exceptions at run-time. This happens to such an extent that programming languages such as C# [Hej] and Eiffel completely omit exception clauses, or, like C++, do not enforce safety at compile-time.

The example in Figure 2.3 illustrates the consequences of adding a checked exception to existing software; it does not show the need for evolution, which is presented in [MR01, MT97]. The figure shows a client of the template method of Figure 2.1. The client method takes a specific strategy as an argument, passes it to the template method somewhere in its body, and propagates all exceptions.

Suppose now that in a later version of the application, the specific strategy should signal an additional checked exception YourException. Not only do we have to modify the compute method of MyStrategy, but we must also modify the client method although its behavior has not changed. As we illustrate in our case study in Section 2.4, this can add up to a considerable amount of work.


Note that the use of abstract exceptions alleviates the problem, but it is generally considered bad practice to handle abstract exceptions because the compiler will not warn the programmer if he forgets to handle YourException. The abstract exception conflicts with the need for complete exception specification [MT97]. The use of Exception in the Strategy interface is an example of this, as shown in the next section.

2.2.2 Loss of Context Information

A programmer often knows that certain checked exceptions cannot be signalled by a method invocation when delegation is used. If he knows the type of the concrete legate that will be used by the delegator, he can eliminate certain checked exceptions based on the exception clause of the legate. The exception handling mechanism, however, cannot make the same deduction because the delegator hides the exception clause of the legate.

Consider again the example in Figure 2.3. Even though the programmer knows that the concrete strategy signals only MyException, he must provide an exception handler for Exception. He cannot use the context information about the concrete strategy to exclude Exception. If MyException is handled, only the handlers for RuntimeException and Exception are useless. But if MyException is propagated as in the example, this is a very long way to write template(myStrategy).

As the application evolves, the inconvenient situation turns into a dangerous one. Suppose, as before, that MyStrategy now signals YourException. The assumption under which the dummy exception handler for Exception was valid, no longer holds: MyException is no longer the only checked exception that can be signalled. But unless the entire program is manually verified, such invalid exception handlers will not be detected, and exceptions will disappear at run-time. The problem can be alleviated by raising an unchecked exception in the dummy exception handler, but this exception can be detected only at run-time, and can thus slip unnoticed through the testing phase.
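The scenario can be reproduced in plain Java with the following sketch. It assumes a simplified int result type, and the fail flag on MyStrategy is a hypothetical device to trigger the new exception; the client method keeps the handlers of Figure 2.3, except that the Error carries a message and a cause for inspection.

```java
class DummyHandlerDemo {
    static class MyException extends Exception {}
    static class YourException extends Exception {}

    interface Strategy {
        int compute() throws Exception;
    }

    // Later version of the strategy: compute() now also signals YourException.
    static class MyStrategy implements Strategy {
        private final boolean fail;
        MyStrategy(boolean fail) { this.fail = fail; }

        @Override
        public int compute() throws MyException, YourException {
            if (fail) throw new YourException();
            return 21;
        }
    }

    static int template(Strategy strategy) throws Exception {
        return 2 * strategy.compute();
    }

    // Written when MyStrategy could signal only MyException; it still compiles unchanged.
    static int client(MyStrategy myStrategy) throws MyException {
        try {
            return template(myStrategy);
        } catch (RuntimeException exc) {
            throw exc;
        } catch (MyException exc) {
            throw exc;
        } catch (Exception exc) {
            // Dummy handler: believed unreachable under the old assumption.
            throw new Error("unexpected checked exception", exc);
        }
    }
}
```

The compiler accepts client without any warning after the change to MyStrategy. The new YourException only shows up as an Error when a failing strategy is actually exercised, which is the run-time detection described above.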

2.3 Anchored Exception Declarations

Eiffel has a concept called type anchoring [Mey97], to declare the type of an entity relative to the type of another entity, the anchor, within its scope. If the type of the anchor is changed, the type of the other entity automatically follows the change.

In this section, we use the anchoring technique to make exception clauses more expressive. We extend the exception clause of a method to specify not only what exceptions can be signalled, but also where they originate from.

The addition of a new concept for specifying the exceptional behavior of a method requires an extension of the terminology. An exception clause will no longer be a list of exception types, but a list of exception declarations. Each exception declaration declares that certain checked exceptions can be signalled under certain circumstances. The exception types in traditional exception clauses will be called absolute exception declarations from now on. They declare that a certain type of exceptions can always be signalled.

    ExceptionClause:
        throws ExceptionDeclaration ( , ExceptionDeclaration)*

    ExceptionDeclaration:
        AbsoluteExceptionDeclaration
        AnchoredExceptionDeclaration

    AbsoluteExceptionDeclaration:
        Identifier

    AnchoredExceptionDeclaration:
        like MethodExpression [FilterClause]

    FilterClause:
        (propagating ( ExceptionList ))?
        (blocking ( ExceptionList ))?

    ExceptionList:
        Identifier ( , ExceptionList)*

    MethodExpression:
        MethodInvocation allowing type names as expressions

Figure 2.4: A grammar for anchored exception declarations.

Instead of always declaring signalled exceptions in an absolute manner, a programmer can also declare them relative to another method using anchored exception declarations. An anchored exception declaration automatically reflects changes in the exception clause of its anchor.

An anchored exception declaration consists of the keyword like, followed by a method expression and optionally a filter clause, as shown by the grammar in Figure 2.4. The method expression determines to which method the anchored exception declaration is anchored, and thus the set of exceptions that can be signalled as a result of that exception declaration. The filter clause can narrow this set by allowing only a fixed set of exceptions to be propagated using a propagating filter, or by allowing everything to be propagated except for a fixed set of exceptions using a blocking filter.

A method expression can be any method invocation that is valid in the context of the method header, including the formal parameters of the method. On top of that, type names can be used as expressions because some subexpressions of the method invocation may not always be visible outside the method body, or the programmer may want to hide them. The type name avoids ambiguities in the presence of syntactic overloading [Mey01].


void f() throws like g(), like h(x);

void f(A a) throws like a.g() propagating (E1, E2);

void f() throws like b().g(x) blocking (E1, E2);

void f() throws like A.g(X) propagating (E0) blocking (E1, E2);

Figure 2.5: Anchored exception declarations read like a sentence.

The filter clause allows the developer to propagate only a limited set of exceptions using the propagating keyword, to propagate everything except for a set of exceptions using the blocking keyword, or a combination of both. The default filter clause – no filter clause – allows all exceptions of the anchor to be propagated.

Figure 2.5 shows a few anchored exception declarations. The syntax is chosen such that an anchored exception declaration reads like a sentence. The syntax may seem verbose, but in practice most anchored exception declarations will not have a filter clause. Also, the anchored exception declaration replaces one or more absolute declarations, and eliminates most of the dummy exception handlers like those in Figure 2.3.

2.3.1 Informal Semantics and Rules

We now informally describe the semantics of anchored exception declarations. An anchored exception declaration declares that the parent method can signal any checked exception that can be signalled by the method it references. Checked exceptions that are blocked by the blocking clause, or not propagated by the propagating clause, cannot be signalled according to that anchored declaration. The meaning of an anchored exception declaration depends on the context in which the parent method is invoked. Because some elements of an anchored declaration, such as references to formal parameters, can have a more specific type for a given method invocation, it can reference different methods for different call-sites. By inserting more specific type information, the anchored declaration may select a more specific method than the one it originally referenced. As a result, the set of signalled checked exceptions may be reduced. Unchecked exceptions are ignored.

Call-site information is inserted by substituting this and the formal parameters by the target and the actual arguments of the actual method invocation t.m(args) in the referenced exception clause. As explained above, the resulting exception clause (EC1) may allow fewer exceptions than the exception clause of the invoked method (m). In order to compute the set of checked exceptions that can be signalled by t.m(args), we repeat this process for every anchored exception declaration in the computed exception clause (EC1). We insert its target and actual arguments into the exception clause (EC2) of the referenced method, which is possibly a more specific method than the one referenced by the original anchored declaration. After that, we apply the filter clause of the anchored declaration (FC1) to EC2, by removing filtered absolute declarations and merging FC1 with the filter clauses of the anchored exception declarations of EC2. Finally, we replace the anchored declaration in EC1 with EC2. These steps must be repeated until only absolute declarations – exception types – are left. They form the set of checked exceptions that can be signalled by the method invocation t.m(args). This process, called recursive expansion, will be illustrated in the example in Section 2.3.2. A single step in the process is called expansion. To ensure that the recursive expansion will end, the analysis stops if a method has already been processed before with a target and arguments of the same type.
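The repeat-until-absolute process described above can be sketched in plain Java. This is only a minimal model under strong simplifying assumptions, not the analysis of the thesis: methods are identified by hypothetical string keys for which the call-site substitution is assumed to have been done already, exception types are plain names without subtyping, and the class ExpansionSketch with its Decl helper is invented for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model of recursive expansion; not the thesis implementation. */
final class ExpansionSketch {

    /** An absolute declaration (type set) or an anchored one (anchor set). */
    static final class Decl {
        final String type;             // exception type name, or null for an anchor
        final String anchor;           // key of the referenced method, or null
        final Set<String> propagating; // null stands for "propagate everything"
        final Set<String> blocking;

        private Decl(String type, String anchor, Set<String> propagating, Set<String> blocking) {
            this.type = type;
            this.anchor = anchor;
            this.propagating = propagating;
            this.blocking = blocking;
        }

        static Decl absolute(String type) {
            return new Decl(type, null, null, Set.of());
        }

        static Decl like(String method) {
            return new Decl(null, method, null, Set.of());
        }

        static Decl likeBlocking(String method, Set<String> blocked) {
            return new Decl(null, method, null, blocked);
        }

        /** True if the filter clause of this declaration lets the exception through. */
        boolean lets(String exception) {
            return (propagating == null || propagating.contains(exception))
                    && !blocking.contains(exception);
        }
    }

    /** Expands the clause of the given method until only exception types remain. */
    static Set<String> expand(Map<String, List<Decl>> clauses, String method) {
        return expand(clauses, Decl.like(method), Set.of());
    }

    private static Set<String> expand(Map<String, List<Decl>> clauses, Decl decl, Set<String> trace) {
        Set<String> result = new TreeSet<>();
        if (decl.type != null) {           // absolute declaration: nothing to expand
            result.add(decl.type);
            return result;
        }
        if (trace.contains(decl.anchor)) { // already processed on this path: stop
            return result;
        }
        Set<String> longerTrace = new HashSet<>(trace);
        longerTrace.add(decl.anchor);
        for (Decl inner : clauses.getOrDefault(decl.anchor, List.of())) {
            for (String exception : expand(clauses, inner, longerTrace)) {
                if (decl.lets(exception)) { // apply the filter of the anchor being expanded
                    result.add(exception);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Decl>> clauses = Map.of(
                "client", List.of(Decl.like("template")),
                "template", List.of(Decl.like("MyStrategy.compute")),
                "MyStrategy.compute", List.of(Decl.absolute("MyException")),
                "ping", List.of(Decl.absolute("E1"), Decl.like("pong")),
                "pong", List.of(Decl.likeBlocking("ping", Set.of("E1")), Decl.absolute("E2")));
        System.out.println(expand(clauses, "client")); // [MyException]
        System.out.println(expand(clauses, "ping"));   // [E1, E2]
    }
}
```

The trace is copied per path rather than shared, so a method that was expanded under one filter can still be expanded again under a different filter on a sibling branch; the mutually recursive ping and pong entries show that the stopping rule ends the expansion.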

To ensure compile-time safety, we must add an additional rule to determine whether an exception clause is stronger than another one. We explain this rule informally in the next paragraph; the details are presented in Section 2.5.4. This rule must be applied to check whether the exception clause of a method conforms to the exception clauses of the methods it overrides, and to check whether the implementation of a method conforms to its exception clause. For the latter check, we will derive an exception clause from the implementation representing its worst-case behavior. This rule will be illustrated in the example in Section 2.3.2.

We now describe which differences are allowed between ECa and ECb such that ECa still conforms to ECb. First of all, exception declarations may be removed. Second, an absolute exception declaration – a checked exception type – may be replaced by one or more of its subtypes. Third, an anchored exception declaration anchorb may be replaced by one or more anchored exception declarations that will always select the same method as anchorb, or a method that overrides it. Finally, one or more declarations of ECb may be replaced by an anchored exception declaration anchora if the expansion of anchora conforms to ECb. The latter change allows the delegation of part of the exceptional behavior to another method if that does not introduce additional checked exceptions.
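The first two allowed differences already exist in standard Java, so they can be illustrated without the language extension. The sketch below uses hypothetical classes Source and FileSource; the third and fourth differences concern anchored declarations and have no plain-Java counterpart.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Plain-Java illustration of removing and narrowing absolute declarations. */
final class OverrideConformanceSketch {

    static class Source {
        String read() throws IOException, InterruptedException {
            return "source";
        }
    }

    /** Removes InterruptedException and replaces IOException by a subtype. */
    static class FileSource extends Source {
        @Override
        String read() throws FileNotFoundException {
            return "file";
        }
    }

    /** A client with the more specific static type only handles the narrower clause. */
    static String readFile(FileSource source) {
        try {
            return source.read();
        } catch (FileNotFoundException e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        System.out.println(readFile(new FileSource())); // prints "file"
    }
}
```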

2.3.2 Example

We now illustrate the use of anchored exception declarations for the example of Section 2.2. Remember there were two problems with the example. We had to modify the template method when adding YourException to the exception clause of MyStrategy.compute, and we had to provide a dummy exception handler for Exception. Figure 2.6 shows the same example using anchored exception declarations. We have now expressed that changes in the exceptional behavior of strategy.compute and myStrategy.compute will always be reflected in the set of exceptions signalled by the methods template and client respectively. Consequently, the addition of YourException to MyStrategy.compute will not require the modification of the client method.


public Result template(Strategy strategy)
    throws like strategy.compute() {
  ...
  ...strategy.compute()...;
  ...
}

public Result client(MyStrategy myStrategy)
    throws like myStrategy.compute() {
  ...
  ...template(myStrategy)...;
  ...
}

Figure 2.6: The example using anchored exception declarations.

We now illustrate the expansion process for the method invocation in the body of the client method. The invocation of template has myStrategy as actual argument with static type MyStrategy. If we insert this call-site information in the exception clause of template by substituting the formal parameter, we get the following exception clause:

like myStrategy.compute()

To obtain the set of exceptions that is declared by this calculated exception clause, we repeat the same process until only absolute exception declarations remain. In this case, there is one more step. The exception clause of MyStrategy.compute contains only an absolute declaration:

MyException

By inserting the static type information of the call-site, we made the method expression in the exception clause of the template method select a more specific method than it referenced originally. As a result, we limited the set of checked exceptions from Exception to MyException.

2.3.3 Type Parameters

A part of the effect of anchored exception declarations can be obtained by using type parameters as exception types. Instead of using an anchored exception declaration, a programmer could use a type parameter that is restricted to exception types, e.g. PARAM extends Exception in Java. This approach, however, is not nearly as elegant and flexible as using anchored exception declarations. The addition of type parameters for exception handling clutters the code since they will appear everywhere in the static typing of the program. On top of that, the number of types of checked exceptions that a method can signal cannot exceed the number of type parameters in its exception clause. As a result, the programmer could be forced to introduce new abstract exception types and provide wrappers for existing checked exceptions in order to get the code to compile. We also believe that the approach does not work in practice because it is only useful when you do not know the specific type of an object, but you do know what exceptions its methods can signal, which is definitely not a common situation. Finally, using this approach, the exceptional behavior of a method is fixed at the construction time of an object, whereas an anchored exception declaration can exploit all static type information of every method invocation.
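For comparison, the type-parameter alternative can be written in standard Java as follows. The sketch reuses the names of the running example (Strategy, template, client, MyException) but the class bodies are assumptions made for illustration. Note how the parameter E has to appear in the interface, in the template method, and in every declaration that mentions a Strategy, and how a single parameter admits only a single exception type.

```java
/** Plain-Java sketch of exception clauses expressed with type parameters. */
final class TypeParameterSketch {

    static class MyException extends Exception {}

    /** The exception type becomes part of the static type of every strategy. */
    interface Strategy<E extends Exception> {
        String compute() throws E;
    }

    /** The template method must repeat the type parameter to propagate E. */
    static <E extends Exception> String template(Strategy<E> strategy) throws E {
        return "result: " + strategy.compute();
    }

    /** The client only needs a handler for the instantiated exception type. */
    static String client(boolean fail) {
        Strategy<MyException> strategy = () -> {
            if (fail) {
                throw new MyException();
            }
            return "ok";
        };
        try {
            return template(strategy);
        } catch (MyException e) {
            return "handled MyException";
        }
    }

    public static void main(String[] args) {
        System.out.println(client(false)); // prints "result: ok"
        System.out.println(client(true));  // prints "handled MyException"
    }
}
```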

Type based approaches to analyze the exception flow in a program also use type variables, but there they serve the same purpose as anchored exception declarations, whereas type parameters in Java would be absolute declarations that are rarely useful. The type based approaches are discussed in more detail in Section 2.9.

2.4 Case Study

We used Jnome [vD06], our metamodel for Java consisting of 14,000 lines of source code and 250 classes, to analyze its own source code. In this code, a MetamodelException signals an unexpected failure while looking up a named element, such as encountering multiple matches for an element when only a single element should match – the metamodel does not enforce validity at every moment for reasons of flexibility. We observed that only 46 methods directly raise a MetamodelException, whereas more than 400 methods propagate this exception. No method can actually handle the exception.

When an element simply cannot be found, a null reference is returned. Suppose now that instead of returning a null reference, a checked exception must be signalled when an element cannot be found. This exception cannot be a subclass of MetamodelException since it does not signal a failure caused by an invalid instance of the metamodel. Just like a MetamodelException, the new exception cannot be handled by the metamodel itself, since it is up to the client of the metamodel to decide what to do when an element cannot be found. This means that all methods that previously propagated MetamodelException must now also propagate ElementNotFoundException, resulting in the modification of over 400 methods. With anchored exception declarations, they would not have to be modified.

Of the 110 try-catch statements in the code, only 30 actually handle exceptions. The other 80 try-catch statements are dummy constructions to filter out Exception when Strategy and Template patterns are used. Using anchored exception declarations, the dummy exception handlers can be removed.
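A dummy handler of the kind counted above typically looks like the following plain-Java sketch. It is not taken from the Jnome code base; the classes reuse the names of the running example of Section 2.3.2 and their bodies are assumptions. Because template can only repeat the general clause of Strategy.compute, the client must catch Exception even though MyStrategy.compute can only signal MyException.

```java
/** Plain-Java illustration of a dummy exception handler. */
final class DummyHandlerSketch {

    static class MyException extends Exception {}

    interface Strategy {
        String compute() throws Exception; // most general exception clause
    }

    static class MyStrategy implements Strategy {
        @Override
        public String compute() throws MyException {
            return "ok";
        }
    }

    /** Template method: has to repeat the clause of Strategy.compute. */
    static String template(Strategy strategy) throws Exception {
        return "result: " + strategy.compute();
    }

    static String client(MyStrategy myStrategy) throws MyException {
        try {
            return template(myStrategy);
        } catch (MyException | RuntimeException e) {
            throw e; // the only exceptions that can actually occur
        } catch (Exception e) {
            // Dummy handler: demanded by the compiler, never executed for MyStrategy.
            throw new AssertionError("unreachable", e);
        }
    }

    public static void main(String[] args) throws MyException {
        System.out.println(client(new MyStrategy())); // prints "result: ok"
    }
}
```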


2.5 Formal Semantics and Rules

In this section, we present the formal semantics of anchored exception declarations. They provide a precise description of the meaning of anchored exception declarations, and allow us to prove that compile-time safety is not violated.

To simplify the formal semantics and the proof of correctness, we put some restrictions on the underlying programming language. We use a variant of ClassicJava [FKF98] where the throws clause and the statements for throwing and handling exceptions are added again. We limit expressions to this, references to formal parameters and fields, type names, constructor invocations, and method invocations. In an anchored exception declaration, a type name can be used as an expression. Other expressions have been omitted for reasons of brevity, but can easily be added. A class may not introduce a field with the same name as a field of one of its superclasses in order to simplify the lookup after substituting parameters, nor may it overload constructors based on the types of the formal arguments. For methods, ClassicJava already takes care of this by forbidding syntactic overloading [Mey01].

2.5.1 Formal notation

We now define a shorter notation for exception declarations for use in formulas. Exception lists will be represented by sets of types. The ⊴ operator denotes that a type is a subtype of an element of such a set, and can be thought of as the ∈ operator for normal sets. The ⊓, ⊔, and − operators correspond to the ∩, ∪, and \ operators on normal sets, and the ⊑ relation corresponds to the ⊆ relation. The symbol ⊤ represents a set containing every type. The definitions of the operations are shown in Figure 2.7. Note that this definition of ∩ is stricter than necessary in the case of multiple inheritance: ∅ is returned for two independent types, while there may be types that are subtypes of both Typea and Typeb.

An absolute exception declaration is represented by a pair of sets of types: (P, B). The first set contains the types of exceptions that can be signalled, while the second set contains the types that are blocked. An absolute exception declaration E in a program is then represented by ({E}, ∅). The second element of the pair will be non-empty for intermediate results during the expansion process, which is presented in Section 2.5.3.

An anchored exception declaration like t.m(args) propagating (P) blocking (B), where P and B are exception lists, is denoted as like t.m(args) ⊴ P ⋬ B, where P and B are sets of exception types. The default values for P and B are ⊤, which contains every type, and ∅.

An exception clause is denoted as a set of exception declarations.


Si and Ti are types
S and T are sets of types

Ta ⊴ T ⇔ ∃ Tb ∈ T : Ta <: Tb

Ta ⋬ T ⇔ ¬(Ta ⊴ T)

T ⊔ S = {T1, ..., Tn, S1, ..., Sm}

T ⊓ S = {(T1 ∩ S1), ..., (T1 ∩ Sm), ..., (Tn ∩ S1), ..., (Tn ∩ Sm)}

Ta ∩ Tb = Ta   if Ta <: Tb
          Tb   if Tb <: Ta
          ∅    otherwise

T − S = {Type t | t ⊴ T ∧ t ⋬ S}

T ⊑ S ⇔ ∀ x ⊴ T : x ⊴ S

{T1, ..., Tn} ⊖ {S1, ..., Sm} = ⋃_{i=1}^{n} (Ti ⊖ {S1, ..., Sm})

Ta ⊖ {S1, ..., Sn} = ∅      if ∃ Tb ∈ {S1, ..., Sn} : Ta <: Tb
                     {Ta}   otherwise

Figure 2.7: Operations on sets of types.
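To make the operators of Figure 2.7 concrete, the following Java sketch models sets of types as sets of Class objects and uses Java subtyping for the <: relation. The class TypeSetSketch and its method names (within, allowed, meet, worstCase) are invented for this illustration; the sketch covers only the subtype-membership test, the test used for absolute declarations, the pairwise intersection, and the worst-case difference.

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative model of the type-set operators; names are assumptions. */
final class TypeSetSketch {

    /** Subtype membership: ta is a subtype of at least one element of t. */
    static boolean within(Class<?> ta, Set<Class<?>> t) {
        return t.stream().anyMatch(tb -> tb.isAssignableFrom(ta));
    }

    /** Membership in the difference P - B: propagated and not blocked. */
    static boolean allowed(Class<?> e, Set<Class<?>> p, Set<Class<?>> b) {
        return within(e, p) && !within(e, b);
    }

    /** Pairwise intersection: for every related pair, keep the more specific type. */
    static Set<Class<?>> meet(Set<Class<?>> t, Set<Class<?>> s) {
        Set<Class<?>> result = new LinkedHashSet<>();
        for (Class<?> ti : t) {
            for (Class<?> sj : s) {
                if (sj.isAssignableFrom(ti)) {
                    result.add(ti);
                } else if (ti.isAssignableFrom(sj)) {
                    result.add(sj);
                }
            }
        }
        return result;
    }

    /** Worst-case difference: drop the elements of t that are completely blocked by s. */
    static Set<Class<?>> worstCase(Set<Class<?>> t, Set<Class<?>> s) {
        Set<Class<?>> result = new LinkedHashSet<>();
        for (Class<?> ti : t) {
            if (!within(ti, s)) {
                result.add(ti);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Class<?>> p = Set.of(IOException.class);
        Set<Class<?>> b = Set.of(FileNotFoundException.class);
        System.out.println(allowed(EOFException.class, p, b));          // true
        System.out.println(allowed(FileNotFoundException.class, p, b)); // false
        System.out.println(worstCase(p, b).equals(p));                  // true
        System.out.println(meet(p, b).equals(b));                       // true
    }
}
```

The third line of output shows the difference between the two subtraction operators: blocking FileNotFoundException removes that subtype from the allowed exceptions, but it does not remove IOException from the worst-case set because IOException is not completely blocked.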


2.5.2 Semantics

We now define the semantics of anchored exception declarations by introducing the boolean δ and ω functions. The δ function determines the exceptional behavior of a particular invocation of a method, while the ω function determines the worst-case exceptional behavior of a method. The Υ and Ω functions, which are used to expand anchored exception declarations and insert call-site type information into the δ function, are defined in Section 2.5.3.

Definition 2.5.1 The δ function determines whether or not an exception clause or declaration allows a checked exception E to be signalled when the parent method of the exception declaration is invoked by the given method invocation. It adds context awareness to exception declarations.

• A method, when invoked as t.m(args), is allowed to signal a checked exception E if at least one of its exception declarations allows E to be signalled.

δ({ED1, ..., EDn}, t.m(args), E) ⇔ ⋁_{i=1}^{n} δ(EDi, t.m(args), E)

• An absolute exception declaration allows a checked exception E to be signalled if it is explicitly propagated and is not blocked.

δ((P, B), t.m(args), E) ⇔ E ⊴ (P − B)

• An anchored exception declaration allows a checked exception E to be signalled if the exception clause resulting from its expansion after inserting context information allows E to be signalled.

δ(like ta.ma(argsa) ⊴ Pa ⋬ Ba, t.m(args), E) ⇔ ω(Υ(Ω(like ta.ma(argsa) ⊴ Pa ⋬ Ba, t, args)), E)

Note that the definition for anchored exception declarations is written in terms of the ω function. After the call-site type information is inserted, it computes the worst-case behavior for the more specific type information. This is done by inserting that information into the anchored exception declaration and then using the ω function to compute the worst-case behavior of the modified anchored exception declaration.

Definition 2.5.2 The ω function determines the worst-case behavior of an exception clause or declaration. It is a shorthand form for the δ function when the target is the parent type of the method and the actual arguments are references to the formal parameters of the method.

• ω({ED1, ..., EDn}, E) ⇔ ω({ED1, ..., EDn}, E, ∅)

• ω((P, B), E) ⇔ ω((P, B), E, ∅)

• ω(like t.m(args) ⊴ P ⋬ B, E) ⇔ ω(like t.m(args) ⊴ P ⋬ B, E, ∅)

• ω({ED1, ..., EDn}, E, trace) ⇔ ⋁_{i=1}^{n} ω(EDi, E, trace)

• ω((P, B), E, trace) ⇔ E ⊴ (P − B)

• ω(like t.m(args) ⊴ P ⋬ B, E, trace) ⇔
      Γ(t).m(Γ(args)) ∉ trace ∧
      ω(Υ(like t.m(args) ⊴ P ⋬ B), E, {Γ(t).m(Γ(args))} ∪ trace)

The ω function keeps a trace of the static types of both the implicit and explicit arguments passed to a method. Because only the types of the arguments determine the selection of methods, the function can stop if it reaches a method with arguments of the same types as before.

The Υ and the Ω functions insert more specific type information into the method expression of an anchored exception declaration. As a result, the declaration can select a more specific method and thus reduce the set of exceptions that can be signalled.

2.5.3 Exploiting Context Information

The meaning of an anchored exception declaration is defined in terms of a process called expansion, denoted by Υ, which is performed at compile-time. Expanding an anchored exception declaration is the process of cloning the exception clause of the referenced method and adapting it to include the context information.

The power of expansion depends on the programming language. The more information can be specialized in subtypes or at a call-site, the more powerful the expansion process is. Features that increase the power of expansion include covariant return types, type parameters, and type anchors.

We assume that the program is well-typed with respect to the type system of ClassicJava. Figure 2.8 shows the list of symbols and the signatures of the introduced functions.

Substitution

Inserting context information into an exception clause is done by the Ω function. It substitutes formal parameters and the implicit argument this with call-site information.

Under the assumptions made in this chapter, applying the Ω function to expression e is equal to the substitution of actual arguments: {val1/par1, ..., valn/parn, target/this}e. If static methods, syntactic overloading, and overloading of instance variables are allowed, this is no longer the case because lookups of instance variables and signatures are influenced by the insertion of more specific type information. In this case, type elaboration can be used to take the static binding into account, as done in [FKF98].


The definitions for expressions, exception declarations, and exception clauses are shown in Figure 2.9a. The <: relation is used to denote subtyping for types and overriding for methods; the Γ function returns the type of an expression.

Filtering

The Φ function applies the filter clauses Pnew and Bnew of an anchored exception declaration to an exception clause. The propagated exceptions of an exception declaration are combined with Pnew using an intersection. The blocked exceptions are combined with Bnew using a union. The function is shown in Figure 2.9b.
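The effect of filtering on a single declaration can be sketched in Java as follows. The sketch is a simplification under a stated assumption: exception types are plain names without subtyping, so the intersection and union of the thesis reduce to ordinary set intersection and union. The class FilterSketch and its Filter helper are invented for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative model of filtering on flat exception names; not the thesis code. */
final class FilterSketch {

    /** The filter state of a declaration: propagated set P and blocked set B. */
    static final class Filter {
        final Set<String> propagated;
        final Set<String> blocked;

        Filter(Set<String> propagated, Set<String> blocked) {
            this.propagated = propagated;
            this.blocked = blocked;
        }
    }

    /** Intersects the propagated sets and unites the blocked sets. */
    static Filter apply(Filter declaration, Set<String> pNew, Set<String> bNew) {
        Set<String> propagated = new LinkedHashSet<>(declaration.propagated);
        propagated.retainAll(pNew);
        Set<String> blocked = new LinkedHashSet<>(declaration.blocked);
        blocked.addAll(bNew);
        return new Filter(propagated, blocked);
    }

    public static void main(String[] args) {
        Filter inner = new Filter(Set.of("E1", "E2"), Set.of("E3"));
        Filter merged = apply(inner, Set.of("E2", "E4"), Set.of("E5"));
        System.out.println(merged.propagated.equals(Set.of("E2")));    // true
        System.out.println(merged.blocked.equals(Set.of("E3", "E5"))); // true
    }
}
```

Applying the outer filter can therefore only shrink the propagated set and grow the blocked set of the inner declaration, which is why filtering never adds exceptions to the expanded clause.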

Expansion

The expansion of an anchored exception declaration, performed by the Υ function, selects the exception clause of the invoked method, done by the ε function, and applies the Φ and Ω functions to the result. Because the static types of the actual arguments and the target are subtypes of the formal parameters and the parent type of the invoked method, a more specific method may be selected. As a result, a number of checked exceptions may be eliminated. The definition of the function is shown in Figure 2.9c. In the definition, pi is the formal parameter corresponding to actual argument ai.

Recursive Expansion

The Υrec function gives an upper bound for the types of checked exceptions that can be signalled by a method invocation or declared by an exception declaration or an exception clause. The ⊖ operator calculates the worst case exception types for an absolute exception declaration by removing propagated types that are completely blocked and ignoring blocked types that do not completely block a propagated type. Taking the latter blocked types into account does not provide more safety or flexibility because one of their supertypes must already be propagated or handled. To prevent infinite loops in this process, we keep a trace of the static type of the arguments – both implicit and explicit – that are passed to a method. Because the selection of further methods in the graph is done based only on the types of the arguments, we can stop if those types have already been passed before. Because our program is finite, so is the number of possible different combinations of the type of the arguments, which means that the process must reach a fixed point with respect to these types, and stop. The function is shown in Figure 2.9d.

2.5.4 Restrictions on Anchored Declarations

In this section, we will discuss the restrictions on anchored exception declarations.


Use in conjunction with the grammar in Figure 2.4.

ExceptionType ::= ExceptionDeclaration | ExceptionClause
ExprContainer ::= Expr | ExceptionType
MethodReference ::= MethodInvocation | AnchoredExceptionDecl

Ω = substitute actual arguments
Ω : ExprContainer → Expr → (Expr, Formal) list → ExprContainer

Φ = filter blocked types
Φ : ExceptionType → TypeSet → TypeSet → ExceptionType

ε = return exception clause of referenced method
ε : MethodExpression → ExceptionClause

Υ = insert context information
Υ : MethodReference → ExceptionClause

Υrec = compute upper bound exceptional behavior
Υrec : (MethodInvocation | ExceptionType) → TypeSet

⊤ = set containing all exception types
⊔, ⊓, −, ⊑ = operators corresponding to ∪, ∩, \, ⊆

⊴ TypeSet = set of propagated exception types
⋬ TypeSet = set of blocked exception types

Γ = return type of an expression
Γ : Expr → Type

Figure 2.8: List of symbols.


Ω(e, target, (v1, p1) ... (vn, pn)) = {v1/p1, ..., vn/pn, target/this}e

Ω((P, B), target, args) = (P, B)

Ω(like ta.ma(argsa) ⊴ Pa ⋬ Ba, target, args) =
    like Ω(ta.ma(argsa), target, args) ⊴ Pa ⋬ Ba

Ω({ED1, ..., EDn}, target, args) =
    {Ω(ED1, target, args), ..., Ω(EDn, target, args)}

(a) Substitution

Φ((P, B), Pnew, Bnew) = (P ⊓ Pnew, B ⊔ Bnew)

Φ(like t.m(args) ⊴ P ⋬ B, Pnew, Bnew) =
    like t.m(args) ⊴ (P ⊓ Pnew) ⋬ (B ⊔ Bnew)

Φ({ED1, ..., EDn}, Pnew, Bnew) =
    {Φ(ED1, Pnew, Bnew), ..., Φ(EDn, Pnew, Bnew)}

(b) Filtering

ε(t.m(args)) = exception clause of referenced method

Υ(like t.m(a1, ..., an) ⊴ P ⋬ B) =
    Ω(Φ(ε(t.m(a1, ..., an)), P, B), t, (a1, p1) ... (an, pn))

Υ(t.m(a1, ..., an)) = Υ(like t.m(a1, ..., an) ⊴ ⊤ ⋬ ∅)

(c) Expansion

Υrec((P, B), trace) = P ⊖ B

Υrec(like t.m(args) ⊴ P ⋬ B, trace) =
    if Γ(t).m(Γ(args)) ∈ trace then
        ∅
    else
        Υrec(Υ(like t.m(args) ⊴ P ⋬ B), {Γ(t).m(Γ(args))} ∪ trace)

Υrec({ED1, ..., EDn}, trace) =
    Υrec(ED1, trace) ∪ ... ∪ Υrec(EDn, trace)

Υrec(t.m(args)) = Υrec(like t.m(args) ⊴ ⊤ ⋬ ∅, ∅)

(d) Recursive expansion

Figure 2.9: Definition of the expansion function.


Accessibility Rule

The client of a method must have access to every element of an anchored exception declaration in order to determine which exceptions to expect when invoking the method. This is similar to the precondition availability rule of Eiffel [Mey97] and the accessibility constraints imposed on types used in method signatures in C# [ECM02].

Rule 1 All elements of an anchored exception declaration must have at least the level of accessibility that the declaring method has.

Conformance Rules

An exception clause ECa conforms to another exception clause ECb, denoted as ECa ⪯ ECb, when ECa never allows a checked exception that ECb does not allow. For a valid program, the following conformance relations must hold. The functions and relations used in these rules will be explained further on.

Rule 2 A method may not signal a checked exception when one of the methods it overrides does not allow it.

ma <: mb ⇒ ε(ma) ⪯ ε(mb)

Rule 3 The implementation of a method may not signal a checked exception when the exception clause does not allow it.

¬ m abstract ⇒ IEC(m) ⪯ ε(m)

As a result of these rules, the exception clauses of the overridden methods act as upper bounds, while the exception clause defined by the implementation of a method acts as a lower bound.

The conformance relation ⪯

We introduce the ⪯ relation in order to simplify reasoning about anchored exception declarations. For compile-time safety, it suffices to require that δ(ECa, t.m(args), E) ⇒ δ(ECb, t.m(args), E) holds between a method and the methods it overrides and between a method body and the exception clause of that method. In a full-blown programming language, however, this becomes difficult to reason about because of concepts such as static and final methods. They allow ECa to be a valid refinement of ECb based on the knowledge that some methods cannot be overridden. Such an analysis is hard for a programmer to do and would thus cause confusion when a certain type of transition of exception clauses would be accepted in one part of a program, but rejected in another part because the modifiers of the methods involved are slightly different.


thisa ⪯ thisb ⇔ Γ(thisa) <: Γ(thisb)

expression ⪯ T ⇔ Γ(expression) <: T

formala ⪯ formalb ⇔ formala ≅ formalb

new A(a1, ..., an) ⪯ new B(b1, ..., bn) ⇔ A = B ∧ (⋀_{i=1}^{n} ai ⪯ bi)

ta.vara ⪯ tb.varb ⇔ ta ⪯ tb ∧ vara = varb

ta.ma(a1, ..., an) ⪯ tb.mb(b1, ..., bn) ⇔ ma = mb ∧ ta ⪯ tb ∧ (⋀_{i=1}^{n} ai ⪯ bi)

(a) Expressions

(Pa, Ba) ⪯ (Pb, Bb) ⇔ (Pa − Ba) ⊑ (Pb − Bb)

like ta.ma(arga,1, ..., arga,n) ⊴ Pa ⋬ Ba ⪯ like tb.mb(argb,1, ..., argb,n) ⊴ Pb ⋬ Bb
⇕
ta.ma(arga,1, ..., arga,n) ⪯ tb.mb(argb,1, ..., argb,n) ∧ (Pa − Ba) ⊑ (Pb − Bb)

(b) Exception declarations

ECa ⪯ ECb ⇔ ∅ ⊢ ECa ⪯ ECb

trace ⊢ ECa ⪯ ECb
⇕
(∀ (Pa, Ba) ∈ ECa, ∀ E | ω((Pa, Ba), E) :                              (1)
    ∃ (Pb, Bb) ∈ ECb : Φ((Pa, Ba), {E}, ∅) ⪯ (Pb, Bb))
∧ (∀ anchora ∈ ECa, ∀ E | ω(anchora, E) :                               (2)
    ∃ anchorb ∈ ECb : Φ(anchora, {E}, ∅) ⪯ anchorb                      (2.a)
    ∨ κ(anchora, ECb) ∉ trace ⇒                                         (2.b)
        {κ(anchora, ECb)} ∪ trace ⊢ Φ(Υ(anchora), {E}, ∅) ⪯ ECb)

(c) Exception clauses

Figure 2.10: The conformance relation ⪯.


A method expression MEa conforms to MEb, denoted as MEa ⪯ MEb, when the evaluation of MEa always results in a method that is equal to, or overrides the method resulting from the evaluation of MEb. Consequently, if MEa ⪯ MEb, the method selected by MEa can never signal an exception that is not allowed by the method selected by MEb because of rules 2 and 3. The ⪯ relations are shown in Figure 2.10a. The ≅ relation denotes that both formal parameters are corresponding formal parameters of overriding or equal methods.

For absolute exception declarations, the set of exceptions declared by (Pa, Ba) must be a subset of the exceptions declared by (Pb, Bb). For anchored exception declarations anchora and anchorb, the method expression and the filter clause of anchora must conform to those of anchorb. The filter clauses follow the same rule as absolute declarations. Both relations are shown in Figure 2.10b.

The ⪯ relation for exception clauses is shown in Figure 2.10c. The first condition (1) is equivalent to the traditional exception conformance rule for checked exceptions. It ensures that every checked exception allowed by an absolute declaration of ECa is also allowed by an absolute declaration of ECb. Note that this rule forbids transforming anchored exception declarations into absolute declarations since an anchored declaration promises that an exception can be signalled only by the anchor, which is not the case for an absolute declaration. The set of checked exceptions for which ω((Pa, Ba), E) is true is Pa ⊖ Ba.

The second condition states that an anchored exception declaration anchorb of ECb may be removed, copied, or replaced by an anchored exception declaration that conforms to anchorb (2.a), and that a part of ECb may be replaced by an anchored exception declaration that expands to an exception clause that conforms to ECb (2.b). Note that the result of Υrec(anchora) is the set of checked exceptions for which ω(anchora, E) is true.

Rule 2.b allows replacing a part of exception clause ECb by an anchored exception declaration if the expansion of that anchored exception declaration conforms to ECb. For example, E1, like a().g() may be replaced by like a().f() when the exception clause of f() is E1, like g(). This is a valid transformation because it adds no extra exceptions or circumstances under which exceptions can be signalled; the expansion conforms to the original exception clause. It does however create an opportunity for reducing the circumstances under which the exceptions can be signalled, by handling them in method f().

To prevent infinite loops during type checking, we must introduce a stopping condition. The algorithm can encounter two kinds of loops: either it tries to expand an anchored exception declaration that it has already expanded before, or it tries to expand anchored exception declarations that keep getting bigger. In the first case, the analysis can stop because no problems were found when analyzing that path before. In the second case, the algorithm must determine when the anchored exception declaration has grown so big that it does not make a difference if it grows any bigger. To achieve this, a compressed form of an anchored exception declaration is kept in the trace when expanding it. The κ function compresses the anchored exception declaration by replacing useless parts in the target and arguments of the anchored exception declaration by their static type. A useless part of an anchored exception declaration with respect to an exception clause is a part that will never be used to decide whether or not the expansion of the anchored exception declaration conforms to that exception clause. Because the compressed form of an anchored exception declaration can never get bigger than the biggest anchored exception declaration in ECb, and because the program is finite, the stopping condition is reached after a finite number of iterations.

The definition of κ uses the ρ function, which is shown in Figure 2.11, to determine the useful part of the anchored exception declaration. The ∅ symbol is used to represent an empty expression. The ⊕ operator is used to concatenate expressions. If one of the operands is an empty expression, it returns the other operand. The κ function uses the useful part to calculate the useless part by subtracting it from the method expression of the anchored exception declaration. Finally, it replaces the useless part by its static type. The definition of κ is shown in Figure 2.12.

The ∀E quantifications over all signalled checked exceptions provide additional flexibility. Without them, it would not be allowed to replace like m() propagating (E1), like m() propagating (E2) by like m() propagating (E1,E2), which only merges the filter clauses.

The conformance rule for exception clauses is related to the rules for refinement of reuse contracts [SLMD96], and the rules for conformance declarations of Contracts [Lam93]. These rules enforce the substitution principle with respect to the specification of dependencies between methods. They involve either direct conformance of elements, like rules 1 and 2.a, or conformance when taking the transitive closure of dependencies into account, like rule 2.b.

The Implementation Exception Clause

The implementation exception clause (IEC) of a method is a calculated exception clause that is an upper bound for the exceptional behavior of the implementation of that method. The algorithm to compute the IEC is similar to the encounters function presented by Robillard and Murphy [RM03], and is shown in Figure 2.13. We do not give the definitions for every statement and expression, but only for the elements that are interesting with respect to the exception flow.

The IEC is derived from a set of pairs containing the type of a checked exception and an exception declaration. The exception declaration represents a part of the exceptional behavior of the implementation, while the exception type is used to filter pairs when an exception handler is encountered. For a checked exception that is raised directly, the pair contains the static type of the exception as its first and second element. For a checked exception originating from a method invocation, the first element is the static type of the exception, and the second element is an anchored exception declaration containing the method invocation and a fil-


a 6= ∅ b 6= ∅

a ⊕ b = a.b

b = ∅a ⊕ b = a

a = ∅a ⊕ b = b

thisa 6 thisb

ρ(thisa, thisb, acc) = thisa ⊕ acc

expra 66 thisb

ρ(expra, thisb, acc) = ∅

expra 6 Tb

ρ(expra, Tb, acc) = acc

expra 66 Tb

ρ(expra, Tb, acc) = ∅

formala 6 formalbρ(formala, formalb, acc) = formala ⊕ acc

expra 66 formalbρ(expra, formalb, acc) = ∅

ta.vara 6 tb.varb

ρ(ta.vara, tb.varb, acc) = ρ(ta, tb, vara ⊕ acc))

expra 66 tb.varb

ρ(expra, tb.varb) = ρ(expra, tb, ∅)

ta.ma(argsa) 6 tb.mb(argsb)

ρ(ta.ma(argsa), tb.mb(argsb), acc) = ρ(ta.ma(argsa), tb, ma(argsa) ⊕ acc)

expra 66 tb.mb(argsb)

ρ(expra, tb.mb(argsb), acc) =

max(ρ(ta.ma(argsa), tb, ∅), ρ(ta.ma(argsa), argsb, ∅))

new Ta(argsa) 6 new Tb(argsb)

ρ(new Ta(argsa)), new Tb(argsb), acc) = new Ta(argsa) ⊕ acc

new Ta(argsa) 66 new Tb(argsb)

ρ(new Ta(argsa)), new Tb(argsb), acc) = max(ρ(new Ta(argsa), argsb, ∅))

ρ(expra, like tb.mb(argsb)) = ρ(expra, tb.mb(argsb), ∅)

ρ(expra, ECb) = max(ρ(expra, anchorb))

Figure 2.11: Calculating the useful remainder of anchored exceptiondeclarations.


µ(expr, EC) = expr − ρ(expr, EC)

κ(expr, EC) = Γ(µ(expr, EC)) ⊕ ρ(expr, EC)

κ(like t.m(args) E P 6E B, EC) = like κ(t, EC).m(κ(args, EC)) E P 6E B

Figure 2.12: Compression of anchored exception declarations.

Ψ([[throw e]]) = (Γ(e), (Γ(e), ∅)) ∪ Ψ([[e]])

Ψ([[t.m(args)]]) =(E, like t.m(args) E E)|E ∈ Υrec(t.m(args)) ∪ Ψ([[t]]) ∪ Ψ([[args]])

Ψ([[trytbcatch(E1 e1)h1 . . . catch(En en)hnfinallyfin]]) =(E, ED)|(E, ED) ∈ Ψ([[tb]]) ∧ ∄x ∈ [1, n] : E <: En∪(

⋃i=n

i=1Ψ([[handleri]])

)

∪ Ψ([[fin]])

strip((E1, ED1), . . . , (En, EDn)) = ED1, . . . , EDn

IEC(method) = strip(Ψ(body(method)))

Figure 2.13: Calculation of the implementation exception clause.


public abstract class C public A getA() . . . public B getB() . . . public abstract void hookMethod(A a, B b) throws E1,E2;public void template() throws E3,

like hookMethod(getA(), getB()) blocking (E2) try

hookMethod(getA(), getB());//(E1, like hookMethod(getA(), getB()) propagating (E1))//(E2, like hookMethod(getA(), getB()) propagating (E2))if(...)

throw new E3();// (E3, E3)

catch(E2 exc)

... // no exceptions signalled here

Figure 2.14: Calculating the implementation exception clause.

ter clause propagating only that type of exception. Adding multiple pairs thatpropagate only a single exception simplifies the formula for exception handlers.

A try-catch-finally block removes all exception pairs for which the exception type can be handled by one of its catch blocks. After that, exception pairs are added based on the code in the catch blocks and the finally block.

Once the set is constructed for the method body, the implementation exception clause can be obtained by constructing an exception clause that contains the exception declaration of each pair. Note that the algorithm does not always yield the lowest upper bound since the function for try-catch-finally blocks discards anchor relations when an exception is directly propagated.

Figure 2.14 illustrates the algorithm. The exception pairs are written in the comments after the corresponding statements. Exceptions E1 and E2 are unrelated. Exception E2 is caught by a catch clause that does not signal an exception. Therefore, all pairs within the body of the try statement that have E2 as exception type can be removed. The resulting set of pairs is:

    (E3, E3), (E1, like hookMethod(getA(), getB()) propagating (E1))

The resulting implementation exception clause is shown below. It is clear that it conforms to the exception clause of the template method since (E1 − ∅) ⊑ (⊤ − E2) when there is no inheritance relation between E1 and E2.

    E3, like hookMethod(getA(), getB()) propagating (E1)
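The pair-based calculation for this example can be replayed with a small sketch. It makes several simplifying assumptions that are not part of the formal definition: exception declarations are kept as plain strings, the result of Υrec for the invocation is passed in as an explicit list, the finally block and the pairs of subexpressions are omitted, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class IecSketch {
    // Stand-ins for the unrelated checked exceptions of the example.
    static class E1 extends Exception {}
    static class E2 extends Exception {}
    static class E3 extends Exception {}

    // An exception pair: the type used for filtering, and the declaration as text.
    static final class Pair {
        final Class<? extends Exception> type;
        final String declaration;
        Pair(Class<? extends Exception> type, String declaration) {
            this.type = type;
            this.declaration = declaration;
        }
    }

    // Directly raised exception: the type doubles as the declaration.
    static Pair thrown(Class<? extends Exception> type) {
        return new Pair(type, type.getSimpleName());
    }

    // Method invocation: one anchored declaration per exception it may signal.
    static List<Pair> invocation(String call, List<Class<? extends Exception>> signalled) {
        List<Pair> pairs = new ArrayList<>();
        for (Class<? extends Exception> e : signalled) {
            pairs.add(new Pair(e, "like " + call + " propagating (" + e.getSimpleName() + ")"));
        }
        return pairs;
    }

    // try-catch: drop the pairs of the try body whose type is handled by a
    // catch clause, then add the pairs contributed by the handlers.
    static List<Pair> tryCatch(List<Pair> body,
                               List<Class<? extends Exception>> caught,
                               List<Pair> handlerPairs) {
        List<Pair> result = new ArrayList<>();
        for (Pair p : body) {
            boolean handled = caught.stream().anyMatch(c -> c.isAssignableFrom(p.type));
            if (!handled) result.add(p);
        }
        result.addAll(handlerPairs);
        return result;
    }

    // strip: keep only the exception declarations.
    static List<String> strip(List<Pair> pairs) {
        List<String> clause = new ArrayList<>();
        for (Pair p : pairs) clause.add(p.declaration);
        return clause;
    }

    // Replays the body of template(): the hook invocation, the throw of E3,
    // and a catch clause for E2 that signals nothing itself.
    static List<String> templateIec() {
        List<Pair> body = new ArrayList<>(
            invocation("hookMethod(getA(), getB())", List.of(E1.class, E2.class)));
        body.add(thrown(E3.class));
        return strip(tryCatch(body, List.of(E2.class), List.of()));
    }

    public static void main(String[] args) {
        System.out.println(templateIec());
    }
}
```

Keeping one pair per exception type is what makes the handler step a simple filter: a pair is either removed as a whole or kept as a whole.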

2.5.5 Proof of Compile-time Safety

In this chapter we only give an overview of the proof of compile-time safety and the most interesting theorems. The full proof is shown in Appendix A.

Theorems 2.5.1 and 2.5.2 state that the Φ and Ω functions are monotone with respect to the pre-order when the same context information is inserted in both operands or when more specific context information is inserted in the left-hand operand.

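Theorem 2.5.1 Φ is monotone.

\[
\begin{array}{c}
\mathit{EC}_a \preceq \mathit{EC}_b \;\wedge\; (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Downarrow \\
\Phi(\mathit{EC}_a, P_c, B_c) \preceq \Phi(\mathit{EC}_b, P_d, B_d)
\end{array}
\]

Theorem 2.5.2 Ω is monotone.

\[
\begin{array}{c}
\mathit{EC}_a \preceq \mathit{EC}_b \;\wedge\; \mathit{target}_a \preceq \mathit{target}_b \;\wedge\; \mathit{args}_a \preceq \mathit{args}_b \;\wedge \\
\mathit{ok}_\Omega((\mathit{target}_a, \mathit{this}(\mathit{EC}_a)), \mathit{args}_a) \;\wedge\; \mathit{ok}_\Omega((\mathit{target}_b, \mathit{this}(\mathit{EC}_b)), \mathit{args}_b) \\
\Downarrow \\
\Omega(\mathit{EC}_a, \mathit{target}_a, \mathit{args}_a) \preceq \Omega(\mathit{EC}_b, \mathit{target}_b, \mathit{args}_b)
\end{array}
\]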

Theorem 2.5.3 states that the ⪯ relation is a pre-order. The = relation can be chosen such that ⪯ becomes a partial order by demanding that a ⪯ b ∧ b ⪯ a ⇒ a = b. That would mean that two exception clauses are equal when they specify the same exceptional behavior, which makes perfect sense.

Theorem 2.5.3 The ⪯ relation is a pre-order.

1. ⪯ is reflexive: a ⪯ a

2. ⪯ is transitive: a ⪯ b ∧ b ⪯ c ⇒ a ⪯ c

Theorem 2.5.4 states that the ⪯ relation between exception clauses implies that the left-hand exception clause never declares a checked exception that is not declared by the right-hand exception clause, a property we will need for ensuring compile-time safety.

Theorem 2.5.4

\[
\mathit{EC}_a \preceq \mathit{EC}_b \Rightarrow (\omega(\mathit{EC}_a, E) \Rightarrow \omega(\mathit{EC}_b, E))
\]


Theorem 2.5.5 The implementation exception clause of a non-abstract method is an upper bound for the exceptional behavior of the implementation of that method.

Theorem 2.5.6 states that the worst-case behavior of a method body, specified by the implementation exception clause, conforms to the exception clause of that method for any specific call-site. As a result, a programmer – and a compiler – can obtain an upper bound for the exceptional behavior of a method call by inserting the context information into the exception clause of the invoked method.

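Theorem 2.5.6 Let t.m(arg₁, . . . , argₙ) be a method invocation in a valid program, let EC = ε(t.m(arg₁, . . . , argₙ)) and let parᵢ be the formal parameter corresponding to argᵢ.

\[
\begin{array}{c}
\mathit{IEC} \preceq \mathit{EC} \;\wedge\; \Gamma(\mathit{this}(\mathit{EC})) = \Gamma(\mathit{this}(\mathit{IEC})) \\
\Downarrow \\
\Omega(\mathit{IEC}, t, (\mathit{arg}_1, \mathit{par}_1) \ldots (\mathit{arg}_n, \mathit{par}_n)) \preceq \Omega(\mathit{EC}, t, (\mathit{arg}_1, \mathit{par}_1) \ldots (\mathit{arg}_n, \mathit{par}_n))
\end{array}
\]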

For compile-time safety to be violated, there must be at least one method of which the implementation can signal a checked exception under a circumstance that could not have been predicted by the client when inspecting the exception clause of that method. We now show that this is not possible for a program satisfying all rules.

Figure 2.15 illustrates the proof. The exception clause of the method is represented by EC, its implementation exception clause by IEC. We know from rule 3 that IEC ⪯ EC, so Theorem 2.5.6 ensures that after insertion of the context information of any call-site, resulting in EC′ and IEC′, IEC′ ⪯ EC′ holds. Note that at run-time, the available context information is even more specific, but because the same information is inserted in both exception clauses, the relation between IEC′ and EC′ will still hold.

Using Theorems 2.5.4 and 2.5.5, we can derive the right part of the diagram. Theorem 2.5.4 ensures that ω(IEC′, E) ⇒ ω(EC′, E). Theorem 2.5.5 ensures that the implementation never signals an exception that is not allowed by IEC′. Consequently, we can conclude that no method invocation can result in a checked exception that was not declared by the exception clause of the invoked method. Note that δ(EC, t.m(args), E) is the same as ω(EC′, E) when EC′ = Ω(EC, t, args).

Theorem 2.5.7 states that the expansion of a method invocation declares no more than the exception clause of the invoked method. This property is not necessary for compile-time safety, but is crucial from a methodological point of view. If it does not hold, a method invocation can allow more checked exceptions to be signalled than the invoked method declares, which would be safe but very confusing. For example, if the Υ function simply returned throws Throwable, compile-time safety would not be at risk, but anchored exception declarations would become useless.


[Diagram: Ω maps EC to EC′ and IEC to IEC′; a method signalling a checked exception E implies ω(IEC′, E), which in turn implies ω(EC′, E).]

Figure 2.15: Schematic proof of soundness.

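Theorem 2.5.7

\[
\begin{array}{c}
\omega(\Upsilon(\mathtt{like}\ t.m(\mathit{args}) \trianglelefteq P \not\trianglelefteq B), E) \\
\Downarrow \\
\omega(\varepsilon(\mathtt{like}\ t.m(\mathit{args}) \trianglelefteq P \not\trianglelefteq B), E)
\end{array}
\]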

2.6 Methodological Discussion

2.6.1 Information Hiding

A consequence of the conformance rule is that for a single anchored exception declaration like t.m(args), the implementation may signal only checked exceptions caused by a conforming method invocation. But this is a violation of the principle of information hiding [Par02]. The anchored exception declaration reveals information about the implementation of the method, which must directly or indirectly execute t.m(args) for the former to allow a checked exception to be signalled. So how can anchored exception declarations fit in the object-oriented programming paradigm, where information hiding is a crucial concept?

The answer to this question has been given by Helm, Holland, and Gangopadhyay in [HHG90], by Lamping in [Lam93], and by Steyaert, Lucas, Mens, and D’Hondt in [SLMD96]. In order to specify the behavior of composable software elements, and thus allow a client to reuse them, it can be necessary to reveal some of the dependencies between the methods of these elements. In these papers, the dependencies are used to alleviate the fragile base class problem, but they are also needed to write specifications for most design patterns since most of them are based on delegation.

The contract of a delegator often contains expressions referencing the legate to allow the derivation of the full contract when the concrete legate is known. If the contract of the interface of the legate introduces indeterminism in the contract of the delegator, and the indeterministic part is relevant for a client of the delegator, the link between the delegator and the legate cannot be hidden. The contract of the delegator promises that the postconditions of the legate will be part of the result. But because the indeterminism prevents the exact postconditions from being known at compile-time, it is impossible for the delegator to satisfy its own contract without directly or indirectly evaluating the expressions that reference the legate.

Consider for example a method forAll(Predicate p, Collection c) that implements a universal quantification using a Strategy pattern. It checks whether all elements in collection c satisfy predicate p, as defined by its eval method. The specification for this method would be result == ∀o ∈ c : p.eval(o). This specification reveals that the implementation of forAll must invoke p.eval(o) because otherwise it is impossible for the implementation to fulfill its contract. As such, there is no loss of information hiding by also specifying that dependency with an anchored exception declaration. The exception clause of forAll would be throws like p.eval(o).
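Plain Java has no anchored exception declarations, but the dependency of forAll on p.eval(o) can be approximated with a generic exception type parameter. The sketch below is only such an approximation, not the mechanism proposed in this chapter; the Predicate interface, its type parameter X, and the class name are assumptions introduced for illustration.

```java
import java.util.Collection;
import java.util.List;

final class ForAllSketch {
    // Hypothetical predicate type; X is the checked exception that eval may signal.
    interface Predicate<T, X extends Exception> {
        boolean eval(T o) throws X;
    }

    // Approximation of "throws like p.eval(o)": forAll declares exactly the
    // checked exception type that the given predicate declares.
    static <T, X extends Exception> boolean forAll(Predicate<T, X> p, Collection<T> c) throws X {
        for (T o : c) {
            if (!p.eval(o)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // This predicate signals no checked exception, so the caller needs no handler.
        Predicate<Integer, RuntimeException> positive = n -> n > 0;
        System.out.println(forAll(positive, List.of(1, 2, 3)));
    }
}
```

As with the anchored declaration, a client that passes a predicate without checked exceptions is not forced to write a dummy handler, while a client that passes a predicate signalling a checked exception must still deal with it.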

From these arguments, we can derive a rule of thumb regarding the use of anchored exception declarations:

Guideline 1 Use an anchored exception declaration if the dependency between the delegator and the legate must be known by a client in order to use the delegator. Do not use an anchored exception declaration if that dependency must remain hidden for clients.

2.6.2 Usefulness of Source Code Modifications

As mentioned in Section 2.2.1, the addition of a checked exception triggers modifications in other parts of the program. Some of these changes are good: they force the programmer to change the code that will not work in the presence of the added exception. Other changes are gratuitous: they force the programmer to do work that adds no value.

If the modification concerns a method that handles at least one exception, the modification is good. For such a method, an active decision is made to propagate some exceptions, but handle others, and so it is normal that this decision must be repeated when the exceptional behavior has changed.

For methods that do not handle exceptions, it depends on whether or not the method already propagated checked exceptions. If the method did not propagate checked exceptions before, the modification is good. It is not realistic to expect that the exceptional specification of a method changes from “no checked exceptions” to “some checked exceptions” automatically. But if the method did already propagate checked exceptions before and the new checked exception is also propagated, the modification is unnecessary. In this case, the method already propagated all checked exceptions coming from certain method invocations, so it should not be modified if such an invocation can result in a new checked exception.

Figure 2.16: Adding a new checked exception.

Figure 2.16 illustrates the addition of a new checked exception for both absolute and anchored exception declarations. The new exception is not always propagated to the end of the chain, so the situation using anchored exception declarations also requires the modification of some methods. The anchored exception declarations always reference the next method in the chain. The circles represent methods, the lines represent chains of method invocations. The big circle in the middle is the method where the new exception was added. A circle is white if it is not modified, gray if it is modified and that modification is good, and black if it was modified unnecessarily. We assume that all anchored exception declarations are in place.

If only absolute exception declarations are used, as in the left figure, the exception must be propagated manually along the invocation chains until it is handled. The methods that handle the exception are colored gray; these changes are useful. The modifications that merely serve to propagate the exception until it can be handled are unnecessary, and thus colored black. If anchored exception declarations are used, as in the right figure, the exception automatically propagates to the end of the chain of anchored exception declarations. For the exceptions that should not reach that point, the programmer can backtrack along the invocation chain until arriving at the method that should handle the exception. As a result, all methods that cannot deal with the new exception are detected – possibly after backtracking – and the programmer is still forced to perform all good modifications. No unnecessary modifications must be performed.

2.6.3 Nominal or Structural Typing?

In a nominal type system, the names of types are important, and subtyping relations are declared explicitly.² Examples are Java, C#, and Eiffel. In a structural type system, names of types are not essential, and subtyping is defined on the structures of types. Examples are Haskell, ML, and others.

A problem with structural type systems is that they can lead to programming errors that are hard to find. An entity, which can be an object, a function, and so on, can accidentally be used as if it were of type T only because the structure of its type is similar to that of T. As a result, the entity may not behave according to the contracts of type T and lead to run-time errors.

With nominal type systems, the programmer explicitly declares that a type S is a subtype of another type T. As such, an entity of type S can safely be used where an entity of type T is expected – provided that behavioral subtyping is respected. To assist with enforcing behavioral subtyping, the compiler can guarantee that safety on the level of types.

In the spirit of the definition above, the type system for anchored exception declarations is a nominal type system. Every decision in the definition of the δ, ω, and ⪯ relations is backed by a nominal declaration. Take for example the definition of ⪯ in Figure 2.10. Every branch of the decision tree ends in either case 1 or 2.a, which are the leaves in the tree. The internal nodes of the tree are formed by case 2.b. For case 1, the decision is directly based on the nominal subtyping declarations. For case 2.a, the decision is based on the nominal subtyping declarations of the parent types of the referenced methods. This follows from the definition of the ⪯ relation for anchored exception declarations, which only holds when the method referenced by the left-hand side is equal to, or overrides, the method on the right-hand side. For case 2.b, the anchored exception declaration that is expanded is the nominal “subtyping” declaration.

2.7 Comparison with Type Anchors

The anchoring technique has more impact on the exceptional return type of a method than on the normal return type. The reason for this is that a normal return value can be used through subsumption. The most specific type information is often not needed. For an exception, however, the general type is usually not sufficient [MT97]. In this case we need as much information as possible because, by the very nature of an exception handling mechanism, the signaling code is not supposed to know how to handle it. Consequently, it cannot provide an exception that will handle itself, prohibiting the use of subsumption.

² Definition taken from [Pie02].

The conformance rule for anchored exception declarations is more flexible than the corresponding conformance rule for Eiffel type anchors. In Eiffel, the only type that conforms to like anchor is itself. The rule for anchored exception declarations leaves the opportunity to redefine a part of an exception clause by one or more stronger anchored declarations. The need for this is caused by the difference between the normal and exceptional behavior of a method. Adding an extra layer of indirection (rule 2.b of Figure 2.10c) is useful for exceptions because some of them may be handled in the extra layer. For example, a redefined version of the extra layer may declare that it cannot signal any checked exception at all although the method referenced by the original anchored declaration can. This is not possible for the normal return type since there must always be exactly one return type and that type has already been fixed. Both anchored declarations have a slightly different meaning. An Eiffel type anchor declares that the type is always the same as the type of the anchor, while an anchored exception declaration declares that it cannot signal an exception when the anchor cannot.

The difference between the conformance rules results in a difference between the rules to prevent loops while following anchored declarations. The rule of Eiffel type anchors is weaker than the rule for anchored exception declarations because it is not allowed to redefine the type of an anchored declaration in Eiffel. As a result, it suffices to demand that there is no loop in the anchor chain.


2.8 Translating Cappuccino to Java

We have implemented anchored exception declarations as a variant of ClassicJava, called Cappuccino. We have done this by adding elements representing anchored exception declarations to Jnome [vD06], our metamodel for Java, along with the algorithms necessary for validation. The extended metamodel reads ClassicJava files containing anchored exception declarations and checks all the rules they must adhere to. The prototype compiler is available at http://www.jnome.org.

A translator is provided to transform Cappuccino programs into plain Java programs. It replaces anchored exception declarations by absolute exception declarations and, if necessary, adds dummy exception handlers for checked exceptions that cannot be signalled. This is done by performing the following steps for each method.

1. Transform anchored exception declarations into absolute exception declarations, which are calculated by the Υrec function.

2. Remove redundant exception types from the new exception clause. That way, we can add exception handlers in any order.

3. Generate a unique name for the parameter of the catch clauses.

4. Surround the body of the method with a try block.

5. Add catch clauses for Error and RuntimeException that propagate the exception.

6. For each checked exception declared by the new exception clause, calculate if it can be signalled by the method body by applying Υrec to the implementation exception clause. If that is the case, add a catch clause that propagates the exception.

7. Finally, if Throwable is not already propagated, add a catch clause for Throwable that raises an Error. For a correct program this code will never be executed. Raising an Error in this handler can reveal some version conflicts between two parts of generated code.
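Step 2 matters because Java rejects a catch clause whose exception type is already covered by an earlier catch clause of the same try statement. Once no declared type is a subtype of another, the handlers can be emitted in any order. A minimal sketch of this step is shown below, assuming the exception clause is available as a list of class objects; the class and method names are hypothetical and do not reflect the actual translator.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class RedundantTypes {
    // Removes every exception type that is a proper subtype of another listed
    // type, as well as duplicate entries, so that no remaining type subsumes another.
    static List<Class<? extends Throwable>> removeRedundant(
            List<Class<? extends Throwable>> clause) {
        List<Class<? extends Throwable>> result = new ArrayList<>();
        for (Class<? extends Throwable> candidate : clause) {
            boolean redundant = result.contains(candidate);
            for (Class<? extends Throwable> other : clause) {
                if (other != candidate && other.isAssignableFrom(candidate)) {
                    redundant = true;
                }
            }
            if (!redundant) result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        // FileNotFoundException is subsumed by IOException and is dropped.
        System.out.println(removeRedundant(List.of(
            FileNotFoundException.class, IOException.class, InterruptedException.class)));
    }
}
```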

Note that translation to Java is not ideal in terms of performance. By adding the exception handlers, every signalled exception will be caught and re-raised for every stack frame until the relevant handler is encountered. To restore the efficiency, the Java compiler and virtual machine must be adapted such that they do not require the dummy exception handlers anymore.

Figure 2.17 contains the generated code for the client method of Figure 2.6.


    public Result client(MyStrategy myStrategy)
            throws mypackage.MyException {
        try {
            ...
            ...
            template(myStrategy);
            ...
        } catch (java.lang.RuntimeException Z) {
            throw Z;
        } catch (java.lang.Error Z) {
            throw Z;
        } catch (mypackage.MyException Z) {
            throw Z;
        } catch (java.lang.Throwable Z) {
            throw new java.lang.Error();
        }
    }

Figure 2.17: Generated Java code.

2.9 Related Work

The Java programming language offers a compromise between robustness and flexibility by providing both checked and unchecked exceptions [GRRX01]. But as shown in Section 2.2, that is not sufficient.

Mikhailova and Romanovsky [MR01] provide support for evolution of the exceptional behavior of a method by introducing a rescue clause. A rescue clause is a default exception handler that allows a method to have an exception clause that does not conform to the methods it overrides. If a client of that method provides a handler for the new exception, that handler is used, otherwise the rescue clause handles the exception. This mechanism only provides a solution when a useful default handler can be provided, which usually is not the case. Anchored exception declarations are complementary to the rescue clause. The rescue clause allows a programmer to signal new exceptions for which a default handler can be provided, while anchored exception declarations can be used when such a handler cannot be provided.

Romanovsky and Sanden [RS01] show that an exception handling mechanism should correspond to the features of the language. We have shown that the exception clause of a method in object-oriented programming languages is not as expressive as the implementation of a method with respect to delegation. By removing this difference, many problems with checked exceptions are solved.

Miller and Tripathi [MT97] analyze the conflicts between exception handling and object-oriented programming. Our work is related to these conflicts in several ways. By bringing context information into the exception clause, anchored exception declarations reduce – but do not eliminate – the conflict between exception conformance and complete exception specification. Specific information about the exceptional behavior of an overriding method can still be used when the interface of the overridden method has a general exception for conformance reasons. The authors also argue that exception handling increases coupling in object-oriented programs. Anchored exception declarations do not increase coupling when used properly, and they decrease coupling with respect to the adaptability of the program. Finally, Miller and Tripathi discuss the need for evolution of the exceptional behavior of a method. They briefly suggest that a language should allow exception non-conformance and the ability to add exception handlers to existing code.

Lippert and Lopes [LL99] simplify exception handling by using aspect-oriented programming. Their approach focuses on removing redundant exception handlers, and can be used for adding the dummy exception handlers and propagating exceptions. Using aspect-oriented programming can be very useful when the exception handlers are meaningful, but for checked exceptions it does not solve the adaptability problem and the program still suffers from hazardous situations under evolution. Anchored exception declarations solve these aspects of exception handling in a better way.

Specification of the dependencies between methods has been presented by Helm et al. [HHG90], by Lamping [Lam93], and by Steyaert et al. [SLMD96]. They present their work using the normal behavior of a method, but their techniques also apply to the exceptional behavior of a method. Anchored exception declarations provide these dependencies for the exceptional behavior in a way that is verifiable by a compiler.

There has been a lot of work on the analysis of the exception flow in programs. We will first describe the different approaches, and then compare them to our work. Every approach uses a different mechanism to track the dependencies between methods or functions, and automatically infers the exception clause of every method.

In [CJYC01, FA97a, FA97b, GS94, JCYC04, PL99, Yi94, YR97], the type system of the underlying programming language is augmented with information about the exceptions that can be signalled by a particular language construct. To track the dependencies between methods or functions, they insert type variables of the invoked method or function into the exceptional type of the invoking method or function. In [GSSS02], the type system is augmented with boolean constraints in order to track these dependencies.


In [SH98, AH03, CGHS99], the analysis is done using a control flow graph. The normal and exceptional program flow of each method is encoded in a separate graph. To analyze interprocedural exception flow, edges are created between the involved graphs. Exceptional exit nodes of the invoked method are connected to nodes representing exception handlers or exit nodes of the invoking method. These nodes represent the dependencies between methods.

Robillard and Murphy [RM03] developed a language-independent model for analyzing the exception flow in object-oriented programs, along with a tool specifically for Java. Their analysis is similar to that of Schaefer and Bundy [SB93]. They use functions instead of type variables to incorporate the exceptional behavior of other methods. They also discuss the cost of modifying the exception clause of a method, and the use of unchecked exceptions as a result. In [RM00], they show that the difficulty in determining all exceptional conditions in advance gives rise to the need for evolution of the exceptional behavior of a method.

The expressiveness of the inferred exception clauses of the above approaches is equal to that of ours – disregarding minor details. In the actual analysis, there will be a significant difference. In our approach, the exception clause is an explicitly written upper bound for the exceptional behavior of a method, causing a loss of precision. The automatically inferred exception clause of the other approaches will be as tight as possible. We believe, however, that our approach is better in an object-oriented setting.

The automatically inferred exception clause of a method m is the union of the exception clauses of all overriding methods and the exceptional behavior of its own implementation, if any. This means that the exception clause of m is based on the specific implementations of that method instead of a general statement about method m. As a result, every newly added method overriding m must conform to the exceptional behavior of the methods that were already present. Otherwise, the entire program must be analyzed again because the analysis was based on the inferred behavior for m, and the program may need to be modified. This is a typical consequence of exposing too much information – in this case, every method invocation in the method body that can result in a checked exception. As such, the higher a method is in the class hierarchy, the more unstable it becomes. In order to keep a program extensible, a sensible upper bound must be chosen for the exceptional behavior of a method, not the tightest upper bound for a given code base.
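The instability of an inferred exception clause can be made concrete with a small sketch. It assumes, for illustration only, that the exceptional behavior of each implementation is given as a set of exception classes; the names used here are hypothetical and do not correspond to any of the cited analyses.

```java
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.TimeoutException;

final class InferredClauseSketch {
    // Inferred clause of m: the union of the checked exceptions signalled by
    // the implementation of m itself and by every method overriding m.
    static Set<Class<? extends Exception>> inferredClause(
            List<Set<Class<? extends Exception>>> implementations) {
        Set<Class<? extends Exception>> union = new LinkedHashSet<>();
        for (Set<Class<? extends Exception>> signalled : implementations) {
            union.addAll(signalled);
        }
        return union;
    }

    public static void main(String[] args) {
        Set<Class<? extends Exception>> base = Set.of(IOException.class);
        Set<Class<? extends Exception>> newOverride = Set.of(TimeoutException.class);
        // Adding an override that signals a new exception changes the clause
        // inferred for m, and thereby affects every client of m.
        System.out.println(inferredClause(List.of(base)));
        System.out.println(inferredClause(List.of(base, newOverride)));
    }
}
```

With an explicitly written exception clause, the second call would instead be checked against a fixed upper bound, and only the new override would be rejected.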

An analysis based on automatically inferred exception clauses remains useful for object-oriented programming languages. It can be used to obtain a more precise analysis of the exception flow of a particular program, and thus exclude some exceptions that our approach cannot exclude due to exception clauses that are too loose for that particular program.

2.10 Conclusion

We have shown that problems with checked exceptions, such as reduced adaptability and loss of context information, are caused by the lack of expressiveness of the exceptional return type of a method. By introducing anchored exception declarations, we have paved the road for a broader acceptance of checked exceptions. They bring the benefits of unchecked exceptions to the exception clause by allowing the exceptional behavior of a method to be declared relative to other methods. This results in better adaptability of software and more elegant code, and eliminates most of the dangerous exception handlers.

We have defined the formal semantics of anchored exception declarations, and the rules they must adhere to in order to ensure compile-time safety, and we have proved the soundness. We have shown that anchored exception declarations do not violate the principle of information hiding when used properly, and have presented a guideline for when to use them, and when not to use them. In addition, we have defined criteria to determine which modifications caused by the evolution of the exceptional behavior of a method are good and which modifications are gratuitous.

Finally, we have implemented anchored exception declarations in Cappuccino, an extension of ClassicJava. A translator validates Cappuccino programs and transforms them into Java programs.

Chapter 3

Composition of Abstract Data Types

Rosanoff:

Mr. Edison, please tell me what laboratory rules you want me to observe.

Edison:

Hell, there ain’t no rules around here! We’re trying to accomplish something!

Martin Andre Rosanoff “Edison in His Laboratory”

3.1 Introduction

Although increasing the reusability of software is one of the main goals of object-oriented software development, an important group of software elements still cannot be reused in a practical manner. These elements are implemented over and over again, resulting in massive code duplication and all its related problems.

A class often consists of application specific functionality written on top of general purpose functionality. This can be simple functionality like associations, values lying within bounds, lockable values, and infrastructure for event listeners, but also more complex functionality such as graph structure and arbitrary collaborations. Most of these high-level concepts are easy to use during the design phase. But during the implementation phase, these concepts are transformed into low-level code because current reuse mechanisms cannot cope with such reuse in a convenient manner.
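The following sketch shows what this transformation into low-level code typically looks like in plain Java for a value lying within bounds. It is only an illustration of reuse by delegation; the BoundedValue and Account classes and all their members are hypothetical names chosen for this example.

```java
final class BoundedValueSketch {
    // General purpose building block: a value that must lie within fixed bounds.
    static final class BoundedValue {
        private final long lower;
        private final long upper;
        private long value;

        BoundedValue(long lower, long upper, long value) {
            if (lower > upper) throw new IllegalArgumentException("lower > upper");
            this.lower = lower;
            this.upper = upper;
            setValue(value);
        }

        long getValue() { return value; }

        void setValue(long newValue) {
            if (newValue < lower || newValue > upper) {
                throw new IllegalArgumentException("value out of bounds: " + newValue);
            }
            this.value = newValue;
        }
    }

    // Hypothetical client class: reuse by delegation requires one hand-written
    // forwarding method, under an application specific name, per reused method.
    static final class Account {
        private final BoundedValue balance;

        Account(long creditLimit, long upperBound) {
            this.balance = new BoundedValue(creditLimit, upperBound, 0);
        }

        long getBalance() { return balance.getValue(); }

        void setBalance(long amount) { balance.setValue(amount); }
    }

    public static void main(String[] args) {
        Account account = new Account(-1000, 1_000_000);
        account.setBalance(250);
        System.out.println(account.getBalance());
    }
}
```

Every additional method of the building block, and every additional building block in the class, adds another set of forwarding methods that only rename and delegate.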


Most reuse mechanisms [BC90, Cha04, Cha06, CMM06, oTC05, RT06, SDNB03, Str91] differ little from a regular inheritance relation with subtyping and code inheritance. But the requirements for building a class from components differ in important areas from those for creating a subtype. Reusing a class as a building block for another class requires activities such as removing unwanted methods, wiring method dependencies, and especially renaming methods. But for creating subtypes, the first activity is forbidden, the second one is not required, and the third one is required only infrequently. In addition, methods of different building blocks are usually separated even if they have the same definition, while they are usually merged in case of a multiple/repeated subtyping relation.

Reuse mechanisms that focus on composition [MO02, SC00] create only shallow compositions; the composition is just the sum of the parts. But a class is more than the sum of its components; it adds application specific code and gives the components an application specific meaning; it creates an abstract data type.

In [OZ05], Odersky and Zenger identify three abstractions for removing hard references from components to increase their reusability: abstract type members, selftype annotations, and modular mixin composition. Abstract type members and selftypes specify the required services of a component, and mixins perform the composition. But while these abstractions are scalable with respect to the size of the components, they are not scalable in the way components are used. The problem is that both selftypes, and mixins as used in Scala, prohibit any composition involving multiple components of the same kind, or components containing features with the same name. So although the authors claim that these abstractions can lift an arbitrary assembly of static program parts to a component system, they already fail for the application in Figure 3.1, which is little more than an assembly of four kinds of static program parts.

In this chapter, we present an inheritance mechanism with two relations. The subtyping relation is used for traditional subtyping inheritance. The component relation allows general purpose characteristics to be encapsulated in classes and be reused conveniently as configurable building blocks for other classes. We analyze the requirements necessary to realize this kind of reuse, and then introduce the required new features. We introduce renaming parameters for mass renaming, and make the inheritance relation first-class for accessing hidden functionality, treating components as separate objects, and resolving method dependencies using high-level component connections. We evaluate the mechanism in a case study, where it is compared to existing approaches. We also created a formal type system and proved the type soundness of the mechanism.

Overview

In Section 3.2, we analyze the requirements for the reuse mechanism, and discuss existing mechanisms. In Section 3.3, we present the component relation, which is used for code inheritance. In Section 3.4, we present the impact on the subtyping relation. We evaluate the inheritance mechanism in Section 3.5 with an example and a case study. We present a part of the formal model, and the proof of type soundness in Section 3.6. We discuss related work in Section 3.7, and conclude in Section 2.10.

Figure 3.1: High-level design of an application.

3.2 Requirements Analysis

In this section, we analyze which features are required in order to conveniently reuse general purpose classes as building blocks for other classes. We use a simple banking application to illustrate the requirements.

We illustrate the features of the inheritance mechanism mostly with components for modeling associations, which use a simple protocol to keep the association consistent. The proposed inheritance mechanism, however, can reuse general abstract data types – which can use arbitrarily complex protocols – as components.

Figure 3.1 illustrates the banking application. It contains classes for persons, bank accounts, and bank cards. The rectangles inside a class represent its characteristics. For example, an account has a balance, which is a number that lies between the credit limit and an upper bound. In addition, it has a unidirectional association with its account number, and a bidirectional association with its owner. A person has a unidirectional association with his or her name, and bidirectional associations with his or her parents, children, and bank accounts. The associations for the parents and children form a graph offering different traversal strategies. There are dependencies between some characteristics, which are represented by the dashed arrows. For example, the owner characteristic of an account must know the method names of the accounts characteristic of a person in order to keep the association consistent.

    class BankAccount {
        public BankAccount(int number) {
            this.creditLimit = -1000;
            this.upperLimit = 1000000;
            this.accountNumber = number;
        }

        // owner characteristic: bidirectional association with Person
        private Person owner;

        public Person getOwner() {
            return owner;
        }

        public void setOwner(Person owner) {
            if (this.owner != owner) {
                registerOwner(owner);
                if (owner != null) {
                    owner.registerAccount(this);
                }
            }
        }

        protected void registerOwner(Person owner) {
            if (this.owner != null) {
                this.owner.unregisterAccount();
            }
            this.owner = owner;
        }

        protected void unregisterOwner() {
            owner = null;
        }

        // accountNumber characteristic: unidirectional association
        private final int accountNumber;

        public int getAccountNumber() {
            return accountNumber;
        }

        // balance characteristic: value bounded by creditLimit and upperLimit
        private long balance;
        private long upperLimit;
        private long creditLimit;

        public long getBalance() {
            return balance;
        }

        public void deposit(long amt) {
            if ((amt > 0) && (balance <= Long.MAX_VALUE - amt)
                    && (balance + amt <= upperLimit)) {
                balance += amt;
            }
        }

        public void withdraw(long amt) {
            if ((amt > 0) && (balance >= Long.MIN_VALUE + amt)
                    && (balance - amt >= creditLimit)) {
                balance -= amt;
            }
        }

        public long getUpperLimit() {
            return upperLimit;
        }

        public long getCreditLimit() {
            return creditLimit;
        }
    }

Figure 3.2: The Java version of BankAccount.

Figure 3.2 shows a basic Java implementation of class BankAccount. The code for the owner and accountNumber characteristics is shown first, followed by the code for the balance characteristic. Because of space constraints only the basic methods are shown. More advanced functionality, such as sending events or validity checks on the owner of an account, is not shown.

The problem with the implementation is that the class is entirely composed of functionality that has already been implemented thousands of times before. Bidirectional associations and values bounded by lower and upper limits are common characteristics. And although the exact names of the methods and the used types may differ, the behavior is always the same.

3.2.1 Requirements

We have created a list of requirements for a reuse mechanism in order to support composition of abstract data types. The requirements are the result of our experience with different software projects. We illustrate the requirements using the example from Figures 3.1 and 3.2, which is created such that it needs most of the requirements.

The goal is to construct a reuse mechanism that allows characteristics to be encapsulated and reused. The reusable entity is called a component for the remainder of this chapter. The mechanism must minimize the effort required to reuse a component, and maximize the reusability of its features.

Some features are split up in parts to simplify the comparison between existing reuse mechanisms in Section 3.2.2. We omit trivial features and features supported by all mechanisms. Note that many of these requirements have been introduced by others, as illustrated by the support matrix in Section 3.2.2. We have grouped the features according to their purpose to increase the readability. Some features could be placed in more than one group.

Mandatory Features

The following features are mandatory for building the foundation of a class from components. Without them, the reuse of components is impossible.

1. Composition of Abstract Data Types: The reuse mechanism must allow the components to contribute to the abstract data type of the composition. Otherwise, it can form only shallow compositions which are nothing more than the sum of the parts. In the example, a Person would just be a combination of four associations and a graph. As a result, clients of a Person would have to invoke methods by selecting one of the components and invoking a method with a general name instead of directly invoking a method with an appropriate name. Things get even worse with nested compositions. If the bidirectional association reuses a unidirectional association for its to reference, accessing the other end of the association involves two component selections and a method invocation. This requires more work, and different methods of a single composition must be accessed at different levels in the composition hierarchy.

2. Multiple Reuse: A class must be able to reuse code from more than one component. For example, class BankAccount has three general characteristics, and must be able to reuse them.

3. Repeated Reuse: Because a class can reuse multiple components of the same kind, it must be able to reuse a component more than once. For example, class Person has three bidirectional associations.

4. Renaming: Renaming is required to solve name conflicts caused by repeated reuse, give the reused methods a meaningful name in the context of the reusing class, and merge features. Name conflicts will occur because components can be reused more than once by a single class. For example, the method names of the three associations of Person will clash. In addition, the general purpose names of components are rarely appropriate for the reusing class. For example, getOtherEnd is an inappropriate name for the method to obtain the owner of an account.

5. Parametric Polymorphism: Parametric polymorphism is required to customize the types of the return types, fields, and formal parameters in the component. For example, the type of the owner of a bank account must be Person. This customization cannot be done safely¹ in a modular way without parametric polymorphism.
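Requirements 1 and 4 can be made concrete with a small plain-Java sketch. It contrasts a shallow composition, where clients must select a component and call a general-purpose name, with a composition that contributes to the abstract data type under an appropriate name. Apart from getOtherEnd and getOwner, which are mentioned above, all names (UniAssociation, ShallowAccount, AdtAccount, setOtherEnd, CompositionDemo) are hypothetical and chosen only for illustration.

```java
// Hypothetical general-purpose component: a reference to "the other end".
class UniAssociation<T> {
    private T otherEnd;

    T getOtherEnd() { return otherEnd; }

    void setOtherEnd(T value) { otherEnd = value; }
}

class Person {
    private final String name;

    Person(String name) { this.name = name; }

    String getName() { return name; }
}

// Shallow composition: the account is only the sum of its parts, so
// clients select the component and use its general-purpose method names.
class ShallowAccount {
    final UniAssociation<Person> owner = new UniAssociation<>();
}

// The component contributes to the abstract data type of the account
// under names that are meaningful for accounts. In plain Java this takes
// one hand-written forwarding method per reused method.
class AdtAccount {
    private final UniAssociation<Person> owner = new UniAssociation<>();

    Person getOwner() { return owner.getOtherEnd(); }

    void setOwner(Person person) { owner.setOtherEnd(person); }
}

class CompositionDemo {
    public static void main(String[] args) {
        Person ann = new Person("Ann");

        ShallowAccount shallow = new ShallowAccount();
        shallow.owner.setOtherEnd(ann);          // component selection + general name
        System.out.println(shallow.owner.getOtherEnd().getName());

        AdtAccount account = new AdtAccount();
        account.setOwner(ann);                   // direct invocation, appropriate name
        System.out.println(account.getOwner().getName());
    }
}
```

The forwarding methods in AdtAccount are exactly the kind of per-method work that the expressivity features below aim to remove.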

Expressivity Features

These features reduce the amount of work needed to reuse a component. Without them, most of the savings achieved by reuse is nullified, as shown in the case study in Section 3.5.3. The impact of a feature is shown using big O notation as activity: $O_{\text{without}} \to O_{\text{with}}$. It shows the amount of work required for an activity without and with that feature when reusing a component. Note that the activities are not independent of each other. $M$ is the number of methods in the component, $F$ the number of fields. $M_s$ and $F_s$ are the numbers of methods and fields exported in the interface of the reusing class, $M_{ns}$ and $F_{ns}$ the numbers of non-exported methods and fields. $D_M$ is the number of method dependencies of the component. The required work of some features is explained further on in this chapter, and is denoted with ‘. . . ’ for now. Note that $F_s + F_{ns} = F$, $M_s + M_{ns} = M$, and usually $F_s \le F \ll M_s < M$.

¹ In our view, safe also excludes type casts.

6. State Reuse: Declaring fields: $O(F) \to O(1)$. Reusing the state of a component prevents a lot of duplication. For example, the state of an association is almost always a simple reference. It makes no sense to force a developer to separately provide that state every time he uses an association component.

7. Interface Reuse: Constructing interface: $O(M_s + F_s) \to O(1)$. Similarly, reusing the interface of a component prevents duplication of its signatures. Aside from the exact method names and types, which can be configured using renaming and type parameters, the signatures of the reusing class are the same as those of the component interface.

8. Selective Interface Reuse: Resolving conflicts: $O(M + F) \to O(M_s + F_s)$. A developer will usually expose only a part of the component interface based on the intended use of the reusing class. Exposing its entire interface makes the reusing class harder to understand if the component has a lot of functionality. In addition, it can cause a large amount of name conflicts that must be solved, even if the involved methods and fields are not relevant in the context of the reusing class.

9. Powerful Selection: Selecting exported methods/fields: $O(M_s + F_s) \to O(\ldots)$. Being able to select which methods and fields are exported in the interface of the reusing class is not enough. If hiding or selecting is done individually for each method, it requires too much work.

10. Default Separation: Separating components: $O(M_{ns} + F_{ns}) \to O(1)$. By default, components – and thus their methods and instance variables – must be separated, since that is how they are typically used. For example, the methods and fields of the association components of Person must be kept separate. Separating all methods manually is error-prone and requires separation of non-selected methods.

11. Mass Renaming: Renaming: $O(M_s + F_s) \to O(\ldots)$. Many components have patterns in the names of their methods. For example, the methods for associations are typically named getX, setX, isValidX, containsX, . . . . If such a pattern can be exploited, all of its occurrences can be replaced using a single declaration.

12. High-level Dependencies: Resolving method dependencies: $O(D_M) \to O(\ldots)$. Some components depend on methods of other components. For example, the owner component of BankAccount and the accounts component of Person need each other’s getX, setX, isValidX, . . . methods to keep the association in a consistent state, but they do not know the final names of these methods. Resolving these dependencies individually is tedious and error-prone. In addition, if additional dependencies are added between two components, all classes that reuse them must add additional wiring code. These problems are solved if such dependencies can be resolved at a higher level of abstraction. By directly connecting components to each other, all dependencies between them are resolved at once, and additional dependencies require no additional wiring code. It is also necessary to be able to connect a component to several other components, which may have a common type. In the example, the graph component requires two association components for the incoming and outgoing edges. In the formula, $D_M$ is the number of method dependencies of the reused component.
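The idea behind requirement 12 can be approximated in plain Java by connecting two association ends through a single reference to the opposite component instead of wiring each method individually. The sketch below is only an analogy under stated assumptions: the interface End, the classes NSide and OneSide, and their method names are hypothetical and not part of the mechanism described in this chapter.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical contract that one end of a bidirectional association
// offers to the opposite end. Adding a method here requires no change
// in the classes that connect two ends.
interface End<T> {
    void register(T other);
    void unregister(T other);
}

// N side of the association: keeps a set of references.
class NSide<OTHER> implements End<OTHER> {
    private final Set<OTHER> others = new LinkedHashSet<>();

    public void register(OTHER other) { others.add(other); }

    public void unregister(OTHER other) { others.remove(other); }

    boolean contains(OTHER other) { return others.contains(other); }
}

// 1 side of the association: keeps one reference and keeps the opposite
// end consistent. The single function 'opposite' is the only wiring.
class OneSide<SELF, OTHER> {
    private final SELF self;
    private final Function<OTHER, End<SELF>> opposite;
    private OTHER other;

    OneSide(SELF self, Function<OTHER, End<SELF>> opposite) {
        this.self = self;
        this.opposite = opposite;
    }

    OTHER get() { return other; }

    void set(OTHER newOther) {
        if (other == newOther) {
            return;
        }
        if (other != null) {
            opposite.apply(other).unregister(self);
        }
        other = newOther;
        if (newOther != null) {
            opposite.apply(newOther).register(self);
        }
    }
}

class Person {
    final NSide<BankAccount> accounts = new NSide<>();
}

class BankAccount {
    // One connection resolves all dependencies between the two ends.
    final OneSide<BankAccount, Person> owner =
            new OneSide<BankAccount, Person>(this, p -> p.accounts);
}
```

Here the lambda p -> p.accounts plays the role of a single component connection: BankAccount never names register or unregister itself.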

Completeness Features

The following features increase the amount of functionality of a component that can be reused.

13. Reuse of Hidden Functionality: Methods that are not exposed in the interface of the reusing class to prevent conflicts and interface bloat, may still be valuable to clients. They should still be reusable, unless the developer explicitly forbids clients to access them. Examples are advanced iteration methods for associations.

14. Reuse of Component Type: If an object cannot somehow be used as if it were of the type of one of its components, certain methods cannot be reused. For example, class BoundedValue has a method to transfer the remaining value to another BoundedValue. If that method cannot be used to transfer the remaining money from one bank account to another, it must be duplicated, even though all necessary methods and fields are available in BankAccount. But if the bounded value component of BankAccount can be used as a real BoundedValue, the transfer method can be reused.
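The transfer scenario of requirement 14 can be sketched in plain Java. The class below is a hypothetical stand-in for the BoundedValue component; its method names (increase, decrease, transferRemainingTo) and the accessor balanceComponent are assumptions made for this illustration only.

```java
// Hypothetical stand-in for the BoundedValue component.
class BoundedValue {
    private long value;                 // starts at 0
    private final long lowerLimit;
    private final long upperLimit;

    BoundedValue(long lowerLimit, long upperLimit) {
        if (lowerLimit > 0 || upperLimit < 0) {
            throw new IllegalArgumentException("limits must include 0");
        }
        this.lowerLimit = lowerLimit;
        this.upperLimit = upperLimit;
    }

    long getValue() { return value; }

    // Adds amt if it is positive and the result stays below the upper limit.
    boolean increase(long amt) {
        if (amt > 0 && value <= Long.MAX_VALUE - amt && value + amt <= upperLimit) {
            value += amt;
            return true;
        }
        return false;
    }

    // Subtracts amt if it is positive and the result stays above the lower limit.
    boolean decrease(long amt) {
        if (amt > 0 && value >= Long.MIN_VALUE + amt && value - amt >= lowerLimit) {
            value -= amt;
            return true;
        }
        return false;
    }

    // Moves the remaining positive value to another BoundedValue if it fits.
    boolean transferRemainingTo(BoundedValue target) {
        long remaining = value;
        if (target != this && remaining > 0 && target.increase(remaining)) {
            value = 0;
            return true;
        }
        return false;
    }
}

class BankAccount {
    private final BoundedValue balance = new BoundedValue(-1000, 1000000);

    long getBalance() { return balance.getValue(); }

    boolean deposit(long amt) { return balance.increase(amt); }

    // With pure delegation the transfer logic would have to be duplicated
    // here. Exposing the component as a real BoundedValue lets clients
    // reuse the existing method instead.
    BoundedValue balanceComponent() { return balance; }
}
```

A client could then write from.balanceComponent().transferRemainingTo(to.balanceComponent()) without BankAccount redefining the transfer.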

Methodological Features

The following features increase understandability.

15. Reuse Without Subtyping: Mandatory subtyping causes confusion in case of repeated reuse, and it does not make sense for most components. For example, class BankAccount is not a bidirectional or unidirectional association, nor a bounded value. Similarly, class Person is not three times a bidirectional association.

16. No Surprises: The mechanism must never automatically resolve a name conflict unless one of the candidates overrides all others. Otherwise, methods are overridden based only on the form of their signature, causing unexpected behavior at run-time [Sny86]. A good reuse mechanism exposes such errors, instead of hiding them.
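The kind of accident that requirement 16 rules out can be reproduced in plain Java, where a method overrides another purely because the signatures match. The classes Gauge and LoggingGauge below are hypothetical and serve only as an illustration.

```java
// Hypothetical library class. A later version adds reset(), and
// start() relies on it to clear the tick count.
class Gauge {
    protected int ticks = 5;

    void reset() { ticks = 0; }

    int start() {
        reset();            // expects ticks to be 0 afterwards
        return ticks;
    }
}

// Client class written before Gauge had a reset() method. Its unrelated
// reset() now overrides Gauge.reset() only because the signatures match.
class LoggingGauge extends Gauge {
    int resetRequests = 0;

    void reset() { resetRequests++; }   // does not clear ticks
}
```

Calling start() on a LoggingGauge silently runs the client's reset() instead of the library's, so the tick count is never cleared. The compiler accepts both classes without a warning, which is the hidden error that a good reuse mechanism should expose.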

Applicability Features

The last set of features concerns the applicability of the reuse mechanism. They allow the reuse of a component even if it was not anticipated.

17. No Separate Concept: If a developer needs to reuse a certain class as a component, he must be allowed to do so, even if the original developer did not anticipate such reuse. In addition, it must be possible to instantiate non-abstract components. For example, there is no reason to complicate the creation of an object that represents a bounded value. If components and classes are the same, they are not limited to a single kind of reuse.

18. Default Dynamic Binding: If dynamic binding is not the default policy, programmers must explicitly annotate many methods to make a class extensible. Otherwise, it may be impossible to extend and thus reuse the class.

19. Override State: If the state of a component is not appropriate for the reusing class, e.g. because it can be computed, it must be possible to override the state. Otherwise, that class cannot reuse the component.

20. Merge State: The state of components can overlap in the context of the reusing class. But if the overlapping parts cannot be merged, the components cannot be reused. For example, if a class has two values lying within the same limits, and there is no specific component offering such behavior, it must be possible to use two BoundedValue components and merge their upper and lower limits.

3.2.2 Existing Reuse Mechanisms

Figures 3.3 and 3.4 show the features that are supported by different reuse mechanisms. For languages with a separate code inheritance relation, we used that relation in the table. For the other languages, the standard inheritance relation is used. The mechanisms are discussed in more detail in the related work in Section 3.7. Note that not all languages discussed in the related work section are present in the table because the documentation was not always complete. In addition, all languages with a linearizing inheritance mechanism are presented under the name ‘Mixins’.

The hollow circle in the column for composition of abstract data types means that CaesarJ can support the construction of abstract data types in the same way as Scala, but that goes against the standard practice in CaesarJ. In the standard approach of CaesarJ, the components do not contribute to the abstract data type of the composition, which is often empty.

Figure 3.3: Feature matrix for different code reuse mechanisms. Rows: Delegation, Eiffel, Reppy Traits, Traits, Cecil, C++, Diesel, Mixins, Scala, CaesarJ, Java, C#. Columns: features 1 to 14, grouped as Required, Expressivity, and Completeness. [Matrix entries not reproduced.]

The column of default dynamic binding for the delegation technique contains a question mark because it depends on the programming language used.

For delegation, the major problem is that the interface of a component cannot be reused. Every method must be redefined in the reusing class to invoke the corresponding method on the delegatee. The case study shows that this is a big disadvantage. Whether or not state can be overridden or merged depends on the programming language.

The inheritance techniques – with or without subtyping – have poor support for the required features, and very poor support for the expressivity and completeness features. Only two mechanisms support the minimal requirements, and certain important expressivity and completeness features are either not supported at all, or only in a limited way. In the columns for features that save a lot of work, there is a large gap.


Figure 3.4: Feature matrix for different code reuse mechanisms, continued. Rows: Delegation, Eiffel, Reppy Traits, Traits, Cecil, C++, Diesel, Mixins, Scala, CaesarJ, Java, C#. Columns: features 15 to 20, grouped as Methodology and Applicability. [Matrix entries not reproduced.]

Our inheritance mechanism supports all the features, and makes the implementation of the entire application as big as the traditional implementation of BankAccount.


    ComponentClause:
        AccessMod? component Type Config?

    Config:
        Name? CompParams? ConfigBlock?

    Name:
        AccessMod? Identifier

    CompParams:
        “(” Identifier (, Identifier)* “)”

    ConfigBlock:
        “[” ConfigClause (, ConfigClause)* “]”

    ConfigClause:
        Identifier = Identifier?
        override “{” IdentifierList “}”
        undefine “{” IdentifierList “}”
        export AccessMod “{” IdentifierList “}”
        direct “{” IdentifierList “}”
        indirect “{” IdentifierList “}”

Figure 3.5: Grammar for component relations.

3.3 The Component Relation

The component relation is a code inheritance relation for easily reusing existing components in a new class. To simplify the customization of general components for use in a class, the relation offers a number of new features which are explained further on in this section. We introduce renaming parameters for mass renaming the methods of the component. We then turn the component relation into a first-class relation. The relation can be given a name, which can be used to access non-selected functionality, use components as separate objects, and resolve dependencies on a high level. These features allow programmers to work easily with components on a high level of abstraction instead of implementing them with low-level code.

Using the component relation, the banking application of Figure 3.1 can be implemented by using a component relation for each component. This is illustrated in Figure 3.6 for the class of bank accounts. The component relations state that the class of bank accounts has a component named owner that behaves like a bidirectional association with multiplicity 1, a component named balance that behaves like a bounded value, and a component named accountNumber that behaves like a unidirectional association. The assignments are used for renaming, and in this case rename many methods at once by assigning values to renaming parameters. The owner component is connected to the component at the other end of the bidirectional association by passing the name of the other component (accounts) to the relation. Finally, the setter method for the account number is made private.

3.3.1 Syntax

Figure 3.5 shows the syntax of the component relation. It consists of the keyword component followed by the name of the inherited class, including any type parameters. There can optionally be a name, component parameters, and a configuration block. The access modifier of the relation determines if the type of the component is visible to the client, which provides valuable information about its behavior.


    component BidiAssociation-1-Side<Account,Person> owner (accounts) [X=Owner]
    component BoundedValue<long> balance
        [X=Balance,LOW=LowerLimit,HI=UpperLimit];
    component UniAssociation<int> accountNumber
        [X=AccountNumber, export private {setAccountNumber}]

Figure 3.6: The component relations of BankAccount.

The access modifier of the name determines whether a client can use the name of a visible component relation to access it as a separate object or to resolve dependencies. The configuration block is similar to that of Eiffel. The assignment is used for renaming, which is further explained in Section 3.3.3, override is used if a feature² is overridden, undefine is used to undefine a feature in case features are merged, and export is used for changing the visibility of a feature. The inheritance name, component parameters, and direct and indirect clauses are discussed in Section 3.3.4.

3.3.2 General Semantics

The component relation is essentially a code inheritance relation. If A has a component relation with B, A inherits the features of B, but not its type. For example, the class of bank accounts inherits all features of a bounded value, but a bank account is not a bounded value.

Due to the lack of subtyping, it is allowed for a class to have a component relation with a final class T. After all, the same can be achieved by manually delegating every method to an object of type T. The features of T, however, cannot be overridden in the inheriting class. You can reuse, but not override, the code, so the essence of the final modifier is respected.

Despite the absence of subtyping, however, both methods and instance variables³ must conform to all features they override, because the methods of the inherited class expect them to behave according to their original signatures and contracts.

If a feature is inherited via different inheritance paths, a choice must be made to decide if the feature is inherited once, or multiple times. The default policy for features inherited via a component relation is duplication because, generally, the components do not overlap. This means that if a feature is inherited via a component relation and again via another inheritance relation, there is a conflict, even if the definitions are the same. This conflict must be resolved explicitly, e.g. via merging or renaming. As a result, features inherited via a component relation do not participate in the rule-of-dominance used for subtyping in Section 3.4.2. In Section 3.4.2, we discuss why duplication is forbidden for subtyping. To avoid an explosion of the number of renaming clauses, we introduce renaming parameters in Section 3.3.3 and indirect inheritance in Section 3.3.4.

² The features of a class are its instance variables and methods.
³ Instance variables are properties that, unless declared final, can be overridden and merged.

As in Eiffel, binding of features in inherited methods is done within the inheritance relation through which they are inherited. This is required to allow separation of the components. For example, CheckingAccount inherits the getter method of BidiAssociation-1-Side twice: once for the association with the owner, and once for the association with the bank card. Both getters must of course use the instance variable of their own component.

3.3.3 Renaming Parameters

Without intervention, using duplication as the default for the component relation forces a developer to explicitly rename almost every method of the component. The case study in Section 3.5.3 shows that renaming is a significant problem. We introduce a lightweight macro system to minimize the effort of renaming features.

The names in the features of characteristics often exhibit patterns. For example, the names of the methods of the N side of an association are getX, addX, removeX, replaceX, containsX, . . . . To keep these patterns from getting lost in the implementation, we introduce renaming parameters. They can be written in the names of non-private features, and allow an inheriting class to rename a number of features with a single renaming declaration.

A renaming parameter is a parameter of a class and is written between square brackets. It can be given a default value; otherwise its name serves as the default value. The parameter can be used in feature names by writing its name between % characters. An inheriting class can assign a value to the parameter in the configuration block of the inheritance relation. The value of the parameter can be any string that is valid for all feature names containing the parameter – which are all visible to the inheriting class.

Figure 3.7 illustrates the use of renaming parameters. Parameter X is used as the name of the other end of the association and is initialized to the empty string. Parameter XS represents the plural of X and by default equals the value of X appended with an ‘s’. For the children component of Person both parameters are assigned because the default value of XS is not appropriate.
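The substitution performed by renaming parameters can be imitated with a small Java helper. This is a sketch of the macro idea only, not the implementation used for the mechanism; the class RenamingSketch and its methods expand and resolve are hypothetical, and it assumes that a default value may only refer to parameters declared before it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RenamingSketch {
    // A parameter occurrence is a name written between % characters.
    private static final Pattern PARAM = Pattern.compile("%(\\w+)%");

    // Replaces every %NAME% in text by the value of NAME in params.
    static String expand(String text, Map<String, String> params) {
        Matcher m = PARAM.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("unknown parameter " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Computes the value of each parameter: the assigned value if there
    // is one, otherwise the default. Defaults are given in declaration
    // order and may mention parameters declared earlier.
    static Map<String, String> resolve(Map<String, String> defaults,
                                       Map<String, String> assigned) {
        Map<String, String> values = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : defaults.entrySet()) {
            String raw = assigned.containsKey(e.getKey())
                    ? assigned.get(e.getKey())
                    : e.getValue();
            values.put(e.getKey(), expand(raw, values));
        }
        return values;
    }
}
```

With the defaults of Figure 3.7 (X empty, XS equal to %X%s), assigning only X=Account turns get%XS% into getAccounts and add%X% into addAccount. For the children component, assigning X=Child and XS=Children yields getChildren, where the default plural would have been wrong.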

We can now determine the amount of work required for renaming. $P_s$ is the number of renaming parameters in the selected features, $M_{s,np}$ the number of selected methods without renaming parameters, and $F_{s,np}$ the number of selected fields without renaming parameters. The impact of renaming parameters is $O(M_s + F_s) \to O(P_s + M_{s,np} + F_{s,np})$.

The name of the renaming parameter itself must be unique for that class but may be equal to a renaming parameter of a parent class. The name must also be different from the names of the methods in that class to avoid ambiguities in the configuration blocks of subclasses. In addition, no conflicts may occur when each renaming parameter is assigned its default value.

    class Association<FROM,TO> . . . [X] {
        boolean isValid%X%(TO x);
        boolean contains%X%(TO x);
        . . .
    }

    class BidiAssociation-N-Side<FROM,TO> . . . [X=,XS=%X%s]
        subtype Association<FROM,TO> [X=%X%] {
        Set<TO> get%XS%() { . . . }
        void add%X%(TO x) { . . . }
        void remove%X%(TO x) { . . . }
        void replace%X%(TO x, TO y) { . . . }
        . . .
    }

    class Person {
        component BidiAssociation-N-Side<Person, BankAccount> . . . [X=Account]
        component BidiAssociation-N-Side<Person, Person> . . . [X=Parent]
        component BidiAssociation-N-Side<Person, Person> . . . [X=Child,XS=Children]
        . . .
    }

Figure 3.7: Using renaming parameters.

A feature whose name contains a renaming parameter must be invoked using the most specific value of that parameter known at the site where it is accessed. In the body of the class that declares the parameter, this is the default value.

In order to propagate the value of a renaming parameter to superclasses, the parameter can be used in the right-hand side of the assignments of a configuration block. To avoid conflicts when a class uses a renaming parameter with the same name as a renaming parameter in the superclass, parameters in the left-hand side of the assignment are resolved in the superclass, while parameters in the right-hand side are resolved in the current class. For example, in Figure 3.7, class BidiAssociation-N-Side uses a renaming parameter X while the parent class has a renaming parameter with the same name. In the configuration block of BidiAssociation-N-Side, the X in the left-hand side of the assignment references the renaming parameter of the parent class Association, while the X in the right-hand side references the renaming parameter of the current class. As a result, renaming parameter X of Association is given the same value as renaming parameter X of BidiAssociation-N-Side.

Methods whose name contains a renaming parameter can be renamed both by assigning a value to a renaming parameter and by renaming the method directly. In order to avoid ambiguities, explicitly renaming an individual method has priority over the renaming that would be done by an assignment to a renaming parameter. The method must be renamed as if the renaming parameter had its default value.


    class BankAccount {
        component BidiAssociation-N-Side<BankAccount,BankCard> . . .
            [X=BankCard, addX=attachBankCard]
    }

Figure 3.8: Priority of renaming.

    class BankAccount {
        component BidiAssociation-1-Side<BankAccount,Person> owner . . .
        . . .
    }

    class Person {
        component BidiAssociation-N-Side<Person,BankAccount> accounts . . .
        . . .
    }

Figure 3.9: First-class component relations.

The order in which both declarations are written in the configuration block is irrelevant.

Figure 3.8 illustrates the priority during renaming. Class BankAccount now uses the renaming parameter X to rename all methods inherited from BidiAssociation-N-Side except for the addX method, which is individually renamed to attachBankCard.
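The priority rule can be sketched as a lookup that consults explicit renamings before applying the parameter assignment. The helper below is hypothetical and deliberately simplified: it handles a single parameter X and assumes, for illustration, that the default value of X is its own name, so that the method written as add%X% is addressed as addX in the configuration block, as in Figure 3.8.

```java
import java.util.Map;

class RenamePriority {
    // Returns the final name of a feature whose declared name is 'template'.
    // assignedX is the value given to X in the configuration block;
    // explicit maps default-value names to individually chosen names.
    static String finalName(String template, String assignedX,
                            Map<String, String> explicit) {
        // The name the feature has when X takes its assumed default value "X".
        String defaultName = template.replace("%X%", "X");
        String direct = explicit.get(defaultName);
        if (direct != null) {
            return direct;                       // explicit renaming has priority
        }
        return template.replace("%X%", assignedX);
    }
}
```

Because the explicit renamings live in a map that is consulted first, the order in which the two kinds of declarations are written does not influence the result.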

Renaming parameters cannot be filled in during object construction. If the class of which an object is being constructed has renaming parameters, the parameters assume their default values. Because the object can only be used using the type of its class, or one of its superclasses, the renaming could never be visible anyway.

3.3.4 First-Class Component Relations

In this section, we introduce first-class component relations to solve a number of problems. We use them to connect components without resolving every individual dependency, to access functionality that is not exposed in the interface of the reusing class, and to use components as if they were separate objects.

A component relation can have a name, which typically represents the role of the component in the reusing class. Figure 3.9 illustrates this for classes BankAccount and Person. The BidiAssociation-1-Side component of BankAccount is named owner, and the BidiAssociation-N-Side component of Person is named accounts.


Direct and Indirect Inheritance

As presented in Section 3.2, selective reuse of the interface of a component isrequired for two reasons.

First, it prevents interface bloat in the reusing class. Take for example theassociation components. To maximize code reuse, it is best to put many featuresin the association classes. Examples include applying some action to all referencedelements, a universal and an existential quantifier, accumulation, and validation.But for an inheriting class, this means that either its interface gets bloated, or itsdeveloper must do a lot of work to hide the functionality, preventing reuse.

Second, because not all method and field names use renaming parameters, thereare still many name conflicts. For example, features like equals and hashCode inthe top-level class cause conflicts in every component relation. But if these featuresare not interesting in the inheriting class, which is usually the case, the developershould not have to resolve their conflicts.

To solve both problems, we make a distinction between directly and indirectlyinherited features. A directly inherited feature is present in the interface of theinheriting class, while an indirectly inherited feature is not. As a result, a directlyinherited feature can cause.

An indirectly inherited feature, however, can still be accessed if the component relation has been given a name and is visible to the client. The feature can then be invoked as myObject.inheritanceName.feature using its original name. It is as if the component is an object referenced by a field in the reusing class. This way, the client resolves the conflict by using the name of the component relation. It is, of course, the responsibility of the programmer to give the reusing class a meaningful interface. Using inheritance names to access features must not be the standard way of using a class.
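To make the access pattern concrete, the following is a minimal sketch in plain Java that only approximates it with a public final field. The class Association and all member names are hypothetical stand-ins chosen for this illustration; they are not the association classes of this thesis, and plain Java has no component relations.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for a feature-rich association component.
class Association<T> {
    private final Set<T> elements = new LinkedHashSet<>();

    void add(T element) { elements.add(element); }
    void remove(T element) { elements.remove(element); }
    boolean contains(T element) { return elements.contains(element); }
    int size() { return elements.size(); }
    Set<T> get() { return Collections.unmodifiableSet(elements); }
}

class Person {
    final String name;

    // Plays the role of the named component relation "children":
    // every feature stays reachable as person.children.feature.
    final Association<Person> children = new Association<>();

    Person(String name) { this.name = name; }

    // The "directly inherited" features: the small part of the component
    // that is exposed in the interface of Person itself.
    void addChild(Person child) { children.add(child); }
    Set<Person> getChildren() { return children.get(); }
}
```

With this sketch, sandra.addChild(kato) and sandra.children.add(kato) have the same effect, while contains and size are only reachable through the name children. Unlike the mechanism proposed here, plain Java does not keep the two calls identical when addChild is overridden in Person, and the forwarding methods must be written by hand.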

Figures 3.10 and 3.11 illustrate this for class Person. For the children component, only the add, remove, and get methods are inherited directly. The parents component additionally inherits the replace and isValid methods. The other methods must be invoked indirectly via the name of the inheritance relation. Note that the invocations of children.add and addChild in Figure 3.11 are identical even if the method has been overridden in Person.

The inheriting class must specify which features are inherited directly. This is done in the configuration block either by including them with a direct declaration, or by renaming or overriding them. All other features are inherited indirectly.

To facilitate selecting directly inherited features, the features of a class can be put in groups as in Eiffel, Smalltalk, and C# (using the #region directive). This way, inheriting classes can directly inherit an entire group of methods with little effort. For example, the basic functionality of a class can be put into a single group while more advanced functionality can be put in other groups. To select which features or groups are inherited directly, the programmer can use direct and indirect declarations in the configuration block of the component relation. They


Figure 3.10: Indirect Inheritance.

Person sandra = . . . ;
Person bruno = . . . ;
Person kato = . . . ;

// two identical method calls
sandra.children.add(kato);
sandra.addChild(kato);

bruno.addChild(kato);
kato.parents.applyTo(. . . );

Figure 3.11: Using indirectly inherited features.

class BidiAssociation-N-Side<FROM,TO> . . . [X,XS=%X%s]
  boolean equals(Object other) . . .
  int hashCode() . . .

  group default
    Set<TO> get%XS% . . .
    void add%X%(TO x) . . .
    void remove%X%(TO x) . . .
    void replace%X%(TO x, TO y) . . .

  group iteration
    filter%XS%(. . . ) . . .
    applyTo%XS%(Command<TO>) . . .
    ...

class Person
  component BidiAssociation-N-Side<Person,Person> children (parents)
    [X=Child, XS=Children, indirect replaceChild]

  component BidiAssociation-N-Side<Person,Person> parents (children)
    [X=Parent, direct isValidParent]

  . . .

Figure 3.12: Selecting directly inherited features.


can be used with both feature groups and individual features. For an individual feature, the final name of that feature (after performing the renaming) must be used. A feature is inherited directly if it is listed in a direct declaration, and indirectly if it is listed in an indirect declaration. If a feature is not listed in such a clause, it is inherited directly if it is part of a group that is listed in a direct declaration, and indirectly if it is part of a group that is listed in an indirect declaration. Every component relation implicitly has a direct declaration for the group named default. To avoid ambiguities, group names are put in the same namespace as method and field names. The mechanism can be made more flexible, but that is not in the scope of this thesis.
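The selection rule above can be read as a small decision procedure. The sketch below is one possible reading in plain Java, not part of the proposed language: the class DirectnessResolver and its parameters are hypothetical, and we assume that a declaration for an individual feature takes precedence over a declaration for its group, and that a group listed in both kinds of declarations counts as direct.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that decides whether a feature (identified by its
// final name, after renaming) is inherited directly or indirectly.
class DirectnessResolver {
    private final Map<String, String> groupOfFeature; // final feature name -> group name
    private final Set<String> directFeatures;
    private final Set<String> indirectFeatures;
    private final Set<String> directGroups = new HashSet<>();
    private final Set<String> indirectGroups;

    DirectnessResolver(Map<String, String> groupOfFeature,
                       Set<String> directFeatures, Set<String> indirectFeatures,
                       Set<String> directGroups, Set<String> indirectGroups) {
        this.groupOfFeature = groupOfFeature;
        this.directFeatures = directFeatures;
        this.indirectFeatures = indirectFeatures;
        this.indirectGroups = indirectGroups;
        this.directGroups.addAll(directGroups);
        // Every component relation implicitly has a direct declaration
        // for the group named "default".
        this.directGroups.add("default");
    }

    boolean isDirect(String feature) {
        // 1. Declarations for the individual feature decide first (assumption).
        if (directFeatures.contains(feature)) return true;
        if (indirectFeatures.contains(feature)) return false;
        // 2. Otherwise the declaration for its group decides.
        String group = groupOfFeature.get(feature);
        if (group != null) {
            if (directGroups.contains(group)) return true;
            if (indirectGroups.contains(group)) return false;
        }
        // 3. All other features are inherited indirectly.
        return false;
    }
}
```

Applied to the children component of Figure 3.12, addChild is direct because it is in the default group, replaceChild is indirect because of its explicit indirect declaration, and applyToChildren is indirect because the iteration group is not selected. Renaming or overriding a feature, which also makes it direct, is left out of the sketch.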

Figure 3.12 illustrates the use of the selection mechanism. Both the children and parents components inherit the get, add, and remove methods, which are in the default group. The children component of Person excludes replaceChild from the direct interface since it is barely useful in that context, and would otherwise be inherited since it is in the default group. The parents component additionally includes the isValidParent method, which may contain conditions for the adoption of a child, in its interface.

The impact of indirect inheritance is $O(M_s + F_s) \to O(G_s + M_{s,ng} + M_{ns,g} + F_{s,ng} + F_{ns,g})$ with $G_s$ the number of selected groups, $M_{s,ng}$ and $F_{s,ng}$ the selected methods and fields not in such a group, and $M_{ns,g}$ and $F_{ns,g}$ the unwanted methods and fields in the selected groups.

Component References

Using indirect inheritance, the features of a component can be accessed as if the component were an object referenced by a read-only instance variable. To enable even more reuse, we allow the name of a component relation to be used as an actual reference to the subobject representing that component, similar to casts in C++ [Str91]. Because we already require conformance between the actual component and the inherited class, type-safety is not endangered.

Component references make it possible to reuse methods with a formal parameter that has a type that is used as a component. For example, the class representing bounded values has methods to compare it with another bounded value, and to transfer the remaining value to another bounded value. Another example is the equals method for an association, which takes a similar association as its argument, to verify if two associations reference the same elements. Without component references, these features cannot be reused if the class is used as a component for another class because there is no subtyping relation between the reusing class and the component.

Figure 3.13 shows how such methods can be reused using component references. The methods cannot take a Person or BankAccount as an argument, but by using the names of the component relations, the components can be passed to the method.


boolean eq = sandra.children.equals(bruno.children);
yourAccount.balance.transferRemainingValueTo(myAccount.balance);

Figure 3.13: Using component references.
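The first call of Figure 3.13 can be approximated in plain Java by exposing the component as a field and passing that field as the argument. This is only a sketch under that assumption: the class Association and the method name referencesSameElementsAs are hypothetical, and the sketch uses composition rather than the component relation proposed here.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical association class with a method whose formal parameter
// has the type of the component itself.
class Association<T> {
    private final Set<T> elements = new HashSet<>();

    void add(T element) { elements.add(element); }

    // Takes a similar association, not the object that owns it.
    boolean referencesSameElementsAs(Association<T> other) {
        return elements.equals(other.elements);
    }
}

class Person {
    // The field name plays the role of the component reference "children".
    final Association<Person> children = new Association<>();
}
```

A Person cannot be passed to referencesSameElementsAs because Person is not a subtype of Association, but sandra.children.referencesSameElementsAs(bruno.children) type-checks because both arguments are references to the components themselves.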

Being able to use component references has an influence on how this is treated in the context of a component relation. When a class A is reused through a component relation by class B, its this reference acts as if it were substituted by this.inheritanceName. Otherwise, this would have type A while referencing an object of type B, which is not type-safe since B is not a subtype of A.

Consequently, a component cannot use the this reference to obtain a reference to the object of the reusing class because that is a reference to the subobject for that component. For example, the components for bidirectional associations need an object of type FROM (a type parameter) to pass it to the other end of the association. Various techniques can be used to obtain that reference, e.g. storing it explicitly in a field, self types as used in Eiffel, or a variant of the self types in Scala. More details on techniques to obtain a reference to the object of the reusing class can be found in the technical report [vDS06].

If features of a component are made inaccessible by making them private using an export clause, it is important to prevent this from leaking out of the component, and to prevent the use of component references for that component to access hidden features. A technique similar to confined types and anonymous methods [VB01] can be used to prevent leaking of this from the component. To prevent access via a component reference, such a reference can either be forbidden completely if features are hidden, or it can be given an anonymous type that does not provide the hidden features. Providing a complete solution for confinement and access control was outside the scope of this thesis.

Dependency Resolution

Some components depend on methods of other components. Examples are the methods to set up and break down bidirectional associations, as shown in Figure 3.14. The setOwner method of BankAccount must know which register method to invoke on the Person to keep the association consistent. The registerOwner method needs an unregister method to correctly remove the old back-pointer, and getAccounts is needed in the specification of setOwner. Because Person has multiple associations, these dependencies cannot be resolved automatically. The developer of BankAccount must connect these methods to the appropriate methods in Person. With existing inheritance mechanisms, this must be done with wiring code for each individual method dependency.

Figure 3.14: Low-level dependencies.

Figure 3.15: High-level dependencies.
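For comparison, hand-written wiring for the association between BankAccount and Person might look as follows in plain Java. This is a minimal sketch: the method names registerAccount and unregisterAccount are assumptions made for this illustration, and specifications are omitted.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Person {
    private final Set<BankAccount> accounts = new HashSet<>();

    Set<BankAccount> getAccounts() { return Collections.unmodifiableSet(accounts); }

    // Wiring methods: intended to be called only by BankAccount.setOwner.
    void registerAccount(BankAccount account) { accounts.add(account); }
    void unregisterAccount(BankAccount account) { accounts.remove(account); }
}

class BankAccount {
    private Person owner;

    Person getOwner() { return owner; }

    // Each call below is one low-level dependency that the developer
    // must connect to the right method of Person by hand.
    void setOwner(Person newOwner) {
        if (owner != null) {
            owner.unregisterAccount(this); // remove the old back-pointer
        }
        owner = newOwner;
        if (newOwner != null) {
            newOwner.registerAccount(this); // install the new back-pointer
        }
    }
}
```

Every additional association of Person needs its own pair of register and unregister methods, and every such pair must be referenced explicitly from the other side.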

To resolve these dependencies more elegantly, we use the names of the component relations. Figure 3.15 illustrates the approach. The owner component of BankAccount and the accounts component of Person are connected by resolving a single high-level dependency on each side.

To specify high-level dependencies, a class can declare formal component parameters. They are declared after the type parameters of a class between parentheses, and have the form T → C cparam. In this declaration, T → C is a constraint on the component relation passed through the parameter. T is the type containing the relation, and C is the target type of the relation. Finally, cparam is the name of the parameter.

Figure 3.16 illustrates the declaration of component parameters. The formal parameter expects the name of a relation that a) is a relation of the class at the other side of the association (TO), and b) is a BidiAssociation representing an association in the opposite direction (from TO to FROM). Take for example the component relations of BankAccount and Person in Figure 3.17. If we substitute the type parameters, we see that component relation owner requires the name of a component relation with type BidiAssociation<Person,BankAccount> that is contained in Person. Since the accounts component of Person satisfies these constraints, we can connect the owner component to the accounts component. Similarly, the owner component satisfies the constraints of the accounts component. Consequently, the owner component of BankAccount can be connected to the accounts component of Person, and vice versa. Note that BidiAssociation1 does not require the components to be mutually connected. Such constraints are not in the scope of this thesis.

A component parameter can be used to invoke features of the actual component passed through the parameter on objects of the type containing the component. Method invocations and field accesses are performed using the following expressions: expr@cparam.m(args) and expr@cparam.f. If cparam has T → C as constraint, expr must be of type T, and m or f must be applicable to type C. In the context of a component relation where actual component parameter aparam is used, cparam is replaced with aparam. As a result, the method aparam.m(args) is invoked, or the field aparam.f is accessed, on the result of expr. Note that any renaming or overriding of these features in the run-time type of expr is taken into account. We use a symbol different from the dot to emphasize the difference with a regular invocation. In addition, this avoids confusion about the meaning of expr.cparam if a feature with name cparam is added to T.

class BidiAssociation-1-Side<FROM,TO>
    (TO → BidiAssociation<TO,FROM> otherEnd)
  subtype BidiAssociation<FROM,TO> (otherEnd)

  private TO other;
  public void setX(TO other)
    . . . other@otherEnd.register(expression for the object on this side of the association); . . .
  protected void register(TO other) . . .
  . . .

Figure 3.16: Component parameters.

class BankAccount
  component BidiAssociation-1-Side<BankAccount,Person> owner (accounts) . . .
  . . .

class Person
  component BidiAssociation-N-Side<Person,BankAccount> accounts (owner) . . .
  . . .

Figure 3.17: Using high-level dependency resolution.

The setX method in Figure 3.16 shows how the component parameter is used to invoke methods. The invocation of register is applied to the otherEnd component of other. The method that will be invoked is the register method of the actual component relation passed through the parameter, which may be overridden or renamed in the actual class TO. In the example of Figure 3.17, the setOwner method inherited by BankAccount will invoke the registerAccount method inherited by Person.
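The effect of a component parameter can be imitated in plain Java by passing a function that selects the component at the other end. The sketch below rests on that assumption; the classes OneSide and NSide, the stored self reference, and the use of java.util.function.Function are choices made for this illustration and are not the proposed language construct.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

// Hypothetical N-side component: remembers the objects at the other end.
class NSide<FROM, TO> {
    private final Set<TO> others = new HashSet<>();

    void register(TO other) { others.add(other); }
    void unregister(TO other) { others.remove(other); }
    boolean contains(TO other) { return others.contains(other); }
}

// Hypothetical 1-side component. The function otherEnd plays the role of
// the component parameter: it selects the matching component of a TO object.
class OneSide<FROM, TO> {
    private final FROM self; // reference to the reusing object, stored explicitly
    private final Function<TO, NSide<TO, FROM>> otherEnd;
    private TO other;

    OneSide(FROM self, Function<TO, NSide<TO, FROM>> otherEnd) {
        this.self = self;
        this.otherEnd = otherEnd;
    }

    TO get() { return other; }

    void set(TO newOther) {
        if (other != null) {
            otherEnd.apply(other).unregister(self);   // imitates other@otherEnd.unregister(...)
        }
        other = newOther;
        if (newOther != null) {
            otherEnd.apply(newOther).register(self);  // imitates other@otherEnd.register(...)
        }
    }
}

class Person {
    final NSide<Person, BankAccount> accounts = new NSide<>();
}

class BankAccount {
    // A single high-level dependency: owner is connected to Person.accounts.
    final OneSide<BankAccount, Person> owner =
            new OneSide<>(this, person -> person.accounts);
}
```

The reusing class BankAccount names the other end once; which register and unregister methods are called is decided inside the component. Unlike the proposed mechanism, this sketch does not account for renaming or overriding of the features in the reusing class.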

Currently, component parameters are not part of the type of a class, unlike type parameters. Consequently, A(someComponent) and A(otherComponent) have the same type: A. Incorporating component parameters in the type of a class together with wildcards remains future work.

This approach has a number of advantages. First, it saves a lot of work by replacing the individual dependencies with a smaller number of high-level dependencies. The impact is $O(D_M) \to O(D_C)$ with $D_C$ the number of component dependencies and $D_C \le D_M$. Second, it ensures that the required methods are provided by a single component and not by methods of different components, which is crucial in this example. Third, if additional dependencies are added between two types of components, the reusing classes need no modifications. For example, we can add an isSibling method to BidiAssociation to check if some object is its sibling. This method would invoke the contains method on the other end of the association, introducing another dependency. Inheriting classes, however, do not need to be modified.

Visibility

By default, component relations are public because they are typically used for the characteristics of a class. A public client can see their name, type, and configuration. The reason for this default choice is understandability. If a programmer knows the behavior of class C, he also knows the behavior of a component of type C. But if the relation is not visible, he must study the contracts of the inherited features again in order to understand their behavior. If the component relation is used for traditional code inheritance, e.g. to implement a Stack using an Array, it should be hidden from the client. More details about the visibility of component relations, and how to change it, can be found in the technical report [vDS06].

3.4 The Subtyping Relation

The introduction of the component relation also has an impact on the subtyping relation, which is based on the subtyping relation of Eiffel. First, the subtyping relation can be simplified and tailored for classification because it must no longer be used for pure code reuse. Second, by making the component relation first-class, the need arises to rename, override, and merge component relations in a subtyping hierarchy.

3.4.1 Syntax

Figure 3.18 shows the syntax of the subtyping relation. It consists of the keyword subtype followed by the name of the super type, including any type parameters. There can optionally be a name for the relation, and a configuration block. The component parameters are used to transfer component parameters to the superclass, similar to type parameters. For example, in the hierarchy of association classes, the parameter for the component at the other end of the association must end up in BidiAssociation, as shown in the example subtyping relations in Figure 3.19. To ensure consistency, component parameters passed to the same class via different subtyping relations must be identical. This is similar to the rule for type parameters in Java and Eiffel.


SubtypeClause:
  subtype Type Identifier? CompParams? ConfigBlock?

Figure 3.18: Grammar for the subtyping relation.

class Association<TO>
  . . .

class BidiAssociation<FROM,TO>
    (TO -> BidiAssociation<TO,FROM> other)
  subtype Association<TO>
  . . .

class BidiAssociation-N-Side<FROM,TO>
    (TO -> BidiAssociation<TO,FROM> other)
  subtype BidiAssociation<FROM,TO> (other)
  . . .

Figure 3.19: Subtyping relations.

3.4.2 General Semantics

The subtyping relation has the same meaning as in Eiffel. The inheriting class inherits both the implementation and the type of the inherited class.

Just as in SmartEiffel and Cecil, duplication of features that are inherited via a subtyping relation is forbidden because duplication of features is confusing for classification. As a result, we avoid the diamond problem for subtyping. For example, it makes perfect sense that a Zebra is a special type of Horse, but being 1.7 times a Horse does not make sense. An object is either a horse, or it is not a horse. In more technical terms, if duplication is allowed, only one of the duplicated features can be used on an object of type Horse. The other feature is actually a new feature that is introduced in Zebra. This should be done via code inheritance, not subtyping. As a consequence, features with the same origin that are inherited via subtyping relations must be given the same name.

For the subtyping relation, we use a rule-of-dominance like in C++ [Str91]. If one definition of a feature inherited via subtyping overrides all others, that definition is inherited and there is no conflict. Since the subtyping relation is nominal, and conformance is enforced, behavioral subtyping is preserved. For overriding methods, the standard conformance rules apply. For overriding instance variables, which are properties, the type must be preserved.
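A comparable rule can be observed in plain Java for default methods, which gives a quick way to experiment with dominance. The sketch below is an analogy only; the interfaces Animal and Horse and the method sound are hypothetical names chosen for this illustration.

```java
interface Animal {
    default String sound() { return "generic sound"; }
}

interface Horse extends Animal {
    // This definition overrides the one in Animal.
    @Override
    default String sound() { return "neigh"; }
}

// Zebra reaches sound() along two paths (directly via Animal and via Horse).
// Because the definition in Horse overrides the one in Animal, it dominates
// and Zebra needs no conflict resolution.
class Zebra implements Animal, Horse {
}
```

Calling sound() on a Zebra returns "neigh". If Animal and Horse were unrelated interfaces that both defined sound(), neither definition would dominate and the Java compiler would require Zebra to override the method.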

As with the component relation, inherited features can be renamed. A first use of renaming for classification is providing better names for features inherited from general classes. For example, the getParent method in the top-level class of our metamodel can be renamed to getMethod for the class Implementation. In addition, renaming is required to solve conflicts and merge similar features when inheriting from independently developed classes. Without renaming, reusability of such classes is severely decreased. In addition to methods and fields, component relations can also be renamed. Indirectly inherited features are then accessed using the new inheritance name.

Figure 3.20: Overriding components.

Features can be merged by giving them the same name. This is needed to inherit from independently developed classes that share a concept. To choose a definition, either a new one can be provided, or an existing definition can be selected by undefining the others. The final feature must of course conform to all inherited features. Merging of component relations is discussed in Section 3.4.3. Contrary to Eiffel, we do not forbid merging instance variables. Method groups [SG95] or data groups [Lei98] can be used to prevent separating dependent instance variables [Sak89] by e.g. duplicating only one of them. This topic, however, is not in the scope of this thesis.

For constructors, either the approach of C++ or Eiffel/Smalltalk can be chosen. In C++, the most specific class must invoke both the constructors of the direct super classes, and the constructors for every virtually inherited class. In Eiffel and Smalltalk, a constructor is a regular method without special requirements.

3.4.3 Overriding and Merging Components

For the same reasons why overriding and merging of state is required to ensure that a component can always be reused, as discussed in Section 3.2, it must also be possible to override and merge component relations. Overriding is illustrated in Figure 3.20. The BidiAssociation-1-Side’ component in B overrides the Association’ component in A. Only the subtyping relations represented by the solid arrows and the component relations represented by the dotted arrows are in the code, the dashed subtyping relations are implicit. For merging, either an existing component must be selected, or an overriding component must be defined.

Figure 3.21: Overriding components.

In both cases, the overriding or selected component must satisfy two rules. First, standard subtyping conformance is required. The overriding component (BidiAssociation’) must not only be a subtype of the target class of the component relation (BidiAssociation), but also of all overridden components (Association’). Because of this subtyping hierarchy of the overriding component, we again use the rule-of-dominance to minimize the number of methods that must be overridden in the overriding component (BidiAssociation’). Component parameters passed to the overriding component type must reference components that override the components passed to the overridden component type by the overridden component relations. Second, conformance of the component interface is required. This means that every feature that is inherited directly in an overridden component relation must be inherited directly in the overriding component relation. The features of an overridden component are automatically renamed to the corresponding new names defined by the overriding component. Renaming of features within the type hierarchy of the overriding component is of course taken into account. This is illustrated in Figure 3.21. If z is renamed in Component’ to x, in Subcomponent to y, and in Subcomponent’ to w, then x will automatically be renamed to z in the subtyping relation between A and B.


3.4.4 Reducing Hierarchy Dependencies

In [SDNB03], it is argued that super calls in class-based languages with multiple inheritance increase the dependency of code on the class hierarchy. In such a language, multiple methods with the same name can be inherited by a class, so in order to disambiguate super calls to such methods, they must be qualified with the name of the direct super class containing the method that must be invoked. Examples of languages using this approach are C++, Cecil, Trellis/OWL, Eiffel, and SmartEiffel. This problem does not occur with inheritance mechanisms that linearize the class hierarchy, or in the prototype-based language Self [CUCH91], where super calls can be directed to a named parent slot.
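Java is not among the languages listed above, but its default methods show the same kind of hierarchy dependency and make it easy to try out. The following is a minimal sketch with hypothetical interfaces Swimmer and Runner; it is an illustration of qualified super calls, not an example from [SDNB03].

```java
interface Swimmer {
    default String move() { return "swim"; }
}

interface Runner {
    default String move() { return "run"; }
}

class Triathlete implements Swimmer, Runner {
    // Two methods named move() are inherited, so the class must override it.
    @Override
    public String move() {
        // Each super call is qualified with the name of a direct super type.
        // Replacing Swimmer by another super type requires editing this body.
        return Swimmer.super.move() + "+" + Runner.super.move();
    }
}
```

The body of Triathlete.move mentions Swimmer and Runner by name, so the code depends on the exact shape of the hierarchy.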

These dependencies can be removed by also giving a name to a subtyping relation. It is possible to qualify a super call using the name of that inheritance relation instead of the name of the super class. Consequently, the call remains valid if the actual super class for that relation is changed, as long as an appropriate method is available in the new super class. This is similar to directed resends in Self. The name of a subtyping relation is private since only the inheriting class can invoke super calls. For our inheritance mechanism, a super call is either unqualified, or qualified by the name of an inheritance relation.

Technically, reuse variables in the class-based language Timor also reduce this dependency, but in their paper [KHM04], the authors do not present this insight.

3.5 Evaluation

In this section, we evaluate the complexity and the effectiveness of the proposed inheritance mechanism.

3.5.1 Methodological Discussion

The usability of a language construct cannot be determined by just looking at its own complexity. It must be judged by its influence on the entire programming process. A programmer must not only know the programming language itself, but also a part of its standard library, design patterns, methodological rules, and workarounds for the limitations of the language.

In the case of the component relation, we do not introduce new complexity, but offer a language construct to deal with existing complexity. Programmers already deal with high-level concepts and the accompanying name patterns and dependencies, but they must do so at a low level of abstraction. By allowing them to use these concepts directly, we spare them the effort of programming the concepts by hand. In addition, the meaning of these concepts no longer gets lost in low-level code. This makes it easier for programmers to understand a class if they already know some of the used components.


The separation of the language constructs for the metaphors of classification and composition makes them easier to use and understand. For the subtyping relation, duplication of methods and variables can be forbidden and the rule-of-dominance can be used to solve conflicts if a subtyping relation holds between the conflicting members. Using the component relation for composition allows it to offer features that dramatically reduce the amount of work required to create a composition, as will be shown in Sections 3.5.2 and 3.5.3. Because the semantics of the component relation are similar to that of building a physical object by using components as building blocks, we believe that the relation is very natural to use.

Both classification and composition are suitable metaphors for graphical programming/modeling. For classification, this property is already exploited by class diagram editors which offer a concise and clear overview of the code. For composition, this property can now be exploited in the same way as is done in modern GUI builders. Components can be dropped in classes and connected using a graphical editor. Such an editor allows the example of Figure 3.1 to be programmed almost completely by just drawing that figure and filling in renaming parameters.

3.5.2 Example

Figure 3.22 shows the entire implementation of Figure 3.1. The names of the association classes are abbreviated for reasons of space. The implementation is done almost completely by configuring existing components. Only the constructors are actually implemented. This is an important result, because it means that this implementation can be done by drawing a class diagram, and filling in the parameters. Although the example does not contain any application specific behavior, it illustrates what can be achieved with our approach. Section 3.5.3 presents a realistic case study.

In addition, the high-level concepts of the diagram cannot get lost because they are directly present in the code. In current CASE tools, such concepts are lost because they are translated into low-level code, leading to synchronization problems. The constructors of component relations are invoked using a super call qualified by the inheritance name for reasons of clarity.


class BankAccount
  component BoundedValue<long> balance
    [Value=Balance, Lower=Credit, increaseBalance=deposit,
     decreaseBalance=withdraw,
     export private setUpperLimit,setLowerLimit,setBalance]
  component Bidi-1-Side<BankAccount,Person> owner (accounts) [X=Owner]
  component Uni<int> accountNumber
    [X=AccountNumber, export private setAccountNumber]

  public BankAccount(int accountID)
    balance.super(0,-1000,1000000);
    accountNumber.super(accountID);

class CheckingAccount
  subtype BankAccount
  component Bidi-1-Side<CheckingAccount, BankCard> bankCard (account)
    [X=BankCard]

  public CheckingAccount(int number)
    super(number);

class Person
  component Bidi-N-Side<Person,BankAccount> accounts (owner) [X=Accounts]
  component Bidi-N-Side<Person,Person> parents (children) [X=Parents]
  component Bidi-N-Side<Person,Person> children (parents) [X=Children]
  component Uni<String> [X=Name]
  component Graph<Person> family (parents,children)

  public Person(String name, Person mother, Person father)
    setName(name);
    addParent(mother);
    addParent(father);

class BankCard
  component Bidi-1-Side<BankCard,CheckingAccount> account (bankCard)
    [X=Account]
  component Uni<int> [X=PinCode]

Figure 3.22: Implementation of the banking application of Figure 3.1.


3.5.3 Case Study

We compared our inheritance mechanism with manual delegation, and the inheritance mechanisms of Java, Eiffel, and Reppy traits, which support repeated inheritance [RT06], by comparing their impact on the size of an application. We used Jnome [vD06], our metamodel for Java, and Chameleon [vD06], our framework for metamodels of programming languages. Together they contain 9763 lines of Java code.

We modified the Java programs using our inheritance mechanism (see footnote 4), and then calculated the size for the other techniques based on the overhead of renaming, dependency resolution, encapsulation of state, and manual delegation for each technique. Note that only the inheritance relations and wiring code differ for the participating mechanisms. All other code is identical, so all effects are due to differences in the reuse mechanisms.

To study the impact of the size and the nature of extensions of the components, we repeated the experiment for two kinds of extensions. In the first extension, all associations send events when they are modified. This extension is application independent because managing the listeners and invoking notify is always the same. In the second extension, which builds on the first one, the associations also check the validity of the elements. For this extension, the validity condition is application specific and must be overridden, while other supporting code can be reused.
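The two extensions can be pictured with a small plain-Java sketch. It is an illustration only and not the code used in the case study: the class ObservableAssociation, the use of java.util.function.Consumer for listeners, and the default validity condition are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical association offering both extensions.
class ObservableAssociation<T> {
    private final Set<T> elements = new HashSet<>();
    private final List<Consumer<T>> listeners = new ArrayList<>();

    // First extension: application independent, always the same code.
    void addListener(Consumer<T> listener) { listeners.add(listener); }

    // Second extension: the condition is application specific and is
    // meant to be overridden; by default only null is rejected (assumption).
    protected boolean isValid(T element) { return element != null; }

    // Adds the element if it is valid and not yet present, then notifies.
    boolean add(T element) {
        if (!isValid(element) || !elements.add(element)) {
            return false;
        }
        for (Consumer<T> listener : listeners) {
            listener.accept(element);
        }
        return true;
    }

    int size() { return elements.size(); }
}
```

An application only overrides isValid; the listener management and the notification loop are reused unchanged, which is the distinction between the two kinds of extensions described above.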

We do not take specifications into account in the study. We assume that the average size of the specifications of the removed methods and variables equals the average for the entire program. Consequently, the relative gain will not differ when taking the specifications into account.

Discussion

Figure 3.23 shows the code size for the different techniques and code bases. Figure 3.24 shows the reduction in size compared to Java. Almost all of the reduction is obtained in the domain model, which takes up 70% of the software. The other 30% consists of input and output algorithms.

First of all, we must note that the reduction in code size is not the same as the reduction in complexity. Renaming clauses and manual delegation are much simpler than the reused methods.

Both figures clearly show that our inheritance mechanism results in a much bigger reduction than the other mechanisms. The difference is caused by the additional overhead mentioned above. Manual delegation and code inheritance in Eiffel reduce the size much less than our mechanism, but are still a big improvement over the Java version. Using Reppy traits, however, the code size even increases. The additional getter and setter methods (traits cannot contain state) cause so much additional overhead that the application becomes even bigger than the original Java application.

Footnote 4: We must note that the resulting code does not currently compile because our compiler is not yet complete.

[Bar chart for the Original, Events, and Validation versions. Legend: T: Reppy Traits, J: Java, E: Eiffel, D: Delegation, C: Components. Data labels: 7.7 k, 8.7 k, 9.5 k, 9.8 k, 10.7 k; 8.0 k, 10.5 k, 11.2 k, 12.4 k, 13.2 k.]

Figure 3.23: Lines of Code.

[Chart for the Original, Events, and Validation versions, with series T, E, D, and C. Data labels: -9.3 %, 0 %, 2.9 %, 10.8 %, 21.4 %; -6.3 %, 0 %, 9.9 %, 15.2 %, 35.6 %.]

Figure 3.24: Reduction Compared to Java.
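As a reading aid, the percentages in Figure 3.24 appear to follow from the line counts in Figure 3.23 as a relative difference with respect to the Java version. The formula below is an assumption on our part, since the text does not state it, and the symbols $LOC_{Java}$ and $LOC_X$ are introduced here for illustration only.

```latex
\[
  \mathit{reduction}_X \;=\; \frac{LOC_{Java} - LOC_X}{LOC_{Java}},
  \qquad
  \frac{9.8\,\mathrm{k} - 7.7\,\mathrm{k}}{9.8\,\mathrm{k}}
  \;=\; \frac{2.1}{9.8} \;\approx\; 21.4\,\%
\]
```

With the rounded data labels 9.8 k and 7.7 k, this reproduces the 21.4 % label. Other pairs of labels agree only up to rounding, because the labels in Figure 3.23 are given to one decimal place.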

An important result is the impact of adding functionality that is not overridden in the application. Adding support for sending events requires no modification of the version using our inheritance mechanism. The renaming parameters, component parameters, and indirect inheritance avoid the need for additional code if all methods and variables added to the default group contain existing renaming parameters. With all other techniques, code must be added to the applications for renaming clauses, manual delegations, dependency methods, or state encapsulation. The more functionality offered by the component, the more modifications are required by other techniques. This is a very important practical result. A first consequence is that the developer of a component can now add functionality without breaking any client code. A second consequence is that he can now provide lots of functionality without putting a huge burden on his clients.

Another important result shows up if validation is added to the associations, and specific validation rules are implemented in the applications. The version using our inheritance mechanism is the only one in which less code must be added than in the Java version, as shown by the gradients in the right part of Figure 3.24. This means that it is still beneficial to reuse small components, or small parts of bigger components, using our inheritance mechanism. Using the other techniques, the additional overhead makes reuse unattractive in these scenarios.


P  ::= L e
L  ::= class C (α) ST SC F K M
α  ::= T → C
δ  ::= α | i
ST ::= subtype C (δ) [n = o]
SC ::= component C (δ) i [n = o]
M  ::= C m(C x) return e;
e  ::= x | i | e.f | e.i | e@α | e.m(e) | new C(e) | (C)e

Figure 3.25: Syntax.

3.6 Formal Semantics

In this section, we present a part of our type system. More details can be found in Appendix B. The model is based on both ClassicJava [FKF98] and Featherweight Java [IPW01]. Because our inheritance mechanism supports renaming, the static type of the target is required to determine the invoked method or accessed field. We use the type elaboration of ClassicJava to incorporate that information in the program. The rest of the model is based on Featherweight Java because of its simplicity.

To model the essence of our inheritance mechanism, we added multiple inheritance, separation of subtyping and code inheritance, named inheritance relations, component parameters, indirect inheritance, and simple renaming to the Featherweight Java model. Other elements have been omitted to keep the model simple. In case of a conflict, the conflicting elements must be overridden by a new definition. Because we do not model component classes, non-conformance and feature hiding are not allowed. We assume that all component relations have been given a name, and that classes with component parameters are abstract.

The syntax of the language is shown in Figure 3.25. The differences with Featherweight Java are the component parameters α, the two inheritance relations, the expressions e.i for component references, and the expressions e@α for invocations on component parameters. The subtyping relation cannot have a name. Variable δ ranges over both component parameters and inheritance names.

3.6.1 Method Lookup

In this chapter, we only present the lookup of methods, the expression typing rules, and the reduction and congruence rules. Lookup of fields is similar to that of methods, and lookup of component relations is trivial.

A method is represented as P M, where M is the definition of the method, and P is its enclosing class. This is needed to determine the origin of a method. Indirect inheritance is modeled by giving indirectly inherited methods the name


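\[
\frac{E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M}}
     {\mathit{methods}(C) = \mathit{methods}(E)} \quad (1)
\]

\[
\frac{E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M}}
     {\mathit{methods}(E) = C\ F \,\cup\, \mathit{inh}_{st}(E) \,\cup\, \mathit{inh}_{co}(E)} \quad (2)
\]

\[
\begin{array}{l}
\mathit{inh}_{st}(E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M}) = \\
\quad \{\, U\ N \mid \neg N \text{ overridden in } E \;\wedge\; U\ N \in \mathit{methods}(ST, C) \;\wedge \\
\qquad \nexists\, V\ O \in \mathit{methods}(ST, C) : V\ O \neq U\ N \;\wedge\; V\ O \text{ overrides } U\ N \,\}
\end{array} \quad (3)
\]

\[
\begin{array}{l}
\mathit{inh}_{co}(E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M}) = \\
\quad \{\, U\ N \mid \neg N \text{ overridden in } E \;\wedge\; U\ N \in \mathit{methods}(SC, C) \,\}
\end{array} \quad (4)
\]

\[
\frac{\text{class } T(\alpha) \ldots \qquad \mathit{methods}(T) = P\ M}
     {\mathit{methods}(\text{subtype } T(\delta)\ [\overline{n = o}],\ C) = \tau_{st}(\delta,\ \overline{n = o},\ \alpha,\ C,\ P\ M)} \quad (5)
\]

\[
\frac{\text{class } T(\alpha) \ldots \qquad \mathit{methods}(T) = P\ M}
     {\mathit{methods}(\text{component } T(\delta)\ i\ [\overline{n = o}],\ C) = \tau_{co}(T,\ \delta,\ i,\ \overline{n = o},\ \alpha,\ C,\ P\ M)} \quad (6)
\]

\[
\frac{m \in \overline{n}}
     {\tau_{st}(\delta,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = C\ B\ [\overline{o}/\overline{n}]m(\overline{B}\ \overline{x})\ \text{return } \tau(\delta, \alpha, e);} \quad (7)
\]

\[
\frac{m \notin \overline{n} \qquad . \notin m}
     {\tau_{st}(\delta,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = P\ B\ m(\overline{B}\ \overline{x})\ \text{return } \tau(\delta, \alpha, e);} \quad (8)
\]

\[
\frac{m \notin \overline{n} \qquad m = head.tail \qquad . \notin head \qquad head \notin \overline{n}}
     {\tau_{st}(\delta,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = P\ B\ m(\overline{B}\ \overline{x})\ \text{return } \tau(\delta, \alpha, e);} \quad (9)
\]

\[
\frac{m \notin \overline{n} \qquad m = head.tail \qquad . \notin head \qquad head \in \overline{n}}
     {\tau_{st}(\delta,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = C\ B\ ([\overline{o}/\overline{n}]head).tail(\overline{B}\ \overline{x})\ \text{return } \tau(\delta, \alpha, e);} \quad (10)
\]

\[
\frac{m \in \overline{n}}
     {\tau_{co}(T,\ \delta,\ i,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = C\ B\ [\overline{o}/\overline{n}]m(\overline{B}\ \overline{x})\ \text{return } [\text{this}{:}C.i/\text{this}{:}T]\,\tau(\delta, \alpha, e);} \quad (11)
\]

\[
\frac{m \notin \overline{n}}
     {\tau_{co}(T,\ \delta,\ i,\ \overline{n = o},\ \alpha,\ C,\ P\ B\ m(\overline{B}\ \overline{x})\ \text{return } e;) = C\ B\ i.m(\overline{B}\ \overline{x})\ \text{return } [\text{this}{:}C.i/\text{this}{:}T]\,\tau(\delta, \alpha, e);} \quad (12)
\]

\[
\frac{\delta = T \rightarrow C}{\tau(\delta, \alpha, e) = [@\delta/@\alpha]e} \quad (13)
\qquad\qquad
\frac{\delta = i}{\tau(\delta, \alpha, e) = [.\delta/@\alpha]e} \quad (14)
\]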

Figure 3.26: Methods of a class.


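\[
\frac{\mathit{methods}(T) = \overline{V\ N} \qquad name = n_i \qquad U\ M \in \mathit{methods}(C) \qquad U\ M \text{ overrides } V_i\ N_i \;\vee\; U\ M \text{ same as } V_i\ N_i}
     {\mathit{method}(name, T, C) = M} \quad (15)
\]

\[
\frac{\begin{array}{c}
name = head.tail \qquad . \notin head \qquad \mathit{component}(head, T) = X\ \text{component } Y(\delta)\ i\ [\overline{n = o}] \\
N = B\ i.tail(\overline{B}\ \overline{x}) \ldots \qquad U\ B\ tail(\overline{B}\ \overline{x}) \ldots \in \mathit{methods}(Y) \\
(U\ M \text{ overrides } X\ N \;\vee\; U\ M \text{ same as } X\ N) \qquad U\ M \in \mathit{methods}(C)
\end{array}}
     {\mathit{method}(name, T, C) = M} \quad (16)
\]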

Figure 3.27: Method lookup.

inheritanceName.m.

Figure 3.26 shows the definition of the methods function. Rules 1 and 2 are trivial. Rules 3 and 4 determine which methods are inherited by the subtyping and component relations respectively. The difference between both functions is that inhst incorporates the rule of dominance: it ignores definitions that are overridden by a definition inherited via another subtyping relation. In addition, if the same definition is inherited more than once via subtyping with different names, the type rules demand that all versions are given the same name. This makes them syntactically equal, after which the set definition merges them. This is not the case for the component relation, where duplication is the default policy. Rules 5 and 6 determine which methods can be inherited via a specific subtyping or component relation. They take the methods of the inherited class, and apply the renaming and the substitution of component parameters by using τst and τco. These functions are defined in rules 7-10 and 11-12 respectively. The τst function is divided into four parts. In rules 8 and 9, no renaming is done. Rule 7 deals with the renaming of an individual method, while rule 10 deals with renaming as a consequence of renaming a component relation. Note that rules 7 and 10 change the parent class of the method to the inheriting class. Rules 11 and 12 do the same for the component relation, but they also change the this reference into this.inheritanceName. Finally, the τ function in rules 13 and 14 substitutes the component parameters. If the replacement is a component parameter, the @ symbol must be kept. If the value is the name of an actual component relation, the @ symbol is replaced by a dot.
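The name handling of rules 7-12 and the parameter substitution of rules 13-14 can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: methods are reduced to their names, expressions are plain strings, and the function names, the `renames` dictionary (old name to new name), and the example names are hypothetical and not part of the formal model.

```python
def rename_subtype(m, renames):
    """Name part of tau_st (rules 7-10); `renames` maps old names n to new names o."""
    if m in renames:                       # rule 7: the method itself is renamed
        return renames[m]
    if "." not in m:                       # rule 8: plain name, no renaming
        return m
    head, tail = m.split(".", 1)           # head contains no dot
    if head in renames:                    # rule 10: the component relation is renamed
        return renames[head] + "." + tail
    return m                               # rule 9: indirect name, no renaming


def rename_component(m, relation, renames):
    """Name part of tau_co (rules 11-12) for the component relation `relation`."""
    if m in renames:                       # rule 11: renamed, hence inherited directly
        return renames[m]
    return relation + "." + m              # rule 12: indirectly inherited as relation.m


def substitute_parameter(expr, alpha, delta, delta_is_parameter):
    """tau (rules 13-14) on a string expression.

    Assumes that no component parameter name is a prefix of another one.
    """
    if delta_is_parameter:                 # rule 13: the @ symbol is kept
        return expr.replace("@" + alpha, "@" + delta)
    return expr.replace("@" + alpha, "." + delta)   # rule 14: @ becomes a dot
```

For instance, under these assumptions `rename_component("add", "courses", {})` yields the indirectly inherited name `courses.add`, while `rename_subtype("list.add", {"list": "items"})` yields `items.add`.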

Figure 3.27 shows the method function, which is used to find a method, given its name and the static (T) and actual (C) types of the target. Rule 15 covers the case where the name of the requested method is in methods(T). Rule 16 covers methods


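\[
\frac{\begin{array}{c}
ST_j = \text{subtype } T(\delta)\ [\overline{o = p}] \qquad E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M} \\
\text{class } T(\beta) \ldots \qquad N = A\ n(\overline{A}\ \overline{x})\ \text{return } e; \qquad D\ N \in \mathit{methods}(T) \\
\tau_{st}(\delta,\ \overline{o = p},\ \beta,\ C,\ D\ N) = C\ A\ m_l \ldots
\end{array}}
     {C\ M_l \text{ overrides } D\ N} \quad (17)
\]

\[
\frac{D\ N \text{ overrides } E\ O \qquad C\ M \text{ overrides } D\ N \;\vee\; C\ M \text{ same as } D\ N}
     {C\ M \text{ overrides } E\ O} \quad (18)
\]

\[
\frac{\begin{array}{c}
SC_j = \text{component } T(\delta)\ i\ [\overline{o = p}] \qquad E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M} \\
\text{class } T(\beta) \ldots \qquad N = A\ n(\overline{A}\ \overline{x})\ \text{return } e; \qquad D\ N \in \mathit{methods}(T) \\
\tau_{co}(T,\ \delta,\ i,\ \overline{o = p},\ \beta,\ C,\ D\ N) = C\ A\ m_l \ldots
\end{array}}
     {C\ M_l \text{ overrides } D\ A\ i.n(\overline{A}\ \overline{x})\ \text{return } e;} \quad (19)
\]

\[
\frac{C\ M \text{ overrides } D\ N \qquad D\ N \text{ same as } E\ O}
     {C\ M \text{ overrides } E\ O} \quad (20)
\]

\[
\frac{M = B\ m(\overline{B}\ \overline{x})\ \text{return } e; \qquad \forall\, D\ N = A\ n(\overline{A}\ \overline{y}) \ldots : C\ M \text{ overrides } D\ N \Rightarrow (B <: A \;\wedge\; \overline{B} = \overline{A})}
     {C\ M \text{ OVERRIDE OK}} \quad (21)
\]

\[
\frac{E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M} \qquad m \in \overline{M}}
     {B\ m(\overline{B}\ \overline{x})\ \text{return } e; \text{ overridden in } E} \quad (22)
\]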

Figure 3.28: Method overriding.


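\[
\frac{}{C\ M \text{ same as } C\ M} \quad (23)
\qquad\qquad
\frac{C\ F \text{ same as } D\ G \qquad D\ G \text{ same as } E\ H}
     {C\ F \text{ same as } E\ H} \quad (24)
\]

\[
\frac{\begin{array}{c}
ST_j = \text{subtype } T(\delta)\ [\overline{o = p}] \qquad E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M} \\
\text{class } T(\beta) \ldots \qquad N = A\ n(\overline{A}\ \overline{x})\ \text{return } e; \qquad D\ N \in \mathit{methods}(T) \\
\tau_{st}(\delta,\ [\overline{o = p}],\ \beta,\ C,\ D\ N) = C\ A\ h(\overline{A}\ \overline{x})\ \text{return } e; \qquad \neg h \text{ overridden in } E
\end{array}}
     {C\ A\ h(\overline{A}\ \overline{x})\ \text{return } e; \text{ same as } D\ N} \quad (25)
\]

\[
\frac{\begin{array}{c}
SC_j = \text{component } T(\delta)\ i\ [\overline{o = p}] \qquad E = \text{class } C(\alpha)\ \overline{ST}\ \overline{SC}\ \overline{F}\ K\ \overline{M} \\
\text{class } T(\beta) \ldots \qquad N = A\ n(\overline{A}\ \overline{x})\ \text{return } e; \qquad D\ N \in \mathit{methods}(T) \\
\tau_{co}(T,\ \delta,\ i,\ [\overline{o = p}],\ \beta,\ C,\ D\ N) = C\ A\ h(\overline{A}\ \overline{x})\ \text{return } g; \qquad \neg h \text{ overridden in } E
\end{array}}
     {C\ A\ h(\overline{A}\ \overline{x})\ \text{return } g; \text{ same as } D\ A\ i.n(\overline{A}\ \overline{x})\ \text{return } e;} \quad (26)
\]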

Figure 3.29: Method equivalence.


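\[
\Delta; \Gamma \vdash x : \Gamma(x) \quad \text{(T-Var)}
\]

\[
\frac{\Delta; \Gamma \vdash e_0 : C_0 \qquad \mathit{field}(f, T, C_0) = D\ g \qquad . \notin f}
     {\Delta; \Gamma \vdash \underline{e_0 : T}.f : D} \quad \text{(T-Field)}
\]

\[
\frac{\Delta(\alpha) = S \rightarrow C \qquad T <: S}
     {\Delta; \Gamma \vdash \underline{e_0 : T}@\alpha : C} \quad \text{(T-Param)}
\]

\[
\frac{\mathit{component}(i, T) = \text{component } D(\delta)\ j\ [\overline{n = p}]}
     {\Delta; \Gamma \vdash \underline{e_0 : T}.i : D} \quad \text{(T-Comp)}
\]

\[
\frac{\Delta; \Gamma \vdash e_0 : C_0 \qquad \mathit{mtype}(m, T, C_0) = \overline{D} \rightarrow C \qquad \Delta; \Gamma \vdash \overline{e} : \overline{C} \qquad \overline{C} <: \overline{D} \qquad . \notin m}
     {\Delta; \Gamma \vdash \underline{e_0 : T}.m(\overline{e}) : C} \quad \text{(T-Invk)}
\]

\[
\frac{\neg C \text{ abstract} \qquad \mathit{fields}(C) = \overline{T}\ \overline{D}\ \overline{f} \qquad \Delta; \Gamma \vdash \overline{e} : \overline{C} \qquad \overline{C} <: \overline{D}}
     {\Delta; \Gamma \vdash \text{new } C(\overline{e}) : C} \quad \text{(T-New)}
\]

\[
\frac{\Delta; \Gamma \vdash e_0 : D \qquad D <: C}
     {\Delta; \Gamma \vdash (C)e_0 : C} \quad \text{(T-UCast)}
\]

\[
\frac{\Delta; \Gamma \vdash e_0 : D \qquad C <: D \qquad C \neq D}
     {\Delta; \Gamma \vdash (C)e_0 : C} \quad \text{(T-DCast)}
\]

\[
\frac{\Delta; \Gamma \vdash e_0 : D \qquad C \not<: D \qquad D \not<: C \qquad \textit{stupid warning}}
     {\Delta; \Gamma \vdash (C)e_0 : C} \quad \text{(T-SCast)}
\]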

Figure 3.30: Expression typing.

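\[
\frac{\mathit{field}(f, T, C) = D\ g \qquad \mathit{fields}(C) = \overline{U}\ \overline{D}\ \overline{g} \qquad g = g_i}
     {\underline{\text{new } C(\overline{e}) : T}.f \rightarrow e_i} \quad \text{(R-Field)}
\]

\[
\frac{C <: D}
     {(D)(\text{new } C(\overline{e})) \rightarrow \text{new } C(\overline{e})} \quad \text{(R-Cast)}
\]

\[
\frac{\mathit{mbody}(m, T, C) = \overline{x}.e_0}
     {\underline{\text{new } C(\overline{e}) : T}.m(\overline{d}) \rightarrow [\overline{d}/\overline{x},\ \text{new } C(\overline{e})/\text{this}]\,e_0} \quad \text{(R-Invk)}
\]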

Figure 3.31: Computation rules.


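\[
\frac{e_0 \rightarrow e_0'}
     {\underline{e_0 : T}.f \rightarrow \underline{e_0' : T}.f} \quad \text{(RC-Field)}
\]

\[
\frac{e_i \rightarrow e_i'}
     {\underline{e_0 : T}.m(\ldots, e_i, \ldots) \rightarrow \underline{e_0 : T}.m(\ldots, e_i', \ldots)} \quad \text{(RC-Invk-Arg)}
\]

\[
\frac{e_0 \rightarrow e_0'}
     {\underline{e_0 : T}.m(\overline{e}) \rightarrow \underline{e_0' : T}.m(\overline{e})} \quad \text{(RC-Invk-Recv)}
\]

\[
\frac{e_0 \rightarrow e_0'}
     {(C)e_0 \rightarrow (C)e_0'} \quad \text{(RC-Cast)}
\]

\[
\frac{e_i \rightarrow e_i'}
     {\text{new } C(\ldots, e_i, \ldots) \rightarrow \text{new } C(\ldots, e_i', \ldots)} \quad \text{(RC-New-Arg)}
\]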

Figure 3.32: Congruence rules.

that are inherited directly, but are accessed indirectly. They are not present directly in methods(T), but there is a trail of overrides and same as relations between the requested method and a method in methods(T). Renaming of component relations is taken into account by looking up the actual relation, which may have a different name than head.

The overrides relation is shown in Figure 3.28. Rule 17 describes the standard overriding relation for subtyping. Rule 19 for the component relation is similar, but it inserts the name of the component relation before the name of the overridden method to distinguish it from other inheritance instances of the same method. In addition, it introduces indirectly inherited methods in the inheriting class. If the method is inherited directly, the indirect version can still be used, and is dynamically bound because of the overrides relation. Rules 18 and 20 make the overrides relation transitive, and take the same as relation into account. Conformance of methods is enforced by rule 21, and rule 22 determines if a method is overridden in a class.
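The way rules 18 and 20 extend the direct overrides facts of rules 17 and 19 can be sketched as a fixpoint computation. The following Python fragment is an illustration only: methods are represented by hypothetical strings such as "C.m", and both relations are given as sets of pairs, which is a simplification of the formal model.

```python
def close_overrides(direct_overrides, same_as):
    """Saturate the overrides relation under rules 18 and 20.

    Both arguments are sets of pairs (a, b), read as "a overrides b" and
    "a same as b" respectively. The same-as pairs are taken as given.
    """
    overrides = set(direct_overrides)
    while True:
        derived = set()
        # Rule 18: d overrides e, and (c overrides d or c same as d),
        # implies c overrides e.
        for (d, e) in overrides:
            for (c, d2) in overrides | same_as:
                if d2 == d:
                    derived.add((c, e))
        # Rule 20: c overrides d, and d same as e, implies c overrides e.
        for (c, d) in overrides:
            for (d2, e) in same_as:
                if d2 == d:
                    derived.add((c, e))
        if derived <= overrides:      # nothing new: fixpoint reached
            return overrides
        overrides |= derived
```

With the direct facts "C.m overrides D.n" and "D.n overrides E.o", the sketch derives "C.m overrides E.o", which is the trail that method lookup follows.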

The same as relation is shown in Figure 3.29. Rules 23 and 24 state that the same as relation is reflexive and transitive. Rules 25 and 26 state that a renamed method is the same as the method with the previous name. As for overriding, the name of the component relation is added for methods renamed in a component relation.

3.6.2 Expression Typing

The expression typing is very similar to that of Featherweight Java, and is shown in Figure 3.30. The underlined expressions e : T are the result of the type elaboration. In such an expression, T is the static type of e. The ∆ environment is used for the lookup of component parameters, and is similar to the environment used in Featherweight GJ for the lookup of type parameters. Similar to methods, instance variables are represented by P T f where P is the parent class, T is the type, and f is the name of the instance variable.

Invocations of the field, fields, and mtype functions have been modified to take the static type of the target into account. To avoid ambiguities, rules T-Field and T-Invk are changed to require that the name of the instance variable or method does not contain a dot, such that the rules do not apply to indirectly inherited methods and instance variables. They are dealt with by the T-Comp rule. As a result, only one rule is applicable for any expression. Rule T-New does not have to be modified because classes with component parameters are abstract in the formal model.

Rules T-Param and T-Comp are new. Component parameters are typed by T-Param, and component references and indirectly inherited methods are typed by T-Comp.

3.6.3 Reduction Rules

Figure 3.31 shows the computation rules, while Figure 3.32 shows the congruence rules. Aside from the type elaboration, and the extra argument for the static type in mbody, field, and fields, the rules are the same as in Featherweight Java.

Note that there are no reduction rules for component references and component parameters. Component parameters cannot occur during the evaluation of a program because they are substituted at compile-time. That also means that the ∆ environment is not needed. A separate rule for component references is not required because indirectly accessed methods and instance variables are treated as methods and instance variables with one or more dots in their name.
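To make the shape of the computation and congruence rules concrete, the following Python sketch implements a small-step reducer for field accesses and casts only. It is a strong simplification and not the model of this chapter: there is no static type annotation from the elaboration, no method invocation, and only a single superclass per class. The class table `CLASSES` and all class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class New:            # new C(e1, ..., en)
    cls: str
    args: tuple


@dataclass(frozen=True)
class Field:          # e.f
    target: object
    name: str


@dataclass(frozen=True)
class Cast:           # (C)e
    cls: str
    target: object


# Hypothetical class table: name -> (superclass, own fields in constructor order).
CLASSES = {
    "Object": (None, ()),
    "A": ("Object", ()),
    "Pair": ("Object", ("fst", "snd")),
}


def is_subtype(c, d):
    while c is not None:
        if c == d:
            return True
        c = CLASSES[c][0]
    return False


def fields(c):
    if c is None:
        return ()
    sup, own = CLASSES[c]
    return fields(sup) + own


def is_value(e):
    return isinstance(e, New) and all(is_value(a) for a in e.args)


def step(e):
    """Perform one reduction step; return None for values and stuck expressions."""
    if isinstance(e, Field):
        if isinstance(e.target, New):                 # R-Field
            names = fields(e.target.cls)
            if e.name not in names:
                return None
            return e.target.args[names.index(e.name)]
        t = step(e.target)                            # RC-Field
        return None if t is None else Field(t, e.name)
    if isinstance(e, Cast):
        if isinstance(e.target, New):                 # R-Cast
            return e.target if is_subtype(e.target.cls, e.cls) else None
        t = step(e.target)                            # RC-Cast
        return None if t is None else Cast(e.cls, t)
    if isinstance(e, New):                            # RC-New-Arg, leftmost argument first
        for k, a in enumerate(e.args):
            if not is_value(a):
                t = step(a)
                if t is None:
                    return None
                return New(e.cls, e.args[:k] + (t,) + e.args[k + 1:])
    return None


def evaluate(e):
    """Reduce until no rule applies and return the normal form."""
    while True:
        nxt = step(e)
        if nxt is None:
            return e
        e = nxt
```

In this sketch, a failing downcast such as `Cast("Pair", New("A", ()))` is a normal form that is not a value, which corresponds to the second case of the type soundness theorem.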

3.6.4 Proof Of Type Soundness

The proof of type soundness is based on that of Featherweight Java. We proved that our inheritance mechanism satisfies all assumptions that the proof makes about the inheritance mechanism. This way, we could reuse most of the proof. In this section, we briefly present the most interesting lemmas and theorems.

Lemmas 3.6.1 and 3.6.2 state that the transformations of the method bodies performed during method lookup are type safe.

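Lemma 3.6.1 The transformation of the method body of a method inherited via a subtyping relation is type safe.

\[
\begin{array}{c}
\text{class } T(\alpha) \ldots \;\wedge\; \alpha : B \rightarrow D,\ x : X,\ \text{this} : T \vdash e : E \;\wedge\; \delta \leq \alpha \;\wedge\; \text{class } S(\beta) \ldots \text{subtype } T(\delta) \ldots \\
\Downarrow \\
x : X,\ \text{this} : S \vdash \tau(\delta, \alpha, e) : F \;\wedge\; F <: E
\end{array}
\]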


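Lemma 3.6.2 The transformation of the method body of a method inherited via a component relation is type safe.

\[
\begin{array}{c}
\text{class } T(\alpha) \ldots \;\wedge\; \gamma : G,\ x : X,\ \text{this} : T \vdash e : E \;\wedge\; \delta \leq \alpha \;\wedge\; \text{class } S(\beta) \ldots \text{component } T(\delta)\ i \ldots \\
\Downarrow \\
x : X,\ \text{this} : S \vdash [\text{this}{:}S.i/\text{this}{:}T]\,\tau(\delta, \alpha, e) : F \;\wedge\; F <: E
\end{array}
\]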

Theorems 3.6.3, 3.6.4, and 3.6.5 are the subject reduction, progress, and type soundness theorems. The difference with the corresponding theorems of Featherweight Java is the use of an elaborated program.

Theorem 3.6.3 (Subject Reduction) For a well-typed expression e of an elaborated program: if Γ ⊢ e : C and e → e′, then Γ ⊢ e′ : C′ for some C′ <: C.

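Theorem 3.6.4 (Progress) Suppose e is a well-typed expression in the evaluation of an elaborated program.

1. If e includes $\underline{\text{new } C_0(\overline{e}) : T}.f$ as a subexpression, then $\mathit{fields}(T) = \overline{P}\ \overline{C}\ \overline{f}$ and $f \in \overline{f}$ for some $\overline{P}$, $\overline{C}$, and $\overline{f}$.

2. If e includes $\underline{\text{new } C_0(\overline{e}) : T}.m(\overline{d})$ as a subexpression, then $\mathit{mbody}(m, T, C_0) = \overline{x}.e_0$ and $\#(\overline{x}) = \#(\overline{d})$ for some $\overline{x}$ and $e_0$.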

Theorem 3.6.5 (Type Soundness) Suppose e is an expression of a well-typed elaborated program: if ∅ ⊢ e : C and e →∗ e′ with e′ in normal form, then e′ is either a value v with ∅ ⊢ v : D and D <: C, or an expression containing (D)new C(e) where not C <: D.

3.7 Related Work

In [OZ05], Odersky and Zenger identify three scalable component abstractions for removing hard references from components to increase their reusability: abstract type members, selftype annotations, and modular mixin composition. Abstract type members and selftypes specify the required services of a component, and mixins perform the composition. But while these abstractions are scalable with respect to the size of the components, they are not scalable in the way components are used. The problem is that both selftypes, and mixins as used in Scala, prohibit any composition involving multiple components of the same kind, or components containing features with the same name. Despite the claim that these abstractions can lift an arbitrary assembly of static program parts to a component system, they fail for our simple example application, which is little more than an assembly of four kinds of static program parts. The authors argue that nesting of classes is essential because otherwise, the amount of wiring would become substantial. This contradicts our findings. In this chapter, we built an application using components without using nested classes. So while nested classes may provide benefits, they are not a requirement for component composition. In both approaches, components are classes, and the result of the composition of components is again a class.

In [vDS05a], we introduced anchored exception declarations to remove hard references from the exceptional specification of a component. They allow the exceptional specification of a method to be declared relative to other methods. This increases both the adaptability and reusability of code using checked exceptions. More specifically, they simplify the reuse of higher-order functions by taking the exceptional behavior of the actual function parameter into account instead of providing an inflexible upper bound that forces the programmer to write many inconvenient and dangerous error handlers.

CaesarJ [MO02], like Scala, uses mixin inheritance for component composition. The authors introduce collaboration interfaces to declare the provided and required methods of the different components. The components that provide the required methods are then combined using mixin inheritance. Virtual classes and nesting of interfaces are used to define the different parts in the composition. The container of the parts represents the composition. CaesarJ components cannot be used as abstract data type components because of the lack of repeated inheritance and a conflict resolution mechanism. Conflicts are avoided using the nested interfaces, but they do not contribute to the abstract data type of the composition itself. CaesarJ also offers features of aspect-oriented programming which are not provided in our approach.

ArchJava [ACN02] uses ports to connect components. A component declares the methods provided and required by a port. Composition of components is done by connecting ports to each other. The difference with our approach is that in ArchJava, ports are used to enforce communication constraints, while component relations are used to compose an ADT from other ADTs. In our approach, a port corresponds to a regular class which is then used in a component using a component relation that inherits all features indirectly. The class representing the port can declare its requirements using instance variables, abstract getter methods, or component parameters. So as a composition mechanism, our inheritance mechanism is more flexible, but it does not enforce communication constraints. Our mechanism for preventing component references (using component classes) is simpler and more effective than that of ArchJava. In ArchJava, this is accomplished by prohibiting the use of the type of a component in the ports and public interfaces of a component type, and types of instance variables. In addition, an exception is thrown if a cast to a component type is performed. We can prevent the use of this, and names of component relations as separate expressions for component classes.

component CTodoModular is compose
    provides IQueue q;
    provides ISupervisor su;
    intro CTodoExtension toDo;
    intro CList list;
    plug list.list into toDo.list;
    plug toDo.q into q;
    plug toDo.su into su;

Figure 3.33: The ComponentJ version of CTodoModular.

class CTodoModular
    component CTodoExtension<CTodoModular> (list) [direct q,su]
    component CList list

Figure 3.34: Our version of CTodoModular.

ComponentJ [SC00] only uses ports for composition of components, and thus is less flexible than our approach. In addition, it is more verbose, as shown by Figures 3.33 and 3.34. In Figure 3.34, the CTodoExtension component is connected to the CList component using a component parameter. Another alternative is to use an instance variable or an abstract getter method.

In [BW05], the authors present a language construct for first-class full-blown relationships. Such a language construct is also advocated by Rumbaugh in [Rum87]. With our inheritance mechanism, this language construct can be replaced by component classes. In this chapter, we used relationships without attributes, but a component for full-blown relationships can be built on top of them. An example implementation is given in Figure 3.35. A PassiveBidiAssociation is a BidiAssociation that cannot be modified from that end of the association. This is necessary to ensure consistency of the relationship.

public abstract class Relationship<FROM,TO,KIND extends Relationship<FROM,TO,KIND>>
    [FromName,ToName]
    (FROM → PassiveBidiAssociation<FROM,KIND> from)
    (TO → PassiveBidiAssociation<TO,KIND> to)

    component BidiAssociation-1-Side<KIND,FROM> (from)
        [X=%FromName%,
         override unregister%FromName%,
         export private set%FromName%,
         register%FromName%]

    component BidiAssociation-1-Side<KIND,TO> (to)
        [X=%ToName%,
         override unregister%ToName%,
         export private set%ToName%,
         register%ToName%]

    public Relationship(from,to)
        setFromName(from)
        setToName(to)

    protected void unregister%FromName%(FROM from)
        super(from);
        setToName(null);

    protected void unregister%ToName%(TO to)
        super(to);
        setFromName(null);

    . . .

public class Attends
    subtype Relationship<Student,Course,Attends> (courses,students)
        [FromName=Student,ToName=Course]
    component UniAssociation<int> [X=Mark]
    . . .

public class Student
    component PassiveBidiAssociation-N-Side<Student,Attends> courses
    . . .

public class Course
    component PassiveBidiAssociation-N-Side<Course,Attends> students
    . . .

Figure 3.35: Full-blown relationships.

In [PN06], support for relationships is provided using aspect-oriented programming. The authors offer a library of relationship aspects, which are similar to our association components. These relationships are then inserted into the application by defining a point-cut for each relationship in the model. Support for static relationships (relationships that are part of the participating classes) is limited because of limitations in AspectJ [KHH+01]. The static relationship aspects add fields to the participating classes, which leads to name clashes in case of multiple relationships in one class. As such, AspectJ cannot currently be used to create abstract data type components. An advantage of this approach would be that components could be added externally, keeping the original classes smaller. That effect, however, can also be achieved using any form of higher order hierarchies as in [OH92, Ost02, Ern03, NCM04, OZ05]. The aspect-oriented approach would use one point-cut for each component, where it is configured and added to a class. So the real power of aspect-oriented programming is not needed, and a simpler approach like ours is preferable.

In the 1997 version of Eiffel [Mey97], the inheritance relation is used both for classification and code reuse. It is possible to duplicate features when inheriting more than once from the same class, which is confusing for classification purposes as argued in Section 3.4.2. The resulting diamond problem for repeated inheritance is often considered to make the language more difficult [BC90, Sak89]. In addition, a subclass can use covariant argument types for a method, or even remove features, which means that a whole-program analysis is required to ensure type safety. In SmartEiffel 2.2 [CRA+05] and the new Eiffel specification [oTC05], the inheritance mechanism has been extended with non-conforming inheritance. In SmartEiffel 2.2, duplication of features and narrowing their visibility is no longer permitted. Using covariant argument types, however, remains possible. SmartEiffel forbids component references by type-checking the code of an inserted class in the context of the inheriting class, but this violates modularity. Because sharing is the default policy for the insert relation, accidental merging of components is possible. Our work can mainly be thought of as an extension of the SmartEiffel inheritance mechanism to allow convenient composition of abstract data types.

Sather [SOM93] and Timor [KHM04] separate types and classes, and the relations between them. Types can inherit from multiple other types, and classes can include other classes for code inheritance. In addition, classes can implement types. We do not favor the mandatory separation of types and classes since it always requires the programmer to use a heavyweight solution.

Timor has support for named subtyping relations [KHM04] to support repeated inheritance. We think this is a bad idea because it does not fit in the classification metaphor. We think it is very confusing for an object to be 1.9 times a CassettePlayer as in their example. They also use the inheritance names to disambiguate conflicting names, but for reusing characteristics this approach is not practical. A severe problem with their mechanism is that name conflicts are automatically resolved by removing direct access to the involved methods. As a result, adding a subtyping relation, or even adding a method to an inherited type can break existing clients without even a warning because conflicts can be introduced. The names of subtyping relations can be used as component references. Timor also has support for reuse variables. Features of the classes referenced by such variables are inherited if they are needed for the types implemented by the class. If they are not inherited, however, they are not available to clients since they are not part of the types via which the class can be used. The mechanism can be seen as delegation-by-value. Reuse variables also reduce the dependency of the implementation of a class on its hierarchy, but the authors do not present this insight.

Traits [SDNB03] not only use a separate relation for code inheritance, but also a separate concept, a trait, for a set of methods that can be reused via code inheritance. Unlike traits, we do not have a separate concept to represent a component; it is just a class. If the component relation could only be used with special building blocks, unanticipated reuse would be impossible. On top of that, programmers must deal with an extra concept which is just a degenerate abstract class. Another motivation for our choice is the possibility to instantiate characteristics. We see no reason to forbid a programmer to create an object that represents a bounded value. In addition, classification of characteristics is necessary. To reuse almost any kind of association, it is necessary to create a hierarchy of association classes. The relation between classes capturing choices like mutability, arity, ... and class Association is a subtyping relation, not just a code inheritance relation. Methods inherited via traits automatically override methods inherited from classes although there is no relation between them. This form of structural subtyping can lead to bugs that are hard to find. In addition, dependencies of traits must be resolved individually, and repeated trait-inheritance is not possible. As such, traits allow far less code reuse than our inheritance mechanism. In [BSD03], traits are used to refactor the Smalltalk collection classes. The authors report a 12% reduction in code size.

In [RT06], Reppy and Turon present trait-based metaprogramming. They add renaming and hiding to traits to allow using a trait more than once in a class. Similar to Eiffel, name conflicts and dependencies must be resolved one at a time. Because traits cannot contain state, the overhead is larger than in Eiffel.

Languages like CLOS [DG87], most mixin-based [BC90] languages like Scala [OZ05], and many others use linearized multiple inheritance. The linearization of the class hierarchy, however, complicates its use [Sny86, CUCH91, SDNB03]. It is not possible to determine the meaning of a single inheritance relation of a class without looking at the others because some of its methods may be overridden by methods of other classes that happen to have the same name. This makes it easy for methods to be overridden by accident [Sny86]. Repeated inheritance, which is an essential language construct for composition of ADTs, is impossible in these languages. The abstract super class of a mixin, however, allows for reusable refinements, which cannot easily be created using our approach. More research is required to decide how abstract super classes should be integrated in our inheritance mechanism.
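Python also resolves multiple inheritance by linearization (the C3 method resolution order), so it can serve to illustrate the accidental overriding described above. The classes in this sketch are hypothetical; the two parents are unrelated and merely happen to share the method name size.

```python
class Stack:
    def size(self):
        return "number of elements"


class Window:
    def size(self):
        return "width and height"


# Same parents, different order in the inheritance clause.
class StackWindow(Stack, Window):
    pass


class WindowStack(Window, Stack):
    pass


def linearization(cls):
    """Return the names of the classes in the method resolution order of cls."""
    return [c.__name__ for c in cls.__mro__]
```

Here `StackWindow().size()` selects the method of Stack, whereas `WindowStack().size()` selects the method of Window. The meaning of the relation to Window can therefore not be determined without looking at the relation to Stack, and one of the two size methods is silently shadowed.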

Cecil [Cha04] supports multiple inheritance. Repeated inheritance, however, is forbidden, and name conflicts result in compilation errors. The language uses properties for instance variables, making it possible to override them. Subtyping and code inheritance relations can be used either separately or combined.


In Self [CUCH91], inheritance relations are given a priority. For relations with identical priorities, name conflicts result in an error. For relations with different priorities, conflicts are resolved automatically by inheriting the feature of the relation with the highest priority. The Sender Path Tiebreaker Rule resolves additional conflicts by giving priority to methods within the same inheritance path in case of ambiguities. Renaming is not supported. Directed resends do not increase the dependency between the implementation and the inheritance hierarchy because they are sent to named slots, which is very similar to using named inheritance relations.
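The priority-based conflict resolution described above can be approximated by the following Python sketch. It is not Self's actual lookup algorithm: parents are modeled as hypothetical (priority, slots) pairs, a larger number is assumed to mean a higher priority, only one level of parents is considered, and the Sender Path Tiebreaker Rule is left out.

```python
class AmbiguousLookup(Exception):
    """Raised when parents of equal priority define the same name."""


def lookup(name, parents):
    """Find `name` in a list of (priority, slots) pairs.

    The definition from the parent with the highest priority wins; a tie
    between parents of that priority is reported as an error.
    """
    candidates = [(prio, slots[name]) for prio, slots in parents if name in slots]
    if not candidates:
        raise KeyError(name)
    best = max(prio for prio, _ in candidates)
    winners = [value for prio, value in candidates if prio == best]
    if len(winners) > 1:
        raise AmbiguousLookup(name)
    return winners[0]
```

For example, with parents `[(2, {"draw": "shape.draw"}), (1, {"draw": "widget.draw"})]` the lookup of draw silently selects `"shape.draw"`, while two parents of priority 1 that both define draw lead to an error.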

C++ [Str91] has limited support for repeated inheritance. A class cannot inherit from the same base class more than once, making it unsuitable for building classes from components. In addition, it has no support for renaming, forcing clients to resolve name conflicts. The language supports separation of subtyping and code inheritance through public and private inheritance.

The Sina/ST language [AT88] offers an interface predicate to determine how calls to an object are dispatched. The type can dispatch calls to the current object or to an object declared in its interface. The predicates (target.method(args)) are matched from left to right. If a call matches the name and argument types of a predicate, it is dispatched to its target. The predicate, while providing a lot of flexibility, also brings with it a lot of complexity. Dynamic binding, however, must explicitly be designed in the super class using a server call.

There are several mechanisms for building hierarchies of inheritance hierarchies [OH92, Ost02, Ern03, NCM04, OZ05]. In these approaches, a hierarchy of classes can be extended by extending the existing classes and introducing new classes. With hierarchy inheritance, extensions to a class of a hierarchy are visible to all other classes of the hierarchy. This approach is complementary to ours: multiple inheritance cannot be used to achieve the benefits of hierarchy inheritance and vice versa. Adding such a mechanism to ours will result in an even higher reusability of code.

In Jigsaw [BL92], inheritance is presented as an operation on modules. The authors define a number of basic operators to model multiple inheritance, mixins, instantiation, and other techniques. Contrary to their approach, we define two highly specialized operators to match the classification and building block metaphors. To model our inheritance mechanism in Jigsaw, operators must be added to model e.g. indirect inheritance and component parameters. Another difference is that their approach is mainly technical, while ours focuses more on methodology by using easy-to-understand metaphors.

In [Sak89], Sakkinen argues that with the possibility of sharing or duplicating state, it is possible that dependent state variables are split because they are not all shared or duplicated. Data groups [Lei98] or method groups [SG95] can be used to prevent this problem. All state within a group should either be shared or duplicated. Integration of this functionality in our inheritance mechanism remains future work.

In [THA05], Tobin-Hochstadt and Allen present a calculus of metaclasses which allows them to build arbitrary hierarchies with instance of relations. This allows them to capture aspects of the world that cannot otherwise be expressed [WF94a]. In addition to the extends relation for subtyping and code inheritance, they use a kind relation to declare that a class is an instance of another class.

The Java Syntactic Extender [BP01] is a complete macro system for Java [G+00]. Our renaming parameters provide only an extremely basic macro system, which can only be used to conveniently rename features of a class.

Several design patterns can benefit from using our component relation. Patterns that require the introduction of certain methods can benefit from defining an appropriate component for each of the participants. Examples are composite, singleton, observer, and memento. Frequently used template method patterns, such as patterns for caching and locking, can also be captured in a component for reuse. The visitor and iterator patterns benefit directly from the use of association or relationship components. The state and adapter patterns cannot currently benefit from our approach because the component would have to be interchangeable at run-time.

Darwin and Lava [Kni99a] by Kniesel can deal with dynamically interchangeable components at run-time, and can thus be used to implement the state, adapter, and decorator patterns. But since the inheritance mechanism does not support repeated inheritance, it cannot be used for abstract data type components. Like our approach, Darwin inheritance is a combination of delegation and inheritance. Integration of the dynamic inheritance of Darwin into our inheritance mechanism remains future work.

A split object [BD96] consists of a collection of pieces. Pieces represent particular viewpoints or roles of the split object, and are organized in a delegation hierarchy. Unlike the split object, however, pieces have no identity. Invoking methods is done by selecting a viewpoint to send the message to. The main difference with our approach is that component relations are typically used to build a single ADT, whereas pieces are used to model different viewpoints on an object. This difference in purpose results in additional technical differences. The hierarchies of both approaches have an opposite order with respect to overriding. For pieces, the leaves are the most specific parts, whereas for component relations, the root (the composition) is the most specific part. In addition, features in pieces cannot be merged, whereas features inherited through different component relations can be merged. Finally, pieces are added dynamically, whereas component relations are declared statically.


3.8 Conclusion

We have shown that current object-oriented programming languages do not offer the abstraction level required to use general purpose classes as building blocks for other classes in a practical manner. This prevents a developer from reusing high-level concepts like associations, bounded values, graphs, ...

We showed which features are necessary to encapsulate and reuse such concepts, categorized them, and showed how current reuse mechanisms support them. We then integrated those features in a new inheritance mechanism.

Our inheritance mechanism is the first to make this kind of reuse practical. By using renaming parameters and making component relations first-class citizens, we eliminate the problems encountered with existing mechanisms. They allow a programmer to easily exploit name patterns, connect components, provide both a simple class interface and lots of functionality, and use components as if they were separate objects. Together, these improvements raise the abstraction level of the programming language, since it is no longer required to create a new language construct or write lots of low-level code to reuse a high-level characteristic. Of course, the component relation can also be used for traditional code inheritance as used in traits, Eiffel, SmartEiffel, Cecil, ...

The case study confirms that our inheritance mechanism yields much better results (21% to 36% reduction) than other inheritance mechanisms (3% to 12% reduction), and delegation (11% to 17% reduction). It also shows that our inheritance mechanism is more robust with respect to extensions of components. In addition, it is still beneficial to reuse small components, or small parts of big components with our inheritance mechanism, contrary to the other techniques.

Part III


Development


Chapter 4


Development

If a man will begin with certainties, he will end in doubts; but if he will be content to begin with doubts, he will end in certainties.

Francis Bacon “Advancement of Learning”

4.1 Introduction

Programming tools, such as compilers, code formatters, and refactoring tools, typically operate on an abstract syntax tree (AST) of the program. First, a lexer transforms the source code of the program into token streams. Then, a parser adds structure by turning these streams into ASTs. Finally, the programming tool operates on the ASTs to perform its task. This process is illustrated in Figure 4.1.

The lexer and parser are typically generated by parser generators such as ANTLR [PQ95], Rats! [Gri06], Elkhound [MN06], PPG [Gri06], JavaCC [SVD96], ... They take a grammar file as input, and produce a lexer¹ and a parser for the language described by that file. To prevent redundancy in the grammar files, most of the current parser generators allow parts of a grammar to be reused in other grammars.

Up to this level, reuse is well supported in traditional systems, but once the abstract syntax tree is generated, the programming tools themselves are implemented from scratch. And since a programming tool must also know at least part of the language rules that apply to the nodes in the abstract syntax tree, this means that

¹ Some parser generators, such as Rats!, do not produce a separate lexer.


104 Programming Language Development

Figure 4.1: Traditional architecture for programming tools.


at least a part of the semantics of the programming language is duplicated and incorporated in these tools.

Duplicating the language semantics in every tool costs a lot of time and money, and leads to bugs in the implementations. For example, querying the Eclipse [GHM+05] and IntelliJ [Jet07] bug databases reveals a lot of bugs related to the language semantics of Java. If these tools could have reused the language semantics, these bugs would have been avoided.

By incorporating the language semantics into a programming tool, that tool is tied to a particular language even if that is not necessary. As a result, it takes a lot of effort to create a set of tools for a new programming language, and programmers demand such tools because they exist for competing programming languages. As such, the language developer is faced with a choice: either create the tools or have a language that almost nobody will use. For example, the developers of JML [LBR00] and Scala [OZ05] must do a lot of work to create plugins for Eclipse in order to attract users.

We experienced the problems mentioned above ourselves while implementing the language constructs presented in Chapters 2 and 3 of this thesis. After we had written a compiler for Cappuccino to extend Java with anchored exception declarations, we noticed we could no longer use our usual programming tools with our new language. In the case of Cappuccino, the difference with Java is very small, yet none of the existing tools could be used anymore. For the compiler for the inheritance mechanism of Chapter 3, we tried to use the traditional approach of putting the language semantics in the compiler instead of the data model of the source code. This led to an unnatural non-object-oriented programming style, making it very difficult to write the compiler. In fact, this made it so difficult that the compiler still is not finished.

To address these problems, we started to develop a new approach for programming language development. The central element in this approach is the Chameleon framework, which is a framework for metamodels of programming languages. This framework presents an abstract view on different families of programming languages, hiding the details of specific programming languages. As a result, programming tools can be written independent of a specific programming language. To add support for a particular programming language, a language module must be created. The language modules take care of the specific details for a programming language.

The approach we present in this chapter is not a final solution for programming language development, but the current state of an ongoing project. There are still many areas to explore in terms of the diversity of supported language constructs and in terms of additional support to facilitate the creation of programming tools. The results so far, however, are very promising. We have concrete language modules for Java and C#, and a variety of tools including an advanced code editor and a basic CASE tool. Due to the enormous amount of implementation work, the designs presented in this chapter are not always completely reflected in the actual implementations. There is an ever increasing need for additional abstractions, and the designs in this chapter are based on our current experience in the project.

Modern Approaches

We now briefly discuss some modern approaches to solve these problems. A more detailed discussion is presented in Section 4.6.

Tools like Polyglot [NCM03], JaCo [ZO01a], and JastAdd [HM03] enable the construction of extensible compilers. Language extensions can be built by extending an existing compiler, and overriding the behavior for specific nodes in the AST. But the compilers and their type checkers cannot be used to make programming tools independent of the programming language because they do not offer a unified interface to the language semantics. Their techniques for providing extensible visitors, however, remain useful in our approach.

The Meta Object Facility (MOF) [OMG06] is a standard for managing metadata, and is mainly used for managing metamodels of modeling languages, such as UML, and models that are instances of these metamodels. One of its goals is to offer a unified view on modeling languages. The language to define metamodels for these modeling languages, however, is not powerful enough to create frameworks of metamodels. As a result, there is only one unified view, which is far too abstract to create advanced tools. In addition, not all of the semantics of a modeling language ends up in the implementation of that language, as tools for generating implementations from the specifications are limited.

Overview

In Section 4.2, we give an overview of the architecture of Chameleon. We then discuss one of our tools in Section 4.3 to present the requirements that tools put on the framework. In Section 4.4, we discuss the creation of language modules along with their requirements. We present the Chameleon framework in Section 4.5. We discuss related work in Section 4.6, and we conclude in Section 4.7.

4.2 The Architecture of Chameleon

Chameleon is a framework for metamodels of programming languages. It acts as a bottleneck interface between programming tools and programming languages. The abstractions provided by the framework allow programming tools to do their job without limiting themselves to a fixed set of languages. By implementing the code of the framework and the metamodels, we ensure that all semantics are available in the implementations.


In this section we present an overview of the architecture of Chameleon. We briefly present the different stakeholders and their responsibilities and requirements. These are then presented in more detail in the following sections.

Figure 4.2 shows the architecture of Chameleon. The striped arrows mean that the element at the source of the arrow uses the element at the target of the arrow. The continuous arrows denote implements relations. The boxes with sharp corners represent software modules, while the boxes with rounded corners serve only to group similar elements.

There are four stakeholders involved in the architecture: the tool developer, the language developer, the framework developer, and the extension developer.

The tool developer develops programming tools that, if possible, are independent of the programming language used by clients of these tools. Examples are code editors, CASE tools, refactoring tools, documentation generators, and analysis tools. Because of the enormous difficulty of his job, the tool developer has more requirements than the other parties. On the one hand, he needs a model that is abstract enough to cover the widest possible range of programming languages. On the other hand, the model must be sufficiently concrete to be able to offer advanced functionality. Other requirements include support for all language semantics, the ability to attach additional metadata to the model, and input and output modules. If the tool developer needs functionality that is not available in the framework, he must create a tool extension rather than hardcoding the functionality for a fixed set of languages. A tool extension provides a general interface that must be implemented to add support for a programming language to the tool. It can include classes and interfaces, but also configuration files. In the ideal case, the framework provides all required functionality, and no tool extension is needed.

The framework developer is the key to success. He provides the bottleneck interface between the programming tools and the programming languages. His responsibilities are providing abstractions that allow the tools to do their job, and giving the language developer as much freedom as possible to implement new languages. As such, his biggest challenge is to choose the right level of abstraction. A higher level of abstraction allows more languages to be modeled, but can make it more difficult or impossible to write certain tools, as less information is exposed. To maintain a good balance, the framework developer must regularly collaborate with tool developers and language developers.

The language developer is responsible for implementing a language module. He must implement the input and output modules, language constructs that are not supported by the framework, and all the semantic rules of the language. To create new language constructs, he needs abstractions that are general enough to cover the widest possible range of language constructs. In addition, he needs support to quickly experiment with new language constructs and languages.

The extension developer is responsible for implementing the requirements of a tool that are not provided by the framework for one programming language.


Figure 4.2: The architecture of Chameleon.


This way, he adds support for that language to the tool. In practice, the extension developer will often be a tool developer or a language developer.

It is important to note that the framework presented in the remainder of this chapter is not complete, and some of the designs have not yet been implemented in the way they are presented. Currently, the object-oriented part of the framework is implemented, as that is our main interest.

4.3 Programming Tools

To better understand the functionality provided by the framework and the language modules, we start with the point of view of the tool developer. We will study the requirements on the framework and the language modules using our language independent code editor as an example.

4.3.1 Code Editor

The most mature tool we currently have is a code editor for the Eclipse IDE [GHM+05]. The editor is independent of the programming language used, and supports syntax highlighting, code folding, an outline view², hyperlinks, project management, reporting syntax errors, and dynamic loading of language modules. In the rest of this section we will discuss the most important techniques to achieve this independence.

Figure 4.3 shows an overview of the Eclipse plugin. The editor in the user interface is somehow connected to a number of projects; the details are unimportant. A project can be customized by optionally attaching a project nature to it. Its interface only exposes a connection with the project. The data model of the program in the project, its source code, is connected to our custom ChameleonProjectNature, which acts as an entry point for the model. If, for example, the editor or the outline view needs access to the current compilation unit, it obtains it through the project nature.

Because the editor, or any other tool, needs an entry point to the model, there is a need for a unified entry point in the framework that is used by all supported languages. As most programming languages use a namespace as the highest hierarchical element, that concept is chosen to be the entry point. This reveals a requirement on the framework that we will also encounter further on. Tools need a minimal set of structural elements, such as namespaces and files, to perform their job. Therefore, the framework must provide a unified structure for concepts that are present in most programming languages.

The plugin also defines an interface for its tool extension, which allows the plugin to query the name and the version of the language module. In addition, the plugin defines an XML schema for configuring the syntax highlighting and outline view. To add support for a specific programming language, the tool extension must be implemented, and a configuration file must be provided. The name of the tool extension class is predefined, contrary to that in the figure, and the name of its package must be the first part of the name of the jar file containing the extension and the language module.

² An outline view is a tree view of all the most important elements of a file such as classes, methods, and fields.

Synchronization of the Data Models

The data model of the editor and the Chameleon model are synchronized. Changes in the editor are reflected in the Chameleon model, and changes in the Chameleon model, for example due to refactorings, are reflected in the editor.

The editor is connected to the Chameleon model in two ways as illustrated in Figure 4.4. First, the document of the editor, which is the data model for the user interface, is connected to the compilation unit that is edited. This connection serves as an entry point to the model. Second, individual elements in the model are connected to their textual representation in the editor.

The link from the Chameleon model back to the data model of the editor reveals another requirement. Because the framework may not have any dependencies on specific tools, a tool must be able to attach additional metadata to elements of the Chameleon model.

In Eclipse, Positions can be attached to a document. A Position represents a text range, and is managed by the editor. If text is added or removed, all positions that overlap the edited region are updated. In addition, all positions whose text is completely removed are deleted. An object of the subclass ChameleonPosition connects a text range to an element in the model. The position tag is registered under a predefined name to distinguish it from other kinds of positions, such as the ones used later on for supporting hyperlinks.
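The requirement that tools attach metadata under a name can be illustrated with a minimal Java sketch. The names `TaggedElement`, `TextRange`, and the tag name used in the usage note are hypothetical; they are neither the Chameleon API nor the Eclipse API, and `TextRange` is only a simplified stand-in for an editor-managed position.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a text range that an editor would manage.
class TextRange {
    final int offset;
    final int length;

    TextRange(int offset, int length) {
        this.offset = offset;
        this.length = length;
    }
}

// Hypothetical model element. Tools attach metadata under a name, so the
// framework needs no compile-time dependency on any particular tool.
class TaggedElement {
    private final Map<String, Object> tags = new HashMap<>();

    void setTag(String name, Object tag) {
        tags.put(name, tag);
    }

    Object getTag(String name) {
        return tags.get(name);
    }

    boolean hasTag(String name) {
        return tags.containsKey(name);
    }
}
```

An editor could then call `element.setTag("editor.position", new TextRange(10, 5))` and later retrieve the range by the same name, while another tool registers its own metadata under a different name.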

It is important to note that the Chameleon model is the data model of the application. The data model of the editor is only used to let the text editor do its job. Other functions such as the outline view, refactorings, and even syntax highlighting, directly use the Chameleon model.

A reconciler is used to propagate changes from the editor to the model. Reconcilers are supported by Eclipse, and are basically strategy objects that are used to process the document at certain times depending on the policy used, for example two seconds after the last edit. They take care of tasks like updating the syntax highlighting and code folding. The ChameleonReconciler records all edits, and updates the model by reparsing the modified parts.

Propagating changes from the model to the editor puts additional requirements on Chameleon. First, the models must send events when they are updated such that tools can update their views. Second, the editor needs the output module to obtain a textual representation of the updated model, and replace the current textual representation in the document.
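The two requirements can be sketched together in Java. This is a simplified illustration under our own assumptions: `ModelListener`, `OutputModule`, `ModelElement`, and their methods are hypothetical names, and a real model element would carry far more state than a name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener that a tool registers to be notified of model updates.
interface ModelListener {
    void elementChanged(ModelElement element);
}

// Hypothetical output module: converts (part of) a model into source text.
interface OutputModule {
    String toCode(ModelElement element);
}

// Hypothetical model element that sends an event whenever it is updated.
class ModelElement {
    private String name;
    private final List<ModelListener> listeners = new ArrayList<>();

    ModelElement(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void addListener(ModelListener listener) {
        listeners.add(listener);
    }

    // Updates the element, for example as the result of a refactoring,
    // and notifies every registered tool.
    void rename(String newName) {
        name = newName;
        for (ModelListener listener : listeners) {
            listener.elementChanged(this);
        }
    }
}
```

In this sketch an editor would register a listener that asks the output module for the new text of the changed element and replaces the corresponding region of its document.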


Figure 4.3: Overview of the Eclipse editor plugin.


Figure 4.4: Design of connection between the editor and the metamodel.

Figure 4.5: Example connection between the editor and the metamodel.


Syntax Highlighting

Syntax highlighting is not done using the data model of the text editor. Eclipse offers support for custom syntax highlighting, but it uses grammar based rules to determine the highlighting, and uses the text document as its data model. This means that it is not possible to use the full semantics of the model in the highlighting. For example, it is not possible to create a highlighting that uses different styles for value types and reference types.

Because syntax highlighting is performed by a reconciler, it is possible to replace it entirely with another reconciler. The Chameleon presentation reconciler performs highlighting based on rules which are provided by the language specific tool extensions. Currently, the implemented style rules can only match the type of the element in the model, and the names of its tags, but it is easy to support more complex style rules like the one mentioned above.
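A style rule that matches on the type of an element and on the names of its tags could look roughly as follows. The classes `StyledElement`, `StyleRule`, and `StyleSheet` are hypothetical and only illustrate the matching; the actual rules are read from the configuration file of the tool extension.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical view of a model element: only its type name and tag names matter here.
class StyledElement {
    final String typeName;
    final Set<String> tagNames;

    StyledElement(String typeName, Set<String> tagNames) {
        this.typeName = typeName;
        this.tagNames = tagNames;
    }
}

// Hypothetical rule: matches an element type and, optionally, the name of a tag.
class StyleRule {
    final String typeName;
    final String tagName; // null means that any set of tags is accepted
    final String style;

    StyleRule(String typeName, String tagName, String style) {
        this.typeName = typeName;
        this.tagName = tagName;
        this.style = style;
    }

    boolean matches(StyledElement element) {
        return typeName.equals(element.typeName)
            && (tagName == null || element.tagNames.contains(tagName));
    }
}

// Ordered list of rules; the first matching rule determines the style.
class StyleSheet {
    private final List<StyleRule> rules = new ArrayList<>();

    void add(StyleRule rule) {
        rules.add(rule);
    }

    String styleFor(StyledElement element) {
        for (StyleRule rule : rules) {
            if (rule.matches(element)) {
                return rule.style;
            }
        }
        return "default";
    }
}
```

A semantic rule, such as one that distinguishes value types from reference types, would replace the simple comparison in `matches` with a query on the model element itself.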

Hyperlinks

To help a programmer navigate through the source code, cross-references can be turned into hyperlinks that point to the definition of the referenced element. To allow this functionality to be implemented independent from a particular language, two requirements must be satisfied so the editor can detect cross-references and locate the definition.

First, the Chameleon model must incorporate the semantics of the language to determine which element the cross-reference refers to. To expose this functionality in a unified way, cross-references in a metamodel must implement the CrossReference interface. This interface offers a method to search for the referenced element. In practice, it will always delegate the call to another method, as the lookup of elements is a part of the semantics. For example, in Invocation, the call is delegated to getMethod().

Second, cross-references in the model are marked with metadata by attaching a ChameleonPosition with a predefined name so the editor can detect them. When a language module is loaded, the editor attaches a metadata factory to the input module. The input module then passes the created elements to all metadata factories for attaching metadata. The factory determines if the element is a cross-reference by checking if its class implements CrossReference, and creates a ChameleonPosition based on the lexical coordinates. The position marks the region that must be clicked to follow the hyperlink. Note that this position does not necessarily cover the entire region of the element in the source code. For example, for a method invocation, only the name of the method is turned into a hyperlink.
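Both requirements can be combined in a small Java sketch. The names CrossReference, Invocation, getMethod, getReferencedElement, and ChameleonPosition come from the text; everything else, including the fields, the `Element` base class, the tag name, and the signature of the factory method, is an assumption made for illustration. The lookup is faked with a stored target, whereas in the framework it is part of the language semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed base class for model elements with named metadata tags.
class Element {
    private final Map<String, Object> tags = new HashMap<>();

    void setTag(String name, Object tag) { tags.put(name, tag); }
    Object getTag(String name) { return tags.get(name); }
}

// Unified interface for cross-references.
interface CrossReference {
    Element getReferencedElement();
}

class Method extends Element {
    final String name;
    Method(String name) { this.name = name; }
}

// A method invocation delegates the unified call to its own lookup method.
class Invocation extends Element implements CrossReference {
    private final Method target; // stands in for a real semantic lookup

    Invocation(Method target) { this.target = target; }

    Method getMethod() { return target; }

    @Override
    public Element getReferencedElement() { return getMethod(); }
}

// Simplified position: the clickable region in the document.
class ChameleonPosition {
    final int offset;
    final int length;
    ChameleonPosition(int offset, int length) { this.offset = offset; this.length = length; }
}

// Hypothetical metadata factory that the input module calls for every created element.
class HyperlinkMetadataFactory {
    static final String TAG = "hyperlink"; // assumed predefined name

    void process(Element element, int offset, int length) {
        if (element instanceof CrossReference) {
            element.setTag(TAG, new ChameleonPosition(offset, length));
        }
    }
}
```

The editor only depends on `CrossReference` and on the tag name, so it can follow hyperlinks for any language whose metamodel implements the interface.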

The editor detects a hyperlink under the mouse cursor by looking for tags, which are ChameleonPositions, registered with the predefined name. By invoking getReferencedElement, the editor obtains the definition of the referenced element. It then queries it for the enclosing compilation unit, opens the appropriate file, and jumps to the correct position in the document.

Figure 4.6: Code completion in our code editor.

Code Completion

An important feature in modern IDEs is code completion. When a part of the name of an element is written, the programmer can invoke the code completion feature to automatically complete the name, or offer a list of possible matches if there are multiple possible names. Figure 4.6 shows code completion at work in our code editor.

Just like the hyperlink feature, code completion depends on the lookup mechanism of Chameleon. But instead of finding an element whose name equals a given string, the lookup mechanism must now find all appropriate elements in scope whose name starts with that string. Adding this feature to the editor showed again that communication is required between framework developers and tool developers. Initially, the lookup mechanism could only return exact matches. Putting prefix matching in a tool extension would have resulted in a disaster, as the lookup mechanism for each language would have to be implemented again. By modifying the framework and allowing the lookup mechanism to be parameterized with a custom search criterion and stop condition, it can be reused to support code completion.
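The idea of parameterizing the lookup can be sketched in Java as follows. This is a strongly simplified illustration: the `Scope` class, its use of plain strings as declarations, and the use of `Predicate` for both parameters are our assumptions and do not reflect the actual lookup classes of the framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical lexical scope: declared names plus an optional enclosing scope.
class Scope {
    private final Scope parent;
    private final List<String> names = new ArrayList<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    void declare(String name) {
        names.add(name);
    }

    // Walks outward from this scope and collects every name accepted by the
    // search criterion. Before entering a scope, the stop condition is
    // evaluated on the results collected so far.
    List<String> lookup(Predicate<String> criterion, Predicate<List<String>> stop) {
        List<String> result = new ArrayList<>();
        for (Scope scope = this; scope != null && !stop.test(result); scope = scope.parent) {
            for (String name : scope.names) {
                if (criterion.test(name)) {
                    result.add(name);
                }
            }
        }
        return result;
    }
}
```

An exact lookup passes an equality criterion and stops as soon as a match is found, while code completion passes a prefix criterion and a stop condition that never holds, so that all enclosing scopes contribute candidates.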

4.3.2 CASE Tool

In addition to the code editor, we are working on a CASE tool for Chameleon. It currently offers support for creating and manipulating class diagrams. The diagrams are not UML diagrams, but diagrams of the source code, as a Chameleon model is used as the underlying data model.

Because the tool is designed to create class diagrams of object-oriented programs, it has knowledge about concepts such as classes, class members, and inheritance relations. So contrary to the code editor, which can work with any language, the CASE tool can only work with object-oriented programming languages. Therefore, Chameleon must provide different views for different families of programming languages, which all have their specific tools. This is discussed in more detail in Section 4.5.1.

4.3.3 Other Tools

The editor is not the only tool that was developed. There are several other tools that use the Chameleon framework, or the metamodel for Java.

JProver [Luy06, SV04] is a static program verifier for Java programs annotated with specifications. It proves whether the implementation of a method fulfills its contract. Because of the complexity of the task, its verification possibilities are limited. The tool does not work directly on an instance of a metamodel, but first translates it to an intermediate language which is more suitable for a theorem prover. Support for additional language constructs, and thus programming languages, can be added by providing translators for the specific elements of that language in a tool extension.

TestGen [Sch06] is a tool for generating unit test classes for Java based on the specifications of classes and methods.

In [Van06], Java is extended with modular aspect-oriented programming features. Methods are composed of a begin, middle, and end block to enable the modification of methods by adding blocks. Method sets can be defined in a class to add such blocks to an entire set of methods at once. A preprocessor that translates the programs to Java was developed using Chameleon.

The preprocessor for Cappuccino, which extends Java with anchored exception declarations, is built by extending the metamodel for Java. After reading a program, it transforms the resulting model by removing anchored exception declarations and adding exception handlers to catch exceptions that cannot be signalled. Finally, it writes the program to Java files for compilation by the standard Java compiler.

Similarly, the preprocessor for Draco [VRUB03], which is a component framework, is also built on top of the Java metamodel. It transforms programs with language constructs for supporting components to equivalent Java programs by generating the corresponding infrastructural code.

The DeepCompare [Van07] tool uses our Java metamodel as its data model. The tool finds similarities in the structure of two Java programs, and generates a configuration file that is used by a run-time system for state transfer.


Figure 4.7: Overview of an input module for Java.

4.4 Language Modules

To add support for a programming language, a language module must be created. A language module provides input and output modules, adds classes for language constructs that are not yet supported, and provides the required factories.

4.4.1 Input Module

As discussed in Section 4.3, a language module must provide an input module that constructs a metamodel from the source code of a program. The ModelFactory class acts as a facade for the input process as shown in Figure 4.7 along with an example input module for Java.

A model factory gives programming tools the functionality required to construct models from an input source in several ways. It can be used to create a model from a set of files, to add new files to a model, and to create a partial model from a piece of source code. The latter functionality is required to update models after modifications in the source code without having to process the entire compilation unit from scratch. A model factory usually uses generated lexers and parsers to create an abstract syntax tree, which is converted to a Chameleon model by traversing the tree. To hide the specific lexer and parser from the client, their exceptions are converted into standard Chameleon exceptions.
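The facade role and the conversion of exceptions can be sketched in Java. The classes below are hypothetical: `GeneratedParser` is a toy stand-in for a generated parser, `ModelParseException` stands for a framework-level exception, and the model is reduced to a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical framework-level exception, the only one that tools get to see.
class ModelParseException extends Exception {
    ModelParseException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Toy stand-in for a generated parser with its own, tool-specific exception type.
class GeneratedParser {
    static class SyntaxError extends RuntimeException {
        SyntaxError(String message) {
            super(message);
        }
    }

    List<String> parse(String source) {
        if (!source.trim().endsWith("}")) {
            throw new SyntaxError("missing '}'");
        }
        List<String> ast = new ArrayList<>();
        ast.add(source.trim());
        return ast;
    }
}

// Hypothetical facade: clients never see the lexer, the parser, or their exceptions.
class SimpleModelFactory {
    private final GeneratedParser parser = new GeneratedParser();
    private final List<String> model = new ArrayList<>();

    // Parses one piece of source code and adds the result to the existing model.
    void addSource(String source) throws ModelParseException {
        try {
            model.addAll(parser.parse(source));
        } catch (GeneratedParser.SyntaxError e) {
            throw new ModelParseException("cannot parse input: " + e.getMessage(), e);
        }
    }

    List<String> getModel() {
        return model;
    }
}
```

Because the parser is only referenced inside the facade, it can be replaced by a parser from another generator without affecting the tools.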


Figure 4.8: Extension of an input module.

But converting source code to model elements is not enough to create a valid model. If a language uses built-in types and operators, such as int and + in Java, they must be artificially added to the model as well. After all, there are no source files that can be used to create them since they are only described in the language specification. The model factory will add the required additional types to the model, and populate them with the operators. In addition, operators such as the + for string concatenation in Java are added to the appropriate existing classes.
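The following Java sketch illustrates this step on a deliberately tiny model. `TypeTable` and `BuiltIns` are hypothetical names, and a type is reduced to the set of names of its members and operators; the real model factory creates full type and operator elements.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, heavily simplified model: type name -> names of its members and operators.
class TypeTable {
    private final Map<String, Set<String>> members = new LinkedHashMap<>();

    void addMember(String type, String member) {
        members.computeIfAbsent(type, t -> new LinkedHashSet<>()).add(member);
    }

    boolean has(String type, String member) {
        return members.containsKey(type) && members.get(type).contains(member);
    }
}

// Adds elements that exist only in the language specification, not in any source file.
class BuiltIns {
    static void addTo(TypeTable table) {
        // An artificial type for int, populated with its arithmetic operators.
        for (String operator : new String[] { "+", "-", "*", "/" }) {
            table.addMember("int", operator);
        }
        // An existing class receives an additional operator: string concatenation.
        table.addMember("java.lang.String", "+");
    }
}
```

After parsing the source files, the factory would call `BuiltIns.addTo(table)` once, so that lookups for `int` or for `+` on strings succeed like any other lookup.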

Two techniques are used to make the input modules themselves more reusable: the grammar is reused, and the transformers are reused. This is especially important for the construction of language extensions. Take for example Cappuccino, which adds anchored exception declarations to Java. Creating a complete input module from scratch would take a lot of effort while the difference with Java is very small. The techniques are illustrated in Figure 4.8, which shows how the input module of Cappuccino is built using the input module of Java.

By using a modern parser generator such as ANTLR, the grammar of the extended language can be constructed by extending the grammar of the base language. In the case of Cappuccino, the rule for method definitions is overridden, and rules for exception clauses, exception declarations, and anchored exception declarations are added.

The transformers can be reused by splitting them in parts. Each transformer is responsible for transforming one category of elements, such as types, methods, statements, or expressions. A transformer factory is then used by both the model factory and the transformers themselves to obtain the various transformers.

For example, the TypeTransformer uses the factory to get a MethodTransformer instead of constructing one itself. The Cappuccino input module can now add support for anchored exception declarations by providing a new MethodTransformer which overrides the method for transforming the exception clause. To enable the Java transformers to use the new MethodTransformer, a new factory is created and the method for providing a method transformer is overridden.
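This combination of a factory and overridable transformers can be sketched in Java. The class names TypeTransformer and MethodTransformer come from the text; the method signatures, the string-based representation of syntax trees and model elements, and the `Cappuccino...` subclasses are assumptions made for illustration. The `like` prefix is used here only as a marker for an anchored exception declaration.

```java
// Transformer for methods of the base language (signatures are assumed).
class MethodTransformer {
    String transformExceptionClause(String clause) {
        return "absolute:" + clause;
    }

    String transform(String name, String clause) {
        return name + "[" + transformExceptionClause(clause) + "]";
    }
}

// Factory through which all transformers obtain each other.
class TransformerFactory {
    MethodTransformer getMethodTransformer() {
        return new MethodTransformer();
    }
}

// Asks the factory for a method transformer instead of constructing one itself.
class TypeTransformer {
    private final TransformerFactory factory;

    TypeTransformer(TransformerFactory factory) {
        this.factory = factory;
    }

    String transform(String typeName, String methodName, String clause) {
        return typeName + "." + factory.getMethodTransformer().transform(methodName, clause);
    }
}

// Extension: only the transformation of the exception clause is overridden.
class CappuccinoMethodTransformer extends MethodTransformer {
    @Override
    String transformExceptionClause(String clause) {
        if (clause.startsWith("like ")) {
            return "anchored:" + clause.substring("like ".length());
        }
        return super.transformExceptionClause(clause);
    }
}

// Extension factory: makes the unchanged base transformers use the new transformer.
class CappuccinoTransformerFactory extends TransformerFactory {
    @Override
    MethodTransformer getMethodTransformer() {
        return new CappuccinoMethodTransformer();
    }
}
```

Passing a `CappuccinoTransformerFactory` to the unmodified `TypeTransformer` is enough to make it handle the extended exception clauses, which is the reuse that the design aims for.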

The Cappuccino transformer factory also introduces a new factory method. The transformation of expressions in an anchored exception declaration is slightly different from the transformation of an expression in the implementation. Because type names can be used as expressions in anchored exception declarations, VariableOrTypeReferences must be added instead of VariableReferences. The ExcClauseExprTransformer therefore inherits from ExprTransformer, and overrides the behavior for variable references. Note that this new expression transformer is only used for transforming expressions in an anchored exception declaration. In all other cases the existing Java expression transformer is used.

Adding Metadata

As seen in Section 4.3.1, some tools must be able to add metadata to the elements in the model. Currently, this is done by adding a metadata factory to the transformer factory. The transformers, which are linked to the transformer factory, pass every created element to the metadata factories for processing.

For the current set of tools, this design is sufficient, as all input is done by parsing and processing source code. If language elements are added by constructing an object and adding it to the model, no metadata is currently attached. Examples of tools that do not require input from source code are CASE tools and refactoring tools. This becomes problematic if multiple tools use the same data model, as one tool may introduce elements that do not contain metadata required by other tools. A future approach may be to pass elements to the metadata factories when they are added to the model to ensure that all metadata is present.


Figure 4.9: A support layer for reusing language constructs.

Binary Input

Being able to construct a model from source code, however, is not always sufficient. Applications are not completely written from scratch, but make extensive use of the standard library of a language, and of third-party libraries. If the language uses a form of header files like C++, there is no problem. By also reading the header files of the libraries, a complete model of the program can be constructed. But if there are no header files, like for example in Java, the model factory must also allow binary files as input.

Because we had no grammars for binary parsers for e.g. Java and C#, we have created header files for the standard library. We wrote a tool in both languages that takes a list of fully qualified class names, and writes a source code file that represents the interface of the class. The list of classes in the standard library is easily obtained from the development kits of the languages.

4.4.2 Language Constructs

A language module must of course provide classes that model the language constructs of that language. To promote reuse, classes for language constructs and semantics should be gathered in an intermediate support layer, as shown in Figure 4.9. For example, since the while statement has the same meaning in Java and C#, a single class can be used by both languages.

If the language has constructs that are not available in the element repository, the language module must either provide them, or map them onto existing constructs. The latter technique is discussed in more detail in Section 4.5.1.


It is important that the framework provides enough abstractions to keep the creation of language elements manageable. If there is no abstraction under which the element can be placed, more work is required.

Take for example anchored exception declarations. In its first stages, the framework did not offer many abstractions above the Java language, and thus the exception specification was modeled as a collection of type references. To add anchored exception declarations, abstractions such as exception clauses and exception declarations had to be added. It is possible to create a subclass of Method and add these elements, but the old list of type references cannot be removed. It also means that tools that query a method about its exceptional behavior, e.g. to verify whether it conforms to the exceptional behavior of another method, will not get a correct answer, since the tools can only obtain the list of absolute exception declarations, which may or may not be used by the subclass of Method. Therefore, the required abstractions were added to the framework and not to the language module. This underlines the importance of collaboration between framework developers and language developers, as it is impossible to anticipate every possible language construct in advance.

4.4.3 Language Semantics

The language module must also provide classes to represent the semantics of the programming language. It must provide grammar rules, well-formedness rules, typing rules, and evaluation rules. The details and design are explained in Section 4.5.3, as the abstractions that are involved are a part of the framework.

4.4.4 Output Module

The output module converts a model, or part of it, into a string. It can be used to update a code editor, or to write the program to a set of files.

4.5 The Chameleon Framework

In this section, we discuss how the Chameleon framework meets the requirements of the programming tools and language modules. The main responsibilities of the framework are providing abstractions, managing metadata, and providing infrastructure to incorporate the language semantics.

4.5.1 Abstractions

One of the challenges of the framework developer is to offer abstractions that are specific enough for the tools to do their job, and general enough to cover a wide range of programming languages. In this section, we discuss the most important abstractions for the language constructs. The abstractions for the semantics are presented in Section 4.5.3.

Layers of Abstraction

Choosing the right level of abstraction in the framework is very important. Too much abstraction makes it difficult to write tools; too little abstraction makes it difficult to implement new language constructs and programming languages.

Our experience has shown us that a single level of abstraction is not sufficient. The code editor discussed in Section 4.3.1, for example, can work with any programming language, whether it is an object-oriented, functional, or logic programming language. It only has knowledge about namespaces, compilation units, and elements in general. More specific tools, such as our CASE tool, need more information. The CASE tool also knows about types and inheritance relations. Tools for other families of programming languages require yet another view on a model. If Chameleon provided only a single view on programming languages, that view would be a spaghetti of all kinds of widely varying concepts, which would be completely useless for programming tools.

The solution is to use layers of abstraction. The top layer only contains elements that are present in every programming language, such as compilation units and namespaces. More specific layers provide additional elements for a particular family of programming languages. This approach gives tools the choice between supporting any language, a specific family of languages, or even just a single language. Figure 4.10 illustrates the approach, but is by no means a concrete design.

As only the object-oriented layer is currently implemented, the remainder of this section will explain the framework design for object-oriented programming languages.
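The layering idea can be sketched in a few lines of Java. This is an illustration under our own assumptions rather than the Chameleon design: all class names (LayerElement, LayerCompilationUnit, LayerType, LayerTools) are hypothetical. The point is that a language-independent tool is typed against the top layer only, while a tool for the object-oriented family may rely on types and inheritance.

```java
import java.util.ArrayList;
import java.util.List;

/** Top layer (hypothetical): an element only exposes its children. */
abstract class LayerElement {
    private final List<LayerElement> children = new ArrayList<>();

    void add(LayerElement child) { children.add(child); }

    List<LayerElement> children() { return children; }
}

/** Top layer: compilation units are assumed to exist in every language. */
final class LayerCompilationUnit extends LayerElement { }

/** Object-oriented layer: adds types and an inheritance relation. */
final class LayerType extends LayerElement {
    private final String name;
    private final LayerType superType; // null for a root type

    LayerType(String name, LayerType superType) {
        this.name = name;
        this.superType = superType;
    }

    String name() { return name; }

    LayerType superType() { return superType; }
}

final class LayerTools {
    /** Language-independent tool: needs nothing beyond the top layer. */
    static int countElements(LayerElement root) {
        int count = 1;
        for (LayerElement child : root.children()) {
            count += countElements(child);
        }
        return count;
    }

    /** Tool for the object-oriented family: relies on the inheritance relation. */
    static int inheritanceDepth(LayerType type) {
        int depth = 0;
        for (LayerType t = type.superType(); t != null; t = t.superType()) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        LayerCompilationUnit unit = new LayerCompilationUnit();
        LayerType shape = new LayerType("Shape", null);
        LayerType circle = new LayerType("Circle", shape);
        unit.add(shape);
        unit.add(circle);
        System.out.println(countElements(unit));
        System.out.println(inheritanceDepth(circle));
    }
}
```

A tool such as the code editor would only use operations like countElements, whereas a CASE tool would also use operations like inheritanceDepth.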

Unified Structure

One of the requirements of the code editor was a unified view on the model. The structure of a model is similar to that of an abstract syntax tree, and follows the lexical structure of a program. An important difference, however, is that all elements are connected through bidirectional associations. Since the same model will be used for all kinds of programming tools, navigation in both directions is essential. These associations are implemented in the class Element in Figure 4.11, which is the top class for every language element in the metamodel. The associations with the children are left to the specific elements because each element can have a different number of associations with its children.

All associations are made using classes similar to the association components in Chapter 3, which means that it is impossible to break the referential integrity of a model. This is important because bugs caused by a corrupt model are hard to find, and because correctly implementing such associations is not trivial. For example, all handcoded bidirectional associations, which were written by master thesis students, were implemented incorrectly. According to the students who used the code, the bugs caused by the resulting inconsistencies in the models were very difficult to find.

Figure 4.10: Layers of abstraction.

Figure 4.11: The top-level class of model elements in Chameleon.
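To show why such associations are easy to get wrong by hand, the following sketch keeps both ends of a parent/children association consistent through a single mutator. It is a simplified illustration, not the association components of Chapter 3: the class AssocNode is hypothetical, and cycle checks are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical element with a bidirectional parent/children association. */
final class AssocNode {
    private final String name;
    private AssocNode parent;
    private final List<AssocNode> children = new ArrayList<>();

    AssocNode(String name) { this.name = name; }

    String name() { return name; }

    AssocNode parent() { return parent; }

    /** Read-only view: clients cannot modify one end behind our back. */
    List<AssocNode> children() { return Collections.unmodifiableList(children); }

    /** The only mutator: both ends of the association are updated together. */
    void setParent(AssocNode newParent) {
        if (parent == newParent) {
            return;
        }
        if (parent != null) {
            parent.children.remove(this); // unregister at the old parent
        }
        parent = newParent;
        if (newParent != null) {
            newParent.children.add(this); // register at the new parent
        }
    }

    public static void main(String[] args) {
        AssocNode a = new AssocNode("a");
        AssocNode b = new AssocNode("b");
        AssocNode c = new AssocNode("c");
        c.setParent(a);
        c.setParent(b); // moving c also removes it from a's children
        System.out.println(a.children().size() + " " + b.children().size());
    }
}
```

A typical handcoded mistake is to forget the unregistration step, which leaves the old parent with a child that points elsewhere.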

Aside from the lexical structure, a program also has logical structures. For example, the namespaces of types form a logical structure which is populated by different compilation units (files). Another example is the inheritance structure. Contrary to abstract syntax trees, the Chameleon framework also incorporates these logical structures. While an abstract syntax tree also contains the names of superclasses and namespaces, the link itself is not present; without external knowledge it is impossible to retrieve the type and namespace hierarchies. The logical structures are not modeled with regular references, but with crossreferences and a lookup mechanism, as described in Section 4.5.3.

It is important to give the metamodels a unified structure that can be exposed in the interface of the framework. Without that structure, both a programming tool and the framework itself will have trouble navigating through models of different languages.

Figure 4.12 shows the simplified top-level structure of Eiffel, Java, and C#. Eiffel has a flat namespace concept for types. Java, on the other hand, has a hierarchical namespace formed by packages which, according to the language specification, contain compilation units. C#, however, has the most flexible structure for populating its hierarchical namespace with types. In every compilation unit, types can be added to any namespace by nesting namespace declarations. As such, namespaces are more explicitly treated as a logical structure that exists alongside the lexical structure of compilation units.

These structures can be unified by using the C# structure for Eiffel and Java. In Eiffel and Java, every compilation unit gets a namespace declaration associated with the root of the namespace.

Note that this structure is not yet usable for e.g. open classes or partial classes, where the definition of a class can be spread over multiple compilation units. To allow that, the current class for representing types, which is now completely defined in a single lexical block, must be treated similarly to namespaces and be split up into types and type declarations.

Mapping Elements

One of the lessons learned from the implementation of an early metamodel for Java, from which the framework has emerged, was that literally following the specification of a language can lead to a disaster. It results in lots of extra classes for the different kinds of types and expressions, which would make the Chameleon approach very difficult in practice. The most obvious reason is that implementing all these classes for a specific language module takes a lot of effort. But a bigger problem is the number of concepts that have to be provided by the framework such that tools can do their job. The more different classes exist for modeling essentially the same concept, the bigger the tool extensions will be, the more often a tool extension will be required, and the more code is duplicated in different tool extensions. As a result, the entire approach would be much harder to use.

Figure 4.12: Top-level structure of Eiffel, Java, and C#.

For example, value types are modeled in different languages as:

• expanded types in Eiffel

• boolean, byte, short, int, long, char, float, double in Java

• structs in C#³

So if tool A needs to know if a certain type is a value type, it must create a tool extension which must be implemented for each language in order to use the tool for that language. And if independently developed tool B needs the same functionality, the functionality must be implemented again.

These problems can be alleviated by mapping identical concepts to identical (and often generalized) classes in the Chameleon framework and the support layer. This approach allows the implementation class to be reused across different programming languages, and allows programming tools to use a single general concept in the framework instead of relying on tool extensions.

³ In C#, int, long, ... are aliases for structs in the System namespace.


Concepts are not always mapped to a concept that exists in the programming language in use. If the language does not offer a proper target, the concept is mapped onto a concept of another language. Examples are presented further on in this section.

A crucial property of the mapping is that it must be a one-to-one mapping. This means that after transforming the source code of a program to a model, it must also be possible to transform the model back into source code after the tool has done its job. For example, it is not valid to map all plus expressions to a static method plus in a class Math which is artificially created during the construction of the model. After all, it is legal to explicitly declare such a class in your program, which means that two different concepts can be mapped to the same concept, making it impossible to correctly perform the reverse mapping.

The technique alone, however, is not enough to solve the problem. It must also be used by every language module to be effective. If a tool cannot rely on the fact that a certain concept is modeled in a given way, it must still resort to tool extensions. Therefore, a standard must be created to determine how certain concepts must be modeled in a family of programming languages. We will now discuss two mappings to illustrate the principle and the impact of the technique.

Modifiers. Modifiers are an easy and frequently used technique to keep the size of a metamodel manageable. As argued above, different languages model value types in different ways. In the metamodel, a class is given value semantics by adding a ValueType modifier. Instead of creating different subclasses of Type to represent structs in C#, expanded classes in Eiffel, and primitive types in Java, they are all objects of class Type, and have a ValueType modifier. Similar modifiers exist to denote if a class is an interface or an enumerator, or if a method is a constructor, destructor, prefix operator, postfix operator, or infix operator.
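The modifier technique can be sketched as follows. This is a simplification under our own assumptions: the text models modifiers as objects such as ValueType, whereas the sketch uses a hypothetical enum TypeModifier and a hypothetical class ModeledType to keep the example short.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical modifiers; an enum stands in for the modifier classes. */
enum TypeModifier { VALUE_TYPE, INTERFACE, ENUMERATION }

/** One class for all kinds of types; the differences are expressed by modifiers. */
final class ModeledType {
    private final String name;
    private final Set<TypeModifier> modifiers = EnumSet.noneOf(TypeModifier.class);

    ModeledType(String name, TypeModifier... mods) {
        this.name = name;
        this.modifiers.addAll(Arrays.asList(mods));
    }

    String name() { return name; }

    /** Tools ask one language-independent question instead of checking subclasses. */
    boolean has(TypeModifier modifier) { return modifiers.contains(modifier); }

    public static void main(String[] args) {
        ModeledType javaInt = new ModeledType("int", TypeModifier.VALUE_TYPE);
        ModeledType csharpStruct = new ModeledType("Point", TypeModifier.VALUE_TYPE);
        ModeledType javaList = new ModeledType("List", TypeModifier.INTERFACE);
        for (ModeledType t : new ModeledType[] { javaInt, csharpStruct, javaList }) {
            System.out.println(t.name() + " value type: " + t.has(TypeModifier.VALUE_TYPE));
        }
    }
}
```

With this representation, tool A and tool B from the example above share the same check and need no per-language tool extension for it.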

Expressions. Whenever possible, an expression is mapped to a method by exploiting the fact that the keywords for the expressions cannot be used in any other way. For example, instead of creating separate classes for all the arithmetic expressions in Java, we map them onto prefix, postfix, or infix operators, as done in Eiffel. During the creation of the model, artificial classes are added to the model to represent the primitive types. After that, these classes are populated with methods representing the different kinds of arithmetic expressions.

Figure 4.13 illustrates what happens when the Java language specification is followed strictly. A large number of classes must be written in order to model the expressions of the language. Note that these are not all Java expressions, and that intermediate classes have been removed from the hierarchy. The right part of the figure shows the object representation of a + expression. Figure 4.14 illustrates what happens when expressions are mapped to methods. The classes for the different expressions have disappeared, and instead, methods are added to the appropriate types to represent the expressions. The invocation of a + expression is now represented by a method invocation taking the left operand as a target, and the right operand as its argument.

Figure 4.13: Strictly following the Java specification.
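The mapping of a + expression onto a method invocation can be sketched as follows. The names Expr, Lit, and Invocation are hypothetical and not taken from the framework; the infix flag stands in for the infix operator modifier mentioned earlier, and it is what allows the output side to print the original operator syntax again.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

interface Expr {
    /** Converts the expression back to source text. */
    String show();
}

final class Lit implements Expr {
    private final String text;

    Lit(String text) { this.text = text; }

    public String show() { return text; }
}

/** A single class for regular method calls and for operator expressions. */
final class Invocation implements Expr {
    private final Expr target;
    private final String method;
    private final boolean infix; // stands in for an infix operator modifier
    private final List<Expr> args;

    Invocation(Expr target, String method, boolean infix, List<Expr> args) {
        this.target = target;
        this.method = method;
        this.infix = infix;
        this.args = args;
    }

    /** "left op right" becomes an invocation of op on left with argument right. */
    static Invocation infix(Expr left, String operator, Expr right) {
        return new Invocation(left, operator, true, Arrays.asList(right));
    }

    Expr target() { return target; }

    public String show() {
        if (infix) {
            return target.show() + " " + method + " " + args.get(0).show();
        }
        String joined = args.stream().map(Expr::show).collect(Collectors.joining(", "));
        return target.show() + "." + method + "(" + joined + ")";
    }

    public static void main(String[] args) {
        System.out.println(infix(new Lit("1"), "+", new Lit("2")).show());
    }
}
```

The sketch does not reinsert parentheses for nested operator expressions, which a real output module would have to do to keep the mapping reversible.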

Note that not all expressions can be transformed into methods at this moment. For example, the && and || expressions are not mapped to methods because they have a different semantics: the second argument is not always evaluated. For &&, it is skipped if the first argument evaluates to false; for ||, it is skipped if the first argument evaluates to true. Until the framework supports lazy evaluation, for example by using call-by-name parameters, invocations of these expressions cannot be correctly modeled as method invocations.
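The difference between an ordinary method and a short-circuit operator can be made concrete in Java itself. In the sketch below, which uses the hypothetical class ShortCircuit, the eager variant receives both operands already evaluated, while the lazy variants receive the right operand as a thunk (a BooleanSupplier), which approximates a call-by-name parameter.

```java
import java.util.function.BooleanSupplier;

final class ShortCircuit {
    /** Ordinary method: both arguments are evaluated before the call happens. */
    static boolean eagerAnd(boolean left, boolean right) {
        return left && right;
    }

    /** Call-by-name style: the right operand is only evaluated when needed. */
    static boolean lazyAnd(boolean left, BooleanSupplier right) {
        return left && right.getAsBoolean();
    }

    /** Same idea for ||: the right operand is skipped when left is true. */
    static boolean lazyOr(boolean left, BooleanSupplier right) {
        return left || right.getAsBoolean();
    }

    public static void main(String[] args) {
        String s = null;
        // Safe: the thunk is never run because the left operand is false.
        System.out.println(lazyAnd(s != null, () -> s.isEmpty()));
        // eagerAnd(s != null, s.isEmpty()) would throw a NullPointerException,
        // because s.isEmpty() is evaluated before eagerAnd is even entered.
    }
}
```

Modeling && as an invocation of a method like eagerAnd would therefore change the behavior of programs whose right operand has side effects or can fail.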

4.5.2 Metadata

Modeling only the programming language is not enough to make the framework usable. It must also allow arbitrary metadata to be stored in the model. The construction used for that is the same as in MOF [OMG06]. Every element can have arbitrary Tag objects associated with it. Every tag is stored along with an identifier, which can be used by the tool to retrieve the tag later on. Because the framework does not know what metadata will be attached, the Tag class itself only implements the association with Element. Figure 4.15 shows the class diagram of Tag and Element, along with an example from the code editor. The textual position of an element in the source code is attached to that element as metadata such that the data model of the editor and the Chameleon model can be synchronized. Another use for metadata in the code editor is to mark crossreferences in the model to tell the editor which elements can be turned into hyperlinks.

Tags are important to keep tool-specific functionality out of the metamodels. For example, it may be tempting to include line and column numbers in the elements themselves, but they are not part of the semantics of a program, although in indentation-sensitive languages like Python they are related. By using tags, the information can be attached to the elements without having to modify the elements of the language module.

Figure 4.14: Mapping expressions to methods.

Figure 4.15: Attaching arbitrary metadata using tags.
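A minimal sketch of the tag mechanism is given below. It follows the description above (tags stored under an identifier, with the Tag class only knowing its element), but the class and method names (SketchTag, PositionTag, TaggedElement, setTag, tag) and the identifier "editor.position" are our own assumptions, not the framework interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Base tag: only knows the element it is attached to. */
class SketchTag {
    private TaggedElement element;

    TaggedElement element() { return element; }

    void attachTo(TaggedElement newElement) { element = newElement; }
}

/** Tool-specific metadata: the textual position used by an editor. */
final class PositionTag extends SketchTag {
    private final int line;
    private final int column;

    PositionTag(int line, int column) {
        this.line = line;
        this.column = column;
    }

    int line() { return line; }

    int column() { return column; }
}

/** Element that stores arbitrary tags under tool-chosen identifiers. */
class TaggedElement {
    private final Map<String, SketchTag> tags = new HashMap<>();

    void setTag(String id, SketchTag tag) {
        SketchTag old = tags.put(id, tag);
        if (old != null) {
            old.attachTo(null); // keep both ends of the association consistent
        }
        tag.attachTo(this);
    }

    /** Returns the tag stored under the identifier, or null if there is none. */
    SketchTag tag(String id) { return tags.get(id); }

    public static void main(String[] args) {
        TaggedElement element = new TaggedElement();
        element.setTag("editor.position", new PositionTag(3, 14));
        PositionTag position = (PositionTag) element.tag("editor.position");
        System.out.println(position.line() + ":" + position.column());
    }
}
```

The element class itself contains no editor-specific fields; a tool that does not know the identifier simply never sees the position data.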

4.5.3 Language Semantics

As mentioned before, it is important that the semantics of the language constructs are incorporated in the metamodels. In this section, we describe what the semantics are, and what the design issues are. The implementation of the language rules is only in an initial phase, so we cannot present a final solution.

The semantics of a language are roughly described by four kinds of rules: grammar rules, well-formedness rules, typing rules, and evaluation rules. Grammar rules determine if the structure of a program is valid. For example, they can state that a class can only contain methods, fields, and other classes. Well-formedness rules determine if the non-structural constraints of the language are satisfied. For example, they can state that the types of the formal parameters of a method are invariant in an inheritance relation. Typing rules determine the static semantics of the language, for example what type an expression has, what exceptions can be thrown by a piece of code, what the inheritance relations are among the classes, and so on. Evaluation rules determine the run-time semantics of the executable language constructs. Given a program state and an instruction, they state what effect the execution of that instruction will have on the program state.

Semantic Strength vs. Reuse

An important design decision is how and where the semantics are incorporated in the metamodel. When working with ASTs, most of the rules are verified or enforced by code outside of the AST. Only the grammar rules are reflected in the AST if a heterogeneous AST is used. If the tree is homogeneous, even these rules must be verified and enforced by external code. As such, the semantics are not incorporated in the model, resulting in the problems discussed before in Section 4.1. But if you go to the other extreme, and put all the rules in the classes representing the language constructs, thereby creating a very strong connection between data and semantics, then it takes a lot of work to create a language module. The language module would have to create a special subclass for each element in the language in which all the rules for that element, such as the structural constraints, are fixed. For example, it would have to create a subclass of MethodInvocation that is somehow linked to a rule that specifies exactly what kind of expressions are allowed in that language. After all, according to the specification of the language, those are the only expressions that can be the children of a method invocation.

Figure 4.16 illustrates what happens when the coupling between the data and the semantics becomes stronger. In the traditional homogeneous abstract syntax tree at the top, data and semantics are completely separated. In the metamodels at the bottom, data and semantics are completely fixed, resulting in lots of classes in the language modules. The markings between both figures illustrate the changes in semantic strength of the elements, and the size of the language modules. As we will see in the next paragraphs, the size of the language modules can be reduced by loosening this coupling slightly.

We can significantly decrease the amount of work required to create a language module by using a slightly less tight coupling of the data and the language rules. Figure 4.17 illustrates the principle. Instead of fixing all the rules in the elements, some of the rules are put in external objects. These rules are accessed through a class representing the language, a subclass of Language. In the figure, classes CSharp and Java provide such rules for C# and Java. They can either contain the rules directly, or provide factories for creating objects representing the rules. The lookup mechanism is also an example of the latter.

Because the rules are still attached to the model, elements still have only one meaning within a model. But if an element is moved to another model that uses another programming language, its meaning may change. For example, if a subclass of Exception is moved from a Java model to a C# model, it will no longer be a checked exception because the Language object decides which class is a checked exception. Because this is a very unlikely situation, we think it is more important to keep the language modules small.
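The following sketch illustrates how a rule can live in the language object instead of in the element. The class names (SketchLanguage, JavaLanguage, CSharpLanguage, ExceptionClass) and the method isChecked are hypothetical. The Java rule encoded here is the standard one (every throwable is checked unless it inherits from RuntimeException or Error), and C# has no checked exceptions.

```java
/** Minimal stand-in for a class element with a single superclass. */
final class ExceptionClass {
    private final String name;
    private final ExceptionClass superClass; // null for the root

    ExceptionClass(String name, ExceptionClass superClass) {
        this.name = name;
        this.superClass = superClass;
    }

    /** True if this class or one of its ancestors has the given name. */
    boolean inheritsFrom(String ancestorName) {
        for (ExceptionClass c = this; c != null; c = c.superClass) {
            if (c.name.equals(ancestorName)) {
                return true;
            }
        }
        return false;
    }
}

/** The rule lives in the language object, not in the element itself. */
abstract class SketchLanguage {
    abstract boolean isChecked(ExceptionClass type);
}

final class JavaLanguage extends SketchLanguage {
    boolean isChecked(ExceptionClass type) {
        return type.inheritsFrom("Throwable")
                && !type.inheritsFrom("RuntimeException")
                && !type.inheritsFrom("Error");
    }
}

final class CSharpLanguage extends SketchLanguage {
    boolean isChecked(ExceptionClass type) {
        return false; // C# has no checked exceptions
    }

    public static void main(String[] args) {
        ExceptionClass throwable = new ExceptionClass("Throwable", null);
        ExceptionClass exception = new ExceptionClass("Exception", throwable);
        ExceptionClass io = new ExceptionClass("IOException", exception);
        System.out.println(new JavaLanguage().isChecked(io));
        System.out.println(new CSharpLanguage().isChecked(io));
    }
}
```

The same ExceptionClass object gets a different answer depending on the language object of the model it belongs to, which is exactly the change of meaning described above.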

Enforcement of Rules

The grammar and well-formedness rules can be verified by the metamodel, but not all of them are enforced. The reason for this is the difficulty of building and modifying a model without it ever being invalid. During development, every program goes through many invalid states before it compiles. As such, a metamodel that enforces every rule is of little practical use.

Most of the grammar rules are enforced because the structure of the elements in the model closely resembles the structure of the grammar of the language. These rules are enforced for two reasons. First, it makes no sense to violate them. For example, it makes no sense to write an import declaration in an expression. Second, the deeper an element is in the lexical hierarchy, the more functionality it must offer. For example, the framework offers methods to facilitate the navigation through the model. Elements that can be a descendant of a NamespaceDeclaration offer a method that returns the nearest enclosing NamespaceDeclaration. Another example is the calculation of the type of an expression, which can depend on the type of its child expressions.
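Navigation to the nearest enclosing element of a given kind can be sketched with a generic helper. The classes LexElement, NamespaceDecl, TypeDecl, and MethodDecl, and the method nearestAncestor, are hypothetical stand-ins for the framework classes.

```java
/** Hypothetical base class: every element knows its lexical parent. */
class LexElement {
    private LexElement parent;

    void add(LexElement child) { child.parent = this; }

    LexElement parent() { return parent; }

    /** Walks up the lexical hierarchy until an ancestor of the given kind is found. */
    <T extends LexElement> T nearestAncestor(Class<T> kind) {
        for (LexElement e = parent; e != null; e = e.parent) {
            if (kind.isInstance(e)) {
                return kind.cast(e);
            }
        }
        return null; // no enclosing element of that kind
    }
}

final class NamespaceDecl extends LexElement {
    private final String name;

    NamespaceDecl(String name) { this.name = name; }

    String name() { return name; }
}

final class TypeDecl extends LexElement { }

final class MethodDecl extends LexElement {
    public static void main(String[] args) {
        NamespaceDecl namespace = new NamespaceDecl("org.example");
        TypeDecl type = new TypeDecl();
        MethodDecl method = new MethodDecl();
        namespace.add(type);
        type.add(method);
        System.out.println(method.nearestAncestor(NamespaceDecl.class).name());
    }
}
```

Because the grammar rules are enforced structurally, a method can only occur below a type, and a type below a namespace declaration, so such a query is meaningful for every element that offers it.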

Typing Rules

The most important part of the typing rules is the lookup mechanism. It is an essential and often complex part of a programming language, and must not be duplicated.

Figure 4.16: Placement of the rules.

Figure 4.17: Trading semantical strength for flexibility.

The lookup is performed on demand. If a tool must know which method is referenced by a method invocation, or what the type of a variable is, the lookup mechanism searches for the appropriate element based on the current state of the model. If an error occurs during the lookup, e.g. because there are multiple candidates when there should only be one, a MetamodelException is thrown.

The advantage of on-demand lookup is that no work is required to update crossreferences if the model is modified. Updating the crossreferences after every modification is not only time consuming, but also very complex to do without updating the entire model. A change in the name of a type, or even in an import declaration, can affect almost any crossreference in the model.

To make on-demand lookup possible, crossreferences are not resolved during the creation of the model, but they are reified. For example, when a type name is encountered, a TypeReference object is created, which holds the name of the referenced type. If the name consists of more than one identifier, e.g. java.util.List, a separate reference object is created for java, util, and List, as shown in Figure 4.18. The java and util references are the targets of respectively the util and List references. A target is an element relative to which the crossreference is written. Going in the other direction, util and List are the lexical parents of respectively java and util. The type of the target depends on the type of the parent. For example, the target of a TypeReference is a NamespaceOrTypeReference, while the target of a MethodInvocation is an Expression. Similar crossreferences exist for variables, method invocations, and namespaces.

Figure 4.18: Reification of crossreferences.

The lookup procedure for a crossreference consists of two phases. In the first phase, the lookup mechanism determines the context that must be used to find the element. Context objects are the heart of the lookup mechanism, and provide methods for searching different kinds of elements such as namespaces, types, variables, and methods. They are constructed using factories such that every language can use its own lookup mechanism. In the second phase, that context will search for a matching element.

If the crossreference has no target, e.g. in the expression m(args), the lookup is done using the lexical context of the crossreference. The lexical context starts searching in the named elements declared by the current element, and continues to search in the parent element and further ancestors until it has found a match. Depending on the language in use, the search can jump to the namespaces at certain points.

Lexical context objects are attached to elements that can declare new named elements. Other elements use the lexical context of their parent. Figure 4.19 illustrates the placement of the lexical context objects in the metamodel. Note that some statements, such as local variable declarations, can also declare named elements and thus have a lexical context of their own.

If the crossreference has a target, the target is asked to give its target context. Depending on the type of the target, that request is delegated to another element. For example, if the target is a crossreference, the request is delegated to the referenced element. If the target is an expression, the request is delegated to the Type of that expression. The resulting target context will search for a match in that delegatee element, but will not proceed to the lexical parent of that element if no match is found.

Figure 4.19: Placement of the lexical context objects.

Figure 4.20: Placement of the target context objects.

Figure 4.21: Performing a lookup.

Target contexts are associated with elements that either can be a target, or can be used as a delegatee by a target, as discussed above. Figure 4.20 illustrates the placement of the target context objects.

Figure 4.21 illustrates the lookup process. The lookup request getType() is sent to the TypeReference representing a reference to java.util.List. The "List" object first queries which element is referenced by "util", which in turn first queries which element is referenced by "java". The latter crossreference has no target, so it is looked up in the lexical context, which is obtained through its parent reference. The proper lexical context, named lexCtx, is provided by the enclosing elements. This context is used to find the element referenced by the name "java", which turns out to be a package. Now, the target context of java is used to find util because the crossreference to "util" is relative to "java". In the final step, the target context of util is asked to search for an element with the name "List".
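The two-phase lookup for java.util.List can be sketched as follows. This is an illustration under our own assumptions, not the Chameleon implementation: Decl, Context, TargetCtx, LexicalCtx, Ref, and LookupFailure are hypothetical names (LookupFailure plays the role of MetamodelException), and namespaces and types are not distinguished.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for the exception thrown when a lookup fails. */
final class LookupFailure extends RuntimeException {
    LookupFailure(String message) { super(message); }
}

/** A named element that can declare nested named elements. */
final class Decl {
    final String name;
    private final Map<String, Decl> members = new LinkedHashMap<>();

    Decl(String name) { this.name = name; }

    Decl declare(String memberName) {
        Decl member = new Decl(memberName);
        members.put(memberName, member);
        return member;
    }

    Decl member(String memberName) { return members.get(memberName); }
}

interface Context {
    Decl find(String name);
}

/** Searches the members of one element only; never climbs to its parent. */
final class TargetCtx implements Context {
    private final Decl owner;

    TargetCtx(Decl owner) { this.owner = owner; }

    public Decl find(String name) {
        Decl found = owner.member(name);
        if (found == null) {
            throw new LookupFailure(name + " not found in " + owner.name);
        }
        return found;
    }
}

/** Searches the current scope first, then the enclosing scopes. */
final class LexicalCtx implements Context {
    private final Decl scope;
    private final LexicalCtx parent; // null for the outermost scope

    LexicalCtx(Decl scope, LexicalCtx parent) {
        this.scope = scope;
        this.parent = parent;
    }

    public Decl find(String name) {
        Decl found = scope.member(name);
        if (found != null) {
            return found;
        }
        if (parent == null) {
            throw new LookupFailure(name + " not found");
        }
        return parent.find(name);
    }
}

/** Reified crossreference: a name plus an optional target reference. */
final class Ref {
    final String name;
    final Ref target; // null if the reference is not written relative to another one

    Ref(String name, Ref target) {
        this.name = name;
        this.target = target;
    }

    /** Builds the chain java <- util <- List for "java.util.List". */
    static Ref parse(String qualifiedName) {
        Ref ref = null;
        for (String part : qualifiedName.split("\\.")) {
            ref = new Ref(part, ref);
        }
        return ref;
    }

    /** Phase 1 selects the context, phase 2 searches in it; nothing is cached. */
    Decl resolve(LexicalCtx lexCtx) {
        Context ctx = (target == null) ? lexCtx : new TargetCtx(target.resolve(lexCtx));
        return ctx.find(name);
    }

    public static void main(String[] args) {
        Decl root = new Decl("<root>");
        root.declare("java").declare("util").declare("List");
        LexicalCtx lexCtx = new LexicalCtx(new Decl("unit"), new LexicalCtx(root, null));
        System.out.println(parse("java.util.List").resolve(lexCtx).name);
    }
}
```

Since resolve recomputes the result on every call, a rename or a newly added declaration in the model is picked up automatically by the next lookup.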

Dynamic Loading

The current lookup mechanism can only find elements that are actually present in the model. This means that it is often necessary to load all classes in the standard library of a language to ensure that a tool will work. For example, any class in the standard library can be used in a code editor to implement another class. If the underlying model does not contain the class, some functionality of the tool will no longer work because the lookup mechanism may fail.

Loading all these classes, however, takes a long time, and requires a huge amount of memory. For example, there are about 2200 classes in the Java standard library, and loading them all on a reasonably priced current computer takes about 30 to 60 seconds. The memory consumption is also far too high. The test classes of the Java language module load the code for the standard library, a software project, and the other dependencies of that project. These tests do not run unless the heap size of the virtual machine is set to 512MB, which is far too high considering the small number of dependencies on the external libraries. Adding such memory requirements on top of those of a modern IDE is asking for trouble if interactivity is required.

Therefore, we must create a lookup mechanism that can load files on demand. Just like a real virtual machine needs a classpath, the metamodel needs a sourcepath. This path must tell the lookup mechanism which elements are available, and which files they are in.
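Since this mechanism is future work, the following is only a sketch of one possible shape, with every name being hypothetical (LazyModel, lookup, loadCount). The sourcepath is reduced to a map from qualified names to file names, and parsing is represented by a function from a file name to a placeholder string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical model that loads an element the first time it is looked up. */
final class LazyModel {
    private final Map<String, String> sourcePath;      // qualified name -> file
    private final Function<String, String> parser;     // file -> loaded element (placeholder)
    private final Map<String, String> loaded = new HashMap<>();
    private int loadCount = 0;

    LazyModel(Map<String, String> sourcePath, Function<String, String> parser) {
        this.sourcePath = sourcePath;
        this.parser = parser;
    }

    /** Returns the element, parsing its file only on the first request. */
    String lookup(String qualifiedName) {
        String element = loaded.get(qualifiedName);
        if (element != null) {
            return element;
        }
        String file = sourcePath.get(qualifiedName);
        if (file == null) {
            throw new IllegalArgumentException("Not on the sourcepath: " + qualifiedName);
        }
        element = parser.apply(file);
        loadCount++;
        loaded.put(qualifiedName, element);
        return element;
    }

    /** Number of files that were actually parsed so far. */
    int loadCount() { return loadCount; }

    public static void main(String[] args) {
        Map<String, String> path = new HashMap<>();
        path.put("java.util.List", "java/util/List.java");
        path.put("java.util.Map", "java/util/Map.java");
        LazyModel model = new LazyModel(path, file -> "model of " + file);
        model.lookup("java.util.List");
        model.lookup("java.util.List");
        System.out.println(model.loadCount()); // only one file was parsed
    }
}
```

With such a design, the cost of the standard library is paid per referenced class instead of up front.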

Run-time Semantics

For defining the run-time semantics, there are several options. The three approaches [Pie02] for writing evaluation rules are operational semantics, denotational semantics, and axiomatic semantics. While denotational semantics and axiomatic semantics are more elegant from a mathematical point of view, and better suited for deriving properties of a language, operational semantics is a better approach for modern object-oriented programming languages because of its simplicity and flexibility [Pie02]. Therefore it is the best candidate for defining the run-time semantics of an executable element in Chameleon.

With operational semantics, a language is formalized by defining a machine that directly evaluates the terms of the language. During the evaluation, the program is rewritten and the state, which represents the memory, is updated. There are two styles for performing the evaluation itself: small-step style, also called structural operational semantics, and big-step style, also called natural semantics.


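\[
\frac{e_0 \rightarrow e_0'}
     {e_0{:}T.m(\overline{e}) \rightarrow e_0'{:}T.m(\overline{e})}
\quad \text{(target)}
\]

\[
\frac{e_i \rightarrow e_i'}
     {v_0{:}T.m(v_1,\ldots,v_{i-1},e_i,\ldots) \rightarrow v_0{:}T.m(v_1,\ldots,v_{i-1},e_i',\ldots)}
\quad \text{(args)}
\]

\[
\frac{\mathit{mbody}(m, T, C) = \overline{x}.e_0}
     {v{:}T.m(\overline{v}) \rightarrow [\overline{v}/\overline{x},\, v/\mathit{this}]\,e_0}
\quad \text{(comp)}
\]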

Figure 4.22: Small-step evaluation.

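\[
\frac{e_0 \rightarrow^{*} v_0 \qquad \overline{e} \rightarrow^{*} \overline{v} \qquad \mathit{mbody}(m, T, C) = \overline{x}.b_0}
     {e_0{:}T.m(\overline{e}) \rightarrow [\overline{v}/\overline{x},\, v_0/\mathit{this}]\,b_0}
\]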

Figure 4.23: Big-step evaluation.

In the small-step style, all the basic steps involved in the evaluation of an expression are defined separately. This allows the behavior of a language to be defined very precisely because the evaluation order is taken into account, and errors can be dealt with accurately. Take for example the evaluation rules for method invocations from Figure 4.22. Rule target evaluates the target until it can no further be evaluated. Rules args and comp do not match until the target is a value. Once the target has been computed, the arguments are evaluated from left to right by rule args. Again, rule comp does not match until all arguments are computed. Finally, the actual method invocation is performed by rule comp. If exceptions are also modeled, it is easy to specify that further subexpressions are not evaluated. Note that these rules are different from the evaluation rules in Chapter 3, which are based on Featherweight Java. The latter do not have to specify the evaluation order because the language has no side-effects.

In the big-step style, which is illustrated in Figure 4.23, each executable element is evaluated in a single step. The machine assumes that the children have been evaluated, and describes the result of evaluating the current element in terms of those results. The rules are more abstract than small-step rules, but they are also less precise. This style makes it problematic to define the order of execution and to deal with execution errors.
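The contrast between both styles can be sketched on a much smaller language than the one in the figures. The example below is our own illustration with hypothetical classes (Term, Num, Add, Evaluator): it only has numbers and addition, with no methods, state, or exceptions. The step method plays the role of the small-step rules, reducing the left operand before the right one, while eval plays the role of a big-step rule.

```java
/** Terms of a tiny expression language: numbers and additions. */
abstract class Term {
    abstract boolean isValue();
}

final class Num extends Term {
    final int value;

    Num(int value) { this.value = value; }

    boolean isValue() { return true; }
}

final class Add extends Term {
    final Term left;
    final Term right;

    Add(Term left, Term right) {
        this.left = left;
        this.right = right;
    }

    boolean isValue() { return false; }
}

final class Evaluator {
    /** Small-step: performs exactly one reduction, left operand first. */
    static Term step(Term term) {
        if (!(term instanceof Add)) {
            throw new IllegalStateException("a value cannot take a step");
        }
        Add add = (Add) term;
        if (!add.left.isValue()) {
            return new Add(step(add.left), add.right);  // like rule "target"
        }
        if (!add.right.isValue()) {
            return new Add(add.left, step(add.right));  // like rule "args"
        }
        return new Num(((Num) add.left).value + ((Num) add.right).value); // like "comp"
    }

    /** Big-step: the result is given directly in terms of the children's results. */
    static int eval(Term term) {
        if (term instanceof Num) {
            return ((Num) term).value;
        }
        Add add = (Add) term;
        return eval(add.left) + eval(add.right);
    }

    /** Counts how many small steps are needed to reach a value. */
    static int stepsToValue(Term term) {
        int steps = 0;
        while (!term.isValue()) {
            term = step(term);
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        Term term = new Add(new Add(new Num(1), new Num(2)), new Add(new Num(3), new Num(4)));
        System.out.println(eval(term) + " in " + stepsToValue(term) + " steps");
    }
}
```

In the small-step version every intermediate term is observable, so the evaluation order is part of the definition. In the big-step version the order in which the two operands are evaluated is merely inherited from the host language.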

To express the run-time semantics, an abstract machine is required. This machine must provide the required run-time information such that the effect of executing a certain instruction can be specified. Because the framework must promote the reuse of language constructs across languages, and because of implementation requirements, it is necessary to choose a single style for the evaluation rules, and a single configuration for the abstract machine. After all, mixing different styles and abstract machines is very difficult, if not impossible. The machine must provide at least a stack and a heap, but the exact configuration of the machine remains future work since it must be suitable for a much larger range of languages than is currently supported.

Interpreter

By defining the run-time semantics, the language developer also implements an interpreter. By reusing as many existing language constructs as possible, this allows the developer to easily experiment with new language constructs and new language rules. This approach can be much more convenient than writing a preprocessor that translates programs in the new language to an existing language. For example, creating a preprocessor for the component relation from Chapter 3 that transforms programs to Java turned out to be too complex. The inheritance mechanism of the target language is so primitive that the translated program contains an enormous amount of low-level code to simulate the behavior of the component relation. Directly implementing the lookup mechanism from the type system, which is presented in Appendix B, is much easier.

Current Support

Currently, only the grammar rules and the typing rules are supported by the framework. The grammar rules are supported through constraints on the relationships among the elements. For example, a method invocation can only have expressions as actual arguments. As mentioned in Section 4.5.3, not all grammatical constraints are part of the language constructs themselves. There is no special JavaMethodInvocation that can only have Java expressions as actual arguments. The Language object will determine which constructs are part of the language, and which are not. The typing rules are supported through regular methods such as getType for expressions, and by the lookup mechanism.

4.6 Related Work

Because of the relatively early stage of the project, the study of related work is not as complete as in the other chapters.

4.6.1 Extensible Compilers

Polyglot [NCM03] is an extensible compiler framework for Java. It uses PPG and the CUP LALR [HFA06] parser generator to provide support for extensible grammars. The generated parser creates an abstract syntax tree using nodes that are specifically created to model the Java programming language. Node traversal is done using the Visitor [GHJV95] pattern.

To allow extension of a compiler, delegate objects are used. Every node in the abstract syntax tree has a field del that is used to redirect method calls if its value is not null. By providing a new node factory, a language extension can introduce custom extensions for the node types affected by that extension.
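The delegate technique can be sketched as follows. This is an illustration of the pattern only, not Polyglot's actual API: apart from the field name del, all names (NodeOps, Node, NodeFactory, check, typeCheck) are hypothetical.

```java
final class DelegateSketch {

    /** Operations that a language extension may want to override. */
    interface NodeOps {
        String typeCheck(Node self);
    }

    /** AST node with the del field described in the text. */
    static class Node implements NodeOps {
        final String label;
        NodeOps del; // null means: use the node's own behavior

        Node(String label) { this.label = label; }

        /** Hand-written dispatch: redirect to the delegate when one is set. */
        final String check() {
            return (del != null ? del : this).typeCheck(this);
        }

        @Override
        public String typeCheck(Node self) {
            return "base check of " + self.label;
        }
    }

    /** Hypothetical node factory; an extension supplies its own to install delegates. */
    interface NodeFactory {
        Node create(String label);
    }

    /** Factory of the base compiler: nodes without a delegate. */
    static final NodeFactory BASE = Node::new;

    /** Factory of a language extension: every node gets a custom delegate. */
    static final NodeFactory EXTENDED = label -> {
        Node node = new Node(label);
        node.del = self -> "extended check of " + self.label;
        return node;
    };

    public static void main(String[] args) {
        System.out.println(BASE.create("x").check());
        System.out.println(EXTENDED.create("x").check());
    }
}
```

The method check shows the point made in the text: the choice between the original and the extended behavior is programmed by hand instead of being left to the method dispatch of the language.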

To allow extension of the interface of a node, extension objects are used in a way similar to the delegate objects. When the extension needs to access the additional functionality, it queries the extension object of the node, and performs a cast.
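A comparable sketch for extension objects is given below. Again, the names Ext, CostExt, Node, nodeWithCost, and costOf are hypothetical and only illustrate the query-and-cast step; they are not taken from Polyglot.

```java
final class ExtensionObjectSketch {

    /** Marker for the additional interface an extension attaches to a node. */
    interface Ext {}

    /** Hypothetical extra functionality introduced by one language extension. */
    interface CostExt extends Ext {
        int cost();
    }

    /** AST node that carries one extension object. */
    static final class Node {
        private final Ext ext;
        Node(Ext ext) { this.ext = ext; }
        Ext ext() { return ext; }
    }

    /** Creates a node whose extension object reports the given cost. */
    static Node nodeWithCost(int cost) {
        CostExt ext = () -> cost;
        return new Node(ext);
    }

    /** Code of the extension: query the extension object, then cast. */
    static int costOf(Node node) {
        return ((CostExt) node.ext()).cost();
    }

    public static void main(String[] args) {
        System.out.println(costOf(nodeWithCost(3)));
    }
}
```

The cast in costOf is not checked by the compiler; attaching a different kind of extension object to the node would only fail at run-time.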

Using this technique, Polyglot partially solves the problem of duplication of the language semantics, but it has a number of drawbacks. By using the delegation patterns described above, the programmer basically writes his own method dispatch. This is caused by a limitation of the Java programming language, but it is a problem nonetheless. A more important drawback of Polyglot is that it still limits programming languages to extensions of Java because there is no layer of abstraction above the Java layer. In addition, it limits its functionality to the construction of compilers.

JastAdd [HM03] uses aspect-oriented programming to weave together different parts of the compiler, such as type checking, name analysis, and code generation. The compiler code is woven into the AST classes.

JaCo [ZO01a] is another extensible compiler. It uses extensible algebraic data types with defaults [ZO01b] to extend the compiler.

4.6.2 Meta Object Facility

The Meta Object Facility (MOF) [OMG06] is a standard for managing metadata, and is mainly used for managing metamodels of modeling languages and models that are instances of these metamodels.

MOF has a four-layer architecture, which is illustrated in Figure 4.24. The bottom layer is called M0, and is the place where the actual objects of the software system live. These can be bank accounts in a banking application, or the actual tables in a database. The elements of layer M0 are instances of elements of layer M1. The elements of layer M1 form the model of the software system. These models can be a UML class model for the banking application, or an ER model for the database. The elements of layer M1 are instances of elements of layer M2. The elements of layer M2 form the metamodel or the modeling language. For example, the Unified Modeling Language is described by a model at level M2. That model describes that a UML diagram can contain classes, constraints, associations, ... The elements of layer M2 are instances of elements of layer M3, which contains a metametamodel. The metametamodel describes a basic modeling language which is used to create other modeling languages. An interesting property of the model at level M3 is that it is self-describing. The model is an instance of itself. Without this property we would need an infinite number of layers. Note that the division into different levels is arbitrary, and is only meant to clearly separate the different parts in the architecture.

Figure 4.24: The four-layered architecture of MOF.
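The instance-of chain between the four layers, and the way the self-describing top layer ends it, can be made concrete with a small sketch. The class Element and the example names are assumptions made for this illustration; they are not part of the MOF standard or of any MOF implementation.

```java
final class MofLayersSketch {

    /** A model element together with the element it is an instance of. */
    static final class Element {
        final String name;
        private Element meta;

        Element(String name, Element meta) {
            this.name = name;
            this.meta = meta;
        }

        /** Creates an element that is an instance of itself (the M3 property). */
        static Element selfDescribing(String name) {
            Element element = new Element(name, null);
            element.meta = element;
            return element;
        }

        Element meta() { return meta; }

        boolean isSelfDescribing() { return meta == this; }
    }

    /** Number of instance-of steps needed to reach the self-describing top. */
    static int stepsToTop(Element element) {
        int steps = 0;
        while (!element.isSelfDescribing()) {
            element = element.meta();
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        Element m3 = Element.selfDescribing("Class (metametamodel)"); // M3
        Element m2 = new Element("UML Class", m3);                    // M2
        Element m1 = new Element("BankAccount", m2);                  // M1
        Element m0 = new Element("the account of a customer", m1);    // M0
        System.out.println(stepsToTop(m0));
    }
}
```

Because the top element refers to itself, the loop in stepsToTop terminates; without a self-describing layer, every layer would need yet another layer above it.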

MOF is not just for creating models of object-oriented modeling languages, but can be used to create any kind of modeling language. The language in which these models are described, however, is object-oriented because the metametamodel at level M3 describes an object-oriented core language.

The models at level M2 are used to generate code for managing repositories of models (level M1). The API of the generated code supports so-called CRUD operations: create, read, update, and delete. Depending on the generator, the implementation can be accessed via Java, CORBA, ...

The implementations can also enforce a part of the semantics of the models. For example, if a class owns its attributes, the implementation will remove the attributes when the class is removed from the model. The implementation can also enforce part of the OCL invariants of the model, depending on the generator.
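To give an impression of what such a repository offers, the following sketch shows a hand-written stand-in with CRUD operations and the ownership semantics mentioned above. It is not generated by any MOF tool; the class RepositorySketch and all of its methods are hypothetical, and classes and attributes are reduced to names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RepositorySketch {

    private final List<String> classes = new ArrayList<>();
    // attribute name -> name of the class that owns it
    private final Map<String, String> ownerOfAttribute = new LinkedHashMap<>();

    /** Create: adds a class to the model. */
    void createClass(String name) {
        if (!classes.contains(name)) {
            classes.add(name);
        }
    }

    /** Create: adds an attribute that is owned by an existing class. */
    void createAttribute(String owner, String attribute) {
        if (!classes.contains(owner)) {
            throw new IllegalArgumentException("unknown class: " + owner);
        }
        ownerOfAttribute.put(attribute, owner);
    }

    /** Read: lists the attributes owned by a class. */
    List<String> attributesOf(String owner) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : ownerOfAttribute.entrySet()) {
            if (entry.getValue().equals(owner)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** Update: renames a class and keeps the ownership links consistent. */
    void renameClass(String oldName, String newName) {
        int index = classes.indexOf(oldName);
        if (index < 0) {
            throw new IllegalArgumentException("unknown class: " + oldName);
        }
        classes.set(index, newName);
        ownerOfAttribute.replaceAll((attribute, owner) -> owner.equals(oldName) ? newName : owner);
    }

    /** Delete: removing a class also removes the attributes it owns. */
    void deleteClass(String name) {
        classes.remove(name);
        ownerOfAttribute.values().removeIf(owner -> owner.equals(name));
    }

    int attributeCount() {
        return ownerOfAttribute.size();
    }
}
```

The cascade in deleteClass is the part of the model semantics that the implementation enforces; rules that go beyond such structural properties are not covered by this sketch.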

A problem with MOF is the lack of expressivity of the metametamodel at level M3. Its inheritance mechanism does not allow definitions to be overridden, which makes it impossible to create a framework at the level of M2. As a result, when using MOF as it is intended, a tool must choose between usability and reusability. If the tool works at level M3, it can provide general CRUD operations, but then it is not nearly as usable as a dedicated tool. If the tool works at the level of a specific M2 model, it cannot be reused for another language.

Another problem is that the generation of the implementations based on OCL is incomplete and not standardized. This means that the implementation does not necessarily contain all the semantics of the modeling language. The missing parts again have to be duplicated by the programming tool.

4.6.3 Extensible IDEs

There are numerous IDEs that can be extended with support for additional programming languages, such as Eclipse, Komodo, Visual Studio, IntelliJ, and NetBeans.

Eclipse offers a platform for creating IDEs. By default it includes a full environment for Java, which is provided by the Java Development Tools (JDT). JDT provides editors, wizards, outline views, refactoring tools, ... Adding support for languages that are based on Java can be done by reusing JDT, as done by the authors of Scala, but even that is difficult due to the lack of documentation [MO]. Adding support for a completely different language is even more difficult because all the functionality of JDT must be reimplemented, as it is tied to the Java language.

The Komodo IDE [Act07] also offers support for multiple languages. Additional languages can be added with User-Defined Language (UDL) files, which themselves are written in the Luddite language. A UDL file can only describe the syntax of the new language, so only basic textual functionality like syntax highlighting and code folding can be supported. More advanced functionality such as refactoring, code browsing through hyperlinks, ... cannot be provided based on just a description of the syntax.

4.6.4 Mirrors

Mirror-based systems [BU04] provide a structured way of using reflection. They offer interfaces that represent elements of the programming language such as types, methods, and variables. These can then be used in a program to invoke methods, construct objects, ... Mirrors differ from Chameleon in that they are used for metaprogramming in the current run-time environment using an interface that is specific to the current programming language. Chameleon cannot be used for this purpose, as its models are independent of the run-time and programming language. Mirrors, on the other hand, cannot be used to create language independent programming tools.
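The following sketch shows, loosely in the spirit of mirrors, an interface through which a program inspects and invokes an object of the running system. The interface ObjectMirror and the factory method reflect are hypothetical; the standard java.lang.reflect package is used only as a back end and is not itself presented as a mirror-based API.

```java
import java.lang.reflect.Method;

final class MirrorSketch {

    /** Hypothetical mirror: the meta-level view of one object. */
    interface ObjectMirror {
        String typeName();
        Object invoke(String methodName);
    }

    /** Hypothetical factory; here backed by java.lang.reflect. */
    static ObjectMirror reflect(Object target) {
        return new ObjectMirror() {
            @Override
            public String typeName() {
                return target.getClass().getName();
            }

            @Override
            public Object invoke(String methodName) {
                try {
                    // Looks up a public method without parameters and calls it.
                    Method method = target.getClass().getMethod(methodName);
                    return method.invoke(target);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("cannot invoke " + methodName, e);
                }
            }
        };
    }

    public static void main(String[] args) {
        ObjectMirror mirror = reflect("mirror");
        System.out.println(mirror.typeName() + " " + mirror.invoke("length"));
    }
}
```

The mirror operates on a live object of one particular language and run-time, which is exactly the situation that a language independent model such as Chameleon's does not address.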

4.7 Conclusion

The traditional approach for writing programming tools is far from sufficient. The separation of the data model of a program and the semantics of the language constructs leads to duplication of the language semantics, and ties tools to a limited set of supported languages – usually one. We have shown that the MOF technology cannot fully deal with these problems. The language semantics cannot be incorporated completely in the data model because of limited code generation technology, and the modeling language at level M3 is not powerful enough to create a practical framework. Extensible compilers allow compilers to be written by reusing existing compilers, but they do not offer the abstractions a programming tool needs to be language independent.

Chameleon is a framework for metamodels of programming languages. It provides general abstractions for language constructs which are instantiated by actual language modules. By using the abstractions of the framework, which include the language semantics, many programming tools can perform their task independent of the used programming language. Chameleon offers different layers of abstraction for different families of programming languages to allow advanced tools to be written for a particular family without being language dependent. If the provided abstractions do not suffice, a tool extension can be defined. This extension must then be implemented for each language that must be supported by the tool.

To make the approach work, it is important that developers of programming tools, programming languages, and the Chameleon framework work together. If new abstractions are required, they must be provided by the appropriate module to maintain the high reusability of both programming tools and programming languages.

While the framework is not yet complete, which it may never be due to continuous advances in programming languages, we have shown that the approach is viable. We have implemented concrete language modules for Java 1.2, C# 1.0, and Cappuccino – Java with anchored exception declarations – together with a partial implementation for Ruby. In addition, we have implemented a number of tools using Chameleon. The most advanced tool is a language independent code editor for the Eclipse IDE. It dynamically loads the installed language modules and offers advanced functionality like syntax highlighting, code folding, outline views, syntax error reporting, hyperlinks, ... for programs written in one of the installed languages.

Part IV

Conclusion


Chapter 5

Conclusions and Future Work

I’ll never make that mistake again, reading the experts’ opinions.

Richard Feynman

In this chapter, we list the contributions of this thesis, present future research topics, and finally present the conclusion.

5.1 Contributions

Exception Handling

In the first part of this thesis, we have shown that problems with checked exceptions, such as reduced adaptability and loss of context information, are caused by the lack of expressivity of the exceptional return type of a method.

We solved the problem by introducing anchored exception declarations, which allow the exceptional behavior of a method to be declared relative to other methods. They bring the exception clause on par with the implementation and the normal specification of a method in terms of expressivity, resulting in better adaptability of software, more elegant code, and elimination of most of the dangerous exception handlers. As a result, they contribute to the development of reliable software by reducing the overhead of checked exceptions.

We have defined the formal semantics of anchored exception declarations, and the rules they must adhere to in order to ensure compile-time safety, and have proved that the approach is compile-time safe.

On the methodological side, we have shown that anchored exception declarations do not violate the principle of information hiding when used properly, and have presented a guideline for when to use them, and when not to use them. In addition, we defined criteria to determine which modifications caused by the evolution of the exceptional behavior of a method are good and which modifications are gratuitous.

Finally, we have implemented anchored exception declarations in Cappuccino, an extension of ClassicJava. A translator validates Cappuccino programs and transforms them into Java programs.

Composition of Abstract Data Types

In the second part of this thesis, we have shown that current object-oriented programming languages do not offer the abstraction level required to use general purpose classes as building blocks for other classes in a practical manner, that is, to use them as abstract data type components. This prevents a developer from reusing high-level concepts like associations, bounded values, graphs, ...

We have created a list of requirements to encapsulate and reuse such concepts, categorized the requirements, and shown how current reuse mechanisms support them. This study reveals that existing mechanisms lack important features that have a big impact on the expressivity, and that most mechanisms do not even meet the minimal requirements.

The inheritance mechanism we developed is the first to make the use of abstract data type components practical. Renaming parameters and first-class component relations eliminate the problems encountered with existing mechanisms. They allow a programmer to easily exploit name patterns, connect components, provide both a simple class interface and lots of functionality, and use components as if they were separate objects. Together, these improvements raise the abstraction level of the programming language, since it is no longer required to create a new language construct or write lots of low-level code to reuse a high-level characteristic. Of course, the component relation can also be used for traditional code inheritance as used in traits, Eiffel, SmartEiffel, Cecil, ...

A case study confirms that our inheritance mechanism yields much better results (21% to 36% reduction) than other inheritance mechanisms (3% to 12% reduction), and delegation (11% to 17% reduction). It also shows that our inheritance mechanism is more robust with respect to extensions of components. In addition, it is still beneficial to reuse small components, or small parts of big components with our inheritance mechanism, contrary to the other techniques.

Programming Language Development

In the third part of this thesis, we have seen that the traditional approach for writing programming tools is far from sufficient. The separation of the data model of a program and the semantics of the language constructs leads to duplication of the language semantics, and ties tools to a limited set of supported languages – usually one. We have shown that the MOF technology cannot fully deal with these problems. The language semantics cannot be incorporated completely in the data model because of limited code generation technology, and the modeling language at level M3 is not powerful enough to create a practical framework. Extensible compilers allow compilers to be written by reusing existing compilers, but they do not offer the abstractions a programming tool needs to be language independent.

Based on our experience with the development of programming languages and programming tools, we built Chameleon, a framework for metamodels of programming languages. It provides abstractions for language constructs which are instantiated by actual language modules. By using the abstractions of the framework, which also offer access to the language semantics, many programming tools can perform their task independent of the particular programming language in use. Chameleon offers different layers of abstraction for different families of programming languages to allow advanced tools to be written for a particular family without being language dependent. If the provided abstractions do not suffice, a tool extension can be defined. This extension must then be implemented for each language that must be supported by the tool.

To make the approach work, it is important that developers of programming tools, programming languages, and the Chameleon framework work together. If new abstractions are required, they must be provided by the appropriate module to maintain the high reusability of both programming tools and programming languages.

While the framework is not yet complete, which it may never be due to continuous advances in programming languages, we have shown that the approach is viable. Concrete language modules have been implemented for Java 1.2, C# 1.0, and Cappuccino – Java with anchored exception declarations – together with a partial implementation for Ruby. In addition, a number of tools have been implemented using Chameleon. The most advanced tool is a language independent code editor for the Eclipse IDE. It dynamically loads the installed language modules and offers advanced functionality like syntax highlighting, code folding, outline views, syntax error reporting, hyperlinks, ... for programs written in one of the installed languages.

5.2 Future Work

Exception Handling

While anchored exception declarations solve most of the problems of using checked exceptions, there is still room for improvement.

At this moment, an anchored exception declaration can limit the set of exceptions that are propagated, but it cannot express the transformation of one type of exceptions into another, which can be necessary when crossing the boundaries of a component [RM00]. A construct to express this would allow for a more fine-grained specification of the exceptional behavior of a method, and could look like this:

NewException like MethodExpression signals (OldException)

In addition, the algorithm for calculating the implementation exception clause can be improved. For example, at this moment it does not retain information about the origin of a checked exception if it is caught, and then raised again. It will treat the raised exception as if it can be signalled at any time and discard possible anchor relations. The exception flow analyses discussed in Section 2.9 are more precise.
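For reference, the transformation of one exception type into another at a component boundary is written by hand in plain Java as sketched below. The names StorageException, Source, and load are hypothetical. The sketch also shows the catch-and-raise situation mentioned above: from the throws clause of load alone, it can no longer be seen that StorageException arises only when the invoked method signals IOException.

```java
import java.io.IOException;

final class TranslationSketch {

    /** Hypothetical exception type exposed at the boundary of a component. */
    static class StorageException extends Exception {
        StorageException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical low-level operation that signals a checked exception. */
    interface Source {
        String read() throws IOException;
    }

    /**
     * Boundary method: the low-level exception is caught and raised again
     * as a different type. The throws clause mentions only the new type.
     */
    static String load(Source source) throws StorageException {
        try {
            return source.read();
        } catch (IOException e) {
            throw new StorageException("could not load data", e);
        }
    }

    public static void main(String[] args) {
        try {
            load(() -> { throw new IOException("disk failure"); });
        } catch (StorageException e) {
            System.out.println(e.getMessage() + ", caused by " + e.getCause().getMessage());
        }
    }
}
```

A caller that passes a Source which cannot fail must still handle StorageException; a declaration of the kind proposed above would be meant to express the missing relation between the two exception types.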

The expressiveness of anchored exception declarations is limited in the sense that they take only static type information into account. Information about the exact conditions under which certain exceptions can be signalled still has to be provided by specifications. Anchored exception declarations are complementary to traditional specifications, and can be added to existing specification languages, such as JML and Spec# [BLS04], in order to enrich their expressiveness regarding exception handling.

Composition of Abstract Data Types

The most important future work is the construction of a compiler for the proposed inheritance mechanism. This will allow us to experiment with building libraries of reusable high-level characteristics. These include, but are not limited to, a hierarchy of association classes allowing choices like multiplicity, value or reference semantics, mutability, constraints. With these associations, graphs can be built to make any iteration over the object structure a trivial task by incorporating Joost Visser’s work on visitor combination and traversal control [Vis01].

The error handling strategy of a class is fixed at this moment. For example, class BoundedValue must choose how to deal with invalid input: use preconditions, throw exceptions, or provide a default behavior. That means that to provide all choices to an application developer, we need three versions of the same characteristic. It would be more interesting to have a single version that provides a number of strategies for dealing with errors, and allows the application developer to choose one.
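In plain Java, a single version with a selectable strategy can be approximated with a strategy object, as in the following sketch. It only illustrates the goal and is not the mechanism envisioned for the component relation; the names ErrorStrategy, THROW, and NEAREST_BOUND are hypothetical, and the precondition variant is left out because a precondition places the obligation on the caller rather than on the class.

```java
final class BoundedValueSketch {

    /** Hypothetical strategy: decides what an out-of-range input becomes. */
    interface ErrorStrategy {
        int handleInvalid(int value, int min, int max);
    }

    /** Signals the error to the caller with an exception. */
    static final ErrorStrategy THROW = (value, min, max) -> {
        throw new IllegalArgumentException(value + " is not in [" + min + ", " + max + "]");
    };

    /** Default behavior: falls back to the nearest bound. */
    static final ErrorStrategy NEAREST_BOUND =
            (value, min, max) -> Math.max(min, Math.min(max, value));

    /** One version of the characteristic; the strategy is chosen by the application. */
    static final class BoundedValue {
        private final int min;
        private final int max;
        private final ErrorStrategy strategy;
        private int value;

        BoundedValue(int min, int max, ErrorStrategy strategy) {
            if (min > max) {
                throw new IllegalArgumentException("min must not exceed max");
            }
            this.min = min;
            this.max = max;
            this.strategy = strategy;
            this.value = min;
        }

        void set(int newValue) {
            boolean valid = min <= newValue && newValue <= max;
            value = valid ? newValue : strategy.handleInvalid(newValue, min, max);
        }

        int get() {
            return value;
        }
    }

    public static void main(String[] args) {
        BoundedValue percentage = new BoundedValue(0, 100, NEAREST_BOUND);
        percentage.set(150);
        System.out.println(percentage.get());
    }
}
```

The application developer selects the strategy when creating the value, so the class itself does not have to be duplicated for each way of dealing with invalid input.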

A very interesting track is the integration of complementary inheritance techniques. The abstract super classes of mixins [BC90] allow a developer to create reusable refinements – classes that wrap their super class to customize its behavior. This is not possible with our inheritance mechanism. Dynamic delegation as provided in Darwin [Kni99b] also offers interesting complementary functionality. Together with our inheritance mechanism, it can be used to provide better support for e.g. the state, adapter, and decorator patterns.


Programming Language Development

There is a lot of future work to be done for the Chameleon framework. The two most important features are support for different kinds of parametric polymorphism, and support for incorporating all language rules in the metamodel.

Parametric polymorphism – also known as generics in Java – has become a mandatory feature for statically typed object-oriented programming languages. The construct differs from the existing constructs in Chameleon in that it uses the concept of derived elements. All existing elements in the framework and the language modules represent lexical elements: elements that are physically part of the program. An instance of a template is not physically part of the program, but the template itself is part of it. For example, there is no class List<Element> defined in any Java program; only the template List<T> is defined, where T is a generic parameter. The most important issue is what information is kept in the instances of a template, and how they are synchronized with that template. For example, if a method m is added to List<T>, it is also available for List<Element>, but is it the same object, a clone, or a proxy? In the current design, proxies are created for the elements contained in a template. The proxies keep a reference to the mapping of the generic parameters, and delegate the other tasks to the original elements to prevent duplication of information.
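The proxy design described above can be sketched as follows. The interface MethodElement and the classes DeclaredMethod and MethodProxy are hypothetical names used for this illustration and do not reflect the actual Chameleon classes; types are reduced to names.

```java
import java.util.Map;

final class TemplateProxySketch {

    /** Hypothetical view of a method in the metamodel. */
    interface MethodElement {
        String name();
        String returnType();
    }

    /** Lexical element: physically written inside the template List<T>. */
    static final class DeclaredMethod implements MethodElement {
        private final String name;
        private final String returnType;

        DeclaredMethod(String name, String returnType) {
            this.name = name;
            this.returnType = returnType;
        }

        @Override public String name() { return name; }
        @Override public String returnType() { return returnType; }
    }

    /** Derived element: stands for a method of one instance, such as List<Element>. */
    static final class MethodProxy implements MethodElement {
        private final MethodElement original;
        // generic parameter -> actual type argument
        private final Map<String, String> typeArguments;

        MethodProxy(MethodElement original, Map<String, String> typeArguments) {
            this.original = original;
            this.typeArguments = typeArguments;
        }

        /** Delegated to the original element, so nothing is duplicated. */
        @Override public String name() { return original.name(); }

        /** Applies the mapping of the generic parameters to the declared type. */
        @Override public String returnType() {
            String declared = original.returnType();
            return typeArguments.getOrDefault(declared, declared);
        }
    }

    public static void main(String[] args) {
        MethodElement get = new DeclaredMethod("get", "T");
        MethodElement getOfElement = new MethodProxy(get, Map.of("T", "Element"));
        System.out.println(getOfElement.name() + " : " + getOfElement.returnType());
    }
}
```

Since the proxy stores only the mapping and a reference to the original, a change to the method in the template is visible through every proxy without any synchronization.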

As already mentioned, the semantics of the language elements must be incorporated in the elements of the metamodels. Currently, only the grammar rules and the typing rules are supported by the framework. The well-formedness rules will be provided both by the elements themselves and a class representing the programming language. The most challenging rules are the evaluation rules. They define the run-time behavior of executable language constructs. The biggest challenge here is creating an abstract machine that offers enough functionality to define the rules for a broad range of programming languages.

Further generalization of the framework is also an interesting track. Currently, the framework is focused on object-oriented programming languages, but supporting other kinds of languages such as functional and logic programming languages must be possible. This will probably result in a separation of the current framework into several layers. The top layer will contain only those elements that are available in every language, such as Elements and CompilationUnits. Below that, there will be layers for different families of programming languages, and layers capturing similarities between those families. General tools, such as the code editor, will be able to work with the top layer. More specialized tools, such as a class diagram editor, will focus on a specific family of languages, such as object-oriented languages.
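The intended layering can be pictured with a small sketch. Only the names Element and CompilationUnit come from the text; the method children, the class Type, and the two example tools are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LayersSketch {

    /** Top layer: present in every language. */
    interface Element {
        List<Element> children();
    }

    /** Top layer: a compilation unit simply contains elements. */
    static final class CompilationUnit implements Element {
        private final List<Element> children;
        CompilationUnit(List<Element> children) { this.children = children; }
        @Override public List<Element> children() { return children; }
    }

    /** Family layer (hypothetical): a type, as found in object-oriented languages. */
    static final class Type implements Element {
        private final String name;
        private final List<Element> members;
        Type(String name, List<Element> members) {
            this.name = name;
            this.members = members;
        }
        String name() { return name; }
        @Override public List<Element> children() { return members; }
    }

    /** General tool: uses the top layer only, so it works for any language. */
    static int countElements(Element root) {
        int count = 1;
        for (Element child : root.children()) {
            count += countElements(child);
        }
        return count;
    }

    /** Specialized tool: needs the object-oriented layer to find the types. */
    static List<String> typeNames(Element root) {
        List<String> names = new ArrayList<>();
        if (root instanceof Type) {
            names.add(((Type) root).name());
        }
        for (Element child : root.children()) {
            names.addAll(typeNames(child));
        }
        return names;
    }

    public static void main(String[] args) {
        Element inner = new Type("Entry", List.of());
        Element outer = new Type("Account", List.of(inner));
        Element unit = new CompilationUnit(List.of(outer));
        System.out.println(countElements(unit) + " " + typeNames(unit));
    }
}
```

The method countElements compiles against the top layer alone, whereas typeNames depends on the family layer and is therefore limited to languages of that family.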


5.3 Our Vision on Programming Language Development

In this thesis, we have shown that the reusability of object-oriented software can still be significantly increased by new language constructs. But it is not only the programming languages that still have potential to improve. In our experience, the development process for object-oriented programming languages itself also has room for improvement.

5.3.1 Conceptual Development Phase

A first improvement is a stronger focus on methodology. As we have illustrated in the introduction in Chapter 1, the most important advances in the evolution of programming languages have been of a methodological nature. It was these advances that made programs easier to write, understand, and maintain. This does not mean that the methodological principles are written in stone, but they are far too often violated without even motivating why. The language constructs resulting from such an approach almost certainly lead to problems. We think this is the wrong approach, as the field of software engineering is meant to turn software development into a real engineering discipline where a high quality product can be made according to a plan using a reliable development process.

Linearized inheritance is a fine example of this. The technique can be useful if applied in the right circumstances, as shown in [BC90], but using it as the only inheritance mechanism in a language is asking for trouble, as it hides programming errors instead of reporting them.

A second improvement is a more thorough treatment of the problem statement. Describing and solving a technical problem is only the beginning. The question of why there is a problem in the first place is often left unasked, although answering it can lead to a more fundamental understanding of the problem. The resulting insight is by itself already a valuable contribution, but it also has other benefits.

It can expand the applicability domain of a solution by expanding the problem domain. For example, the problem domain for anchored exception declarations evolved from “the strategy pattern” to “any method invocation”. This in turn led to the insight that there is a fundamental conflict between the implementation and the normal specification of a method on the one hand, and the exceptional specification of a method on the other hand.

Similarly, solutions for other problems that have the same underlying cause can help to solve the problem at hand. For example, the similarity between anchored exception declarations and anchored types in Eiffel revealed the issue of loops in the chain early in the development cycle.

Finally, and this is more important than it may seem, it is necessary to verify whether the problem has actually been solved. Striking examples are the famous diamond problem for multiple inheritance [Sny86], and the accompanying popular quote by Alan Snyder [Coo88]: “Multiple inheritance is good, but there is no good way to do it”. Both are frequently and incorrectly used to dismiss multiple inheritance altogether. Ironically, the arguments in [Sny86] are correct, which has led many people to believe that the conclusion is still correct. But the conclusion only holds if the assumptions under which the arguments were made still hold. In the arguments, the inheritance mechanism is used both for subtyping inheritance and pure code reuse, which is understandable, as object-oriented programming was not as well understood as it is today. But as we have already argued in the introduction in Chapter 1, there are fundamental conflicts between these uses of inheritance. Therefore, using a single inheritance relation for subtyping and code reuse inevitably leads to problems, and thus the criticism was justified. But more recent research has shown that a single inheritance relation does not suffice [Cha04, SDNB03, oTC05, CRA+05], and that a separate relation is needed for code reuse. This separation invalidates an important assumption for the diamond problem, and by exploiting the differences between both relations, these problems are solved.

Putting more emphasis on the methodological principles also improves the conceptual development of language constructs because it frees the language developer from the blinding limitations of existing technology. A language construct is just a technical tool to solve a problem. Another construct may be able to solve that problem equally well, but it may also solve problems that the former construct cannot solve. As such, a clear understanding of the problem and the methodological principles behind both the problem and the solution is much more important than a particular solution. Focussing too much on existing language constructs can obscure the real solution. For the language constructs developed in this thesis, the solution trivially followed from a clear understanding of the underlying problem, a set of methodological principles, and ignoring existing language constructs in the initial phase.

Take for example the component relation developed in Chapter 3. It did not start as an inheritance relation, but as an unknown language construct X. Starting from an example similar to the BankAccount class of Figure 3.2 in Chapter 3, it was not difficult to create a list of requirements for maximal reuse with minimal effort¹. Creating features like renaming parameters, first class inheritance relations, indirect inheritance, and component parameters was done by simply meeting the requirements. Of course, integrating the resulting language construct with existing constructs remained necessary and offered important insights, but it was done only after the semantics were roughly defined. Other approaches such as CaesarJ, Scala, Eiffel, and SmartEiffel mainly use existing technology and are, as shown in Chapter 3, far less suitable for the job.

¹ Initially, the list was of course not as complete as the list in Chapter 3, but it covered the most important requirements.


5.3.2 Theoretical Development Phase

Type systems have two important uses for the creation of languages and language constructs. First, they provide a very compact and complete description of the semantics of the language. Second, their formal nature greatly simplifies the verification of the properties of a language. Over the years, standard approaches for dealing with type systems have been defined, which greatly simplify the creation of a type system. An important example is the preservation and progress approach for proving type safety.
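For readers unfamiliar with the approach, the two theorems are usually stated along the following lines. This is the general textbook form, not the precise formulation used for the type systems in this thesis; the symbols e, e', T, T', and Γ denote expressions, types, and a typing environment.

```latex
% Progress: a well-typed closed expression is either a value or can take a step.
\text{(Progress)}\qquad
  \vdash e : T \;\Longrightarrow\;
  e \text{ is a value} \;\lor\; \exists e'.\; e \longrightarrow e'

% Preservation: a reduction step preserves typing, up to subtyping.
\text{(Preservation)}\qquad
  \Gamma \vdash e : T \;\land\; e \longrightarrow e' \;\Longrightarrow\;
  \exists T'.\; T' <: T \;\land\; \Gamma \vdash e' : T'
```

Together they imply that a well-typed program does not get stuck. For languages with casts, such as Featherweight Java, the progress theorem usually admits one more case, namely an expression that is stuck on a failing downcast.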

For the language constructs developed in this thesis, the type systems were a great help. In both cases, they revealed problems that we had not thought of before. But the development of the type systems also revealed remaining problems in the current approach.

The major problem was the reusability of type systems. Some elements, such as expressions, can easily be added to an existing type system – usually Featherweight Java or ClassicJava. They are well modularized both in the type rules and in the type safety proof. Adding an expression requires adding rules for the expression, and proving that it satisfies a number of theorems. Other elements, such as the inheritance mechanism, are more difficult to replace. Modifying Featherweight Java by replacing the inheritance mechanism not only involved adding rules. It also required modification of existing rules, and an exhaustive search in the entire proof for conditions that must hold for the inheritance mechanism.

We believe that a framework of type systems can be used to increase the reusability of type systems for object-oriented programming languages. By explicitly stating the required properties of the different basic constructs of the type system, it is possible to replace them by other constructs without having to redo the entire type safety proof. Like Chameleon, the framework may never be complete, but that does not make it less useful.

An interesting question is whether or not a formal type system can be formed using the specification of the elements in Chameleon. For that, a formal foundation must be built for the approach. It must be proven that assertions about the elements of a metamodel can be used to make assertions about the corresponding programming language.

5.3.3 Technical Development Phase

As we have shown in Chapter 4, the technical development of programming languages must improve as well, both for the creation of language constructs, and the creation of programming tools.

Experiments with new language constructs are typically done by adding theconstruct to an existing language, and translating programs written in the ex-tended language to the host language using a preprocessor or compiler2.

² The Generic Java compiler used bytecode as output format because the corresponding Java code did not compile. Java compilers before 1.5 did not allow covariant return types, but the bytecode language did allow them.

Parser generators that support extensible grammars greatly reduce the amount of work required to create a parser for the new language. For example, adding support for anchored exception declarations to Java was done by extending the Java grammar and adding rules for anchored exception declarations. Further support is provided by annotation-based systems. Custom annotations can easily be defined and used in the source code of programs. These annotations are ignored and preserved by the compiler, such that they can be used by external tools for analysis or code transformations. Support for creating compilers is provided by several tools. For example, Polyglot is a compiler framework for Java, and has been successfully used to create many compiler extensions. Extensions are made by replacing compiler steps for new nodes, or nodes with modified semantics.

But the more the language deviates from existing languages, the more difficult it becomes to translate programs to existing languages. If the difference is big, generating the infrastructure to work around the limitations of the host language is far more difficult than directly encoding the semantics of the construct. For example, translating the inheritance mechanism developed in Chapter 3 to Java proved to be much more difficult than writing the formal semantics.

Further completion of Chameleon can improve this situation by using the approach taken for type systems. By encoding the run-time semantics of a language construct in the model, the language developer automatically creates a part of an interpreter. An executable language can then be created by selecting which language constructs are present in the language, and providing an input module. This can significantly speed up the process by allowing experiments to be done much more quickly. It also means that modifications can be performed more quickly if problems are detected during the experiments. In addition, it becomes easier to take existing language constructs from different languages and verify what can be achieved by combining the best constructs into one language.

For the creation of a ‘consumer’ programming language, a compiler must of course be written, and this is where compiler frameworks such as Polyglot are still required. They can be combined with Chameleon by using Chameleon models instead of abstract syntax trees. Many other tools can be created in a language-independent manner by using the Chameleon framework, such as the advanced code editor and the CASE tool we developed. This can save the language developer an enormous amount of effort, and make the language available to a much broader audience.

Closing Remarks

In this thesis, we have created language constructs to increase the adaptability, reusability, and reliability of object-oriented software. Together with many constructs developed by others, they show that a lot of progress can still be made for object-oriented programming languages. But the development process for individual language constructs and entire programming systems must change to speed up progress. For new languages to be successful, it is essential that they integrate many individual improvements and provide a state-of-the-art programming environment to make a transition worthwhile. We think the work we have done in this thesis, both in terms of language constructs and language development, is a step in that direction.

Bibliography

[ACN02] Jonathan Aldrich, Craig Chambers, and David Notkin. Architecturalreasoning in ArchJava. In ECOOP, pages 334–367, 2002.

[Act07] ActiveState. Komodo ide, 2007.

[AD] Jonathan Aldrich and Kevin Donnelly. Selective open recursion: Mod-ular reasoning about components and inheritance. In Proc. FSE 2004Workshop on Specification and Verification of Component-Based Sys-tems.

[AH03] Matthew Allen and Susan Horwitz. Slicing Java programs that throwand catch exceptions. In PEPM ’03: Proceedings of the 2003 ACMSIGPLAN workshop on Partial evaluation and semantics-based pro-gram manipulation, pages 44–54, New York, NY, USA, 2003. ACMPress.

[ALZ00] Davide Ancona, Giovanni Lagorio, and Elena Zucca. Jam - a smoothextension of Java with mixins. In ECOOP ’00: Proceedings of the14th European Conference on Object-Oriented Programming, pages154–178, London, UK, 2000. Springer-Verlag.

[Ame87] Pierre America. Inheritance and subtyping in a parallel object-oriented language. In European conference on object-oriented pro-gramming on ECOOP ’87, pages 234–242, London, UK, 1987.Springer-Verlag.

[AT88] Mehmet Aksit and Anand Tripathi. Data abstraction mechanismsin SINA/ST. In OOPSLA ’88: Conference proceedings on Object-oriented programming systems, languages and applications, pages 267–275, New York, NY, USA, 1988. ACM Press.

[Bar88] H. P. Barendregt. Introduction to lambda calculus. In Aspenæs Work-shop on Implementation of Functional Languages, Goteborg. Program-ming Methodology Group, University of Goteborg and Chalmers Uni-versity of Technology, 1988.

155

156 BIBLIOGRAPHY

[BBG+60] J. W. Backus, F. L. Bauer, J. Green, C. Katz, J. McCarthy, A. J.Perlis, H. Rutishauser, K. Samelson, B. Vauquois, J. H. Wegstein,A. van Wijngaarden, and M. Woodger. Report on the algorithmiclanguage algol 60. Commun. ACM, 3(5):299–314, 1960.

[BC90] Gilad Bracha and William Cook. Mixin-based inheritance. In OOP-SLA/ECOOP ’90: Proceedings of the European conference on object-oriented programming on Object-oriented programming systems, lan-guages, and applications, pages 303–311, New York, NY, USA, 1990.ACM Press.

[BCH+96] Kim Barrett, Bob Cassels, Paul Haahr, David A. Moon, Keith Play-ford, and P. Tucker Withington. A monotonic superclass linearizationfor dylan. SIGPLAN Not., 31(10):69–82, 1996.

[BD96] Daniel Bardou and Christophe Dony. Split objects: a disciplined useof delegation within objects. In OOPSLA ’96: Proceedings of the11th ACM SIGPLAN conference on Object-oriented programming,systems, languages, and applications, pages 122–137, New York, NY,USA, 1996. ACM Press.

[BHJL86] Andrew Black, Norman Hutchinson, Eric Jul, and Henry Levy. Ob-ject structure in the emerald system. In OOPLSA ’86: Conferenceproceedings on Object-oriented programming systems, languages andapplications, pages 78–86, New York, NY, USA, 1986. ACM Press.

[BL92] Gilad Bracha and Gary Lindstrom. Modularity meets inheritance. InProceedings of the IEEE Computer Society International Conferenceon Computer Languages, pages 282–290, Washington, DC, 1992. IEEEComputer Society.

[BLS04] Mike Barnett, K. Rustan M. Leino, and Wolfram Schulte. The Spec#programming system: An overview. In CASSIS 2004 proceedings,2004.

[BM62] R. A. Brooker and D. Morris. A general translation program forphrase structure languages. J. ACM, 9(1):1–10, 1962.

[BM00] Peter A. Buhr and W. Y. Russell Mok. Advanced exception handlingmechanisms. IEEE Trans. Softw. Eng., 26(9):820–836, 2000.

[BP01] Jonthan Bachrach and Keith Playford. The Java syntactic extender(JSE). In OOPSLA ’01: Proceedings of the 16th ACM SIGPLANconference on Object oriented programming, systems, languages, andapplications, pages 31–42, New York, NY, USA, 2001. ACM Press.

BIBLIOGRAPHY 157

[Bra92] Gilad Bracha. The Programming Language Jigsaw: Mixins, Modular-ity and Multiple Inheritance. PhD thesis, 1992.

[BSD03] Andrew P. Black, Nathanael Scharli, and Stephane Ducasse. Applyingtraits to the Smalltalk collection classes. In OOPSLA ’03: Proceed-ings of the 18th annual ACM SIGPLAN conference on Object-orientedprograming, systems, languages, and applications, pages 47–64, NewYork, NY, USA, 2003. ACM Press.

[BU04] Gilad Bracha and David Ungar. Mirrors: design principles for meta-level facilities of object-oriented programming languages. In OOPSLA’04: Proceedings of the 19th annual ACM SIGPLAN conference onObject-oriented programming, systems, languages, and applications,pages 331–344, New York, NY, USA, 2004. ACM Press.

[BW05] Gavin M. Bierman and Alisdair Wren. First-class relationships in anobject-oriented language. In ECOOP, pages 262–286, 2005.

[Car88] Luca Cardelli. A semantics of multiple inheritance. Inf. Comput.,76(2-3):138–164, 1988.

[Car03] Luca Cardelli, editor. ECOOP 2003 - Object-Oriented Program-ming, 17th European Conference, Darmstadt, Germany, July 21-25,2003, Proceedings, volume 2743 of Lecture Notes in Computer Science.Springer, 2003.

[CBLL82] Gael Curry, Larry Baer, Daniel Lipkie, and Bruce Lee. Traits: An ap-proach to multiple-inheritance subclassing. ACM SIGOA Newsletter,3(1-2):1–9, 1982.

[CG90] Bernard Carre; and Jean-Marc Geib. The point of view notion for mul-tiple inheritance. In OOPSLA/ECOOP ’90: Proceedings of the Euro-pean conference on object-oriented programming on Object-orientedprogramming systems, languages, and applications, pages 312–321,New York, NY, USA, 1990. ACM Press.

[CGHS99] Jong-Deok Choi, David Grove, Michael Hind, and Vivek Sarkar. Ef-ficient and precise modeling of exceptions for the analysis of Javaprograms. In PASTE ’99: Proceedings of the 1999 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engi-neering, pages 21–31, New York, NY, USA, 1999. ACM Press.

[Cha92] Craig Chambers. Object-oriented multi-methods in cecil. In ECOOP’92: Proceedings of the European Conference on Object-Oriented Pro-gramming, pages 33–56, London, UK, 1992. Springer-Verlag.

158 BIBLIOGRAPHY

[Cha93] Craig Chambers. Predicate classes. In ECOOP ’93: Proceedings ofthe 7th European Conference on Object-Oriented Programming, pages268–296, London, UK, 1993. Springer-Verlag.

[Cha98] Craig Chambers. Towards Diesel, a next-generation OO languageafter Cecil. Invited talk, the Fifth Workshop of Foundations of Object-Oriented Languages, San Diego, California, January 1998.

[Cha04] Craig Chambers. The Cecil language specification and rationale: Ver-sion 3.2. 2004.

[Cha06] Craig Chambers. The Diesel language specification and rationale:Version 0.2. 2006.

[CHC90] William R. Cook, Walter Hill, and Peter S. Canning. Inheritance isnot subtyping. In POPL ’90: Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages125–135, New York, NY, USA, 1990. ACM Press.

[CJYC01] Byeong-Mo Chang, Jangwoo Jo, Kwangkeun Yi, and Kwang-MooChoe. Interprocedural exception analysis for Java. In Proceedingsof the 16th ACM Symposium on Applied Computing, March 2001.

[CMM06] Dominique Colnet, Guillem Marpons, and Frederic Merizen. Recon-ciling subtyping and code reuse in object-oriented languages: Usinginherit and insert in SmartEiffel, the GNU Eiffel compiler. In ICSR,2006.

[Coo88] S. Cook. Oopsla’87 panel p2: Varieties of inheritance. In L. Powerand Z. Weiss, editors, Addendum to the Proc. OOPSLA-87: Object-Oriented Programming Systems, Languages and Applications, pages35–40. acm Press, New York, NY, 1988.

[Coo89] W. R. Cook. A proposal for making eiffel type-safe. Comput. J.,32(4):305–311, 1989.

[CRA+05] Dominique Colnet, Philippe Ribet, Cyril Adrian, Fred-eric Merizen, and Guillem Marpons. Smarteiffel 2.2, 2005.http://smarteiffel.loria.fr.

[CRL01] Ramkrishna Chatterjee, Barbara G. Ryder, and William A. Landi.Complexity of points-to analysis of Java in the presence of exceptions.IEEE Trans. Softw. Eng., 27(6):481–512, 2001.

[CUCH91] Craig Chambers, David Ungar, Bay-Wei Chang, and Urs Holzle. Par-ents are shared parts of objects: inheritance and encapsulation inSELF. Lisp Symb. Comput., 4(3):207–222, 1991.

BIBLIOGRAPHY 159

[DG87] Linda G. DeMichiel and Richard P. Gabriel. The common lisp objectsystem: An overview. In ECOOP, pages 151–170, 1987.

[DH87] R. Ducournau and M. Habib. On some algorithms for multiple in-heritance in object-oriented programming. In European conferenceon object-oriented programming on ECOOP ’87, pages 243–252, Lon-don, UK, 1987. Springer-Verlag.

[DHHM92] R. Ducournau, M. Habib, M. Huchard, and M. L. Mugnier. Mono-tonic conflict resolution mechanisms for inheritance. In OOPSLA ’92:conference proceedings on Object-oriented programming systems, lan-guages, and applications, pages 16–24, New York, NY, USA, 1992.ACM Press.

[Dij68] Edsger W. Dijkstra. Letters to the editor: go to statement consideredharmful. Commun. ACM, 11(3):147–148, 1968.

[DMVS89] R. Dixon, T. McKee, M. Vaughan, and P. Schweizer. A fast methoddispatcher for compiled languages with multiple inheritance. In OOP-SLA ’89: Conference proceedings on Object-oriented programming sys-tems, languages and applications, pages 211–214, New York, NY,USA, 1989. ACM Press.

[DN02] Ole-Johan Dahl and Kristen Nygaard. Class and subclass declara-tions. pages 91–107, 2002.

[Don90] Christophe Dony. Exception handling and object-oriented program-ming: towards a synthesis. In Proceedings of the European conferenceon object-oriented programming on Object-oriented programming sys-tems, languages, and applications, pages 322–330. ACM Press, 1990.

[DT01] Dominic Duggan and Ching-Ching Techaubol. Modular mixin-basedinheritance for application frameworks. In OOPSLA ’01: Proceedingsof the 16th ACM SIGPLAN conference on Object oriented program-ming, systems, languages, and applications, pages 223–240, New York,NY, USA, 2001. ACM Press.

[ECM02] ECMA Technical Committee 39 (TC39) Task Group 2 (TG2). C#Language Specification. ECMA, 2 edition, December 2002.

[Ern99] Erik Ernst. Propagating class and method combination. In ECOOP’99: Proceedings of the 13th European Conference on Object-OrientedProgramming, pages 67–91, London, UK, 1999. Springer-Verlag.

[Ern01] Erik Ernst. Family polymorphism. In Jørgen Lindskov Knudsen,editor, Proceedings ECOOP 2001, LNCS 2072, pages 303–326, Hei-delberg, Germany, 2001. Springer-Verlag.

160 BIBLIOGRAPHY

[Ern03] Erik Ernst. Higher-order hierarchies. In Luca Cardelli, editor, Pro-ceedings ECOOP 2003, LNCS 2743, pages 303–329, Heidelberg, Ger-many, July 2003. Springer-Verlag.

[FA97a] Manuel Fahndrich and Alexander Aiken. Program analysis usingmixed term and set constraints. In SAS ’97: Proceedings of the 4thInternational Symposium on Static Analysis, pages 114–126, London,UK, 1997. Springer-Verlag.

[FA97b] Manuel Fahndrich and Alexander Aiken. Refined type inference forML. In Proceedings of the 1st Workshop on Types in Compilation,June 1997.

[FFAC98] Manuel Fahndrich, Jeffrey S. Foster, Alexander Aiken, and JasonCu. Tracking Down Exceptions in Standard ML. Technical ReportUCB//CSD-98-996, University of California, Berkeley, February 1998.

[FKF98] Matthew Flatt, Shriram Krishnamurthi, and Matthias Felleisen.Classes and mixins. In POPL ’98: Proceedings of the 25th ACMSIGPLAN-SIGACT symposium on Principles of programming lan-guages, pages 171–183. ACM Press, 1998.

[G+00] James Gosling et al. The Java Language Specification, Second Edition.Addison-Wesley Longman Publishing Co., Inc., 2000.

[Gee05] Jef Geerinckx. Development of a universal code editor for object-oriented programming languages. Master’s thesis, K.U.Leuven, 2005.

[GFF04] David S. Goldberg, Robert Bruce Findler, and Matthew Flatt. Superand inner: together at last! SIGPLAN Not., 39(10):116–129, 2004.

[GHJV95] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. De-sign patterns: elements of reusable object-oriented software. Addison-Wesley Longman Publishing Co., Inc., 1995.

[GHM76] John V. Guttag, Ellis Horowitz, and David R. Musser. The design ofdata type specifications. In ICSE, pages 414–420, 1976.

[GHM+05] O. Gruber, B. J. Hargrave, J. McAffer, P. Rapicault, and T. Watson.The Eclipse 3.0 platform: adopting OSGi technology. IBM Syst. J.,44(2):289–299, 2005.

[GM05] Joseph (Yossi) Gil and Itay Maman. Micro patterns in java code. InOOPSLA ’05: Proceedings of the 20th annual ACM SIGPLAN con-ference on Object oriented programming systems languages and appli-cations, pages 97–116, New York, NY, USA, 2005. ACM Press.

BIBLIOGRAPHY 161

[Goo75] John B. Goodenough. Exception handling: issues and a proposednotation. Commun. ACM, 18(12):683–696, 1975.

[Gri06] Robert Grimm. Better extensibility through modular syntax. In PLDI’06: Proceedings of the 2006 ACM SIGPLAN conference on Program-ming language design and implementation, pages 38–51, New York,NY, USA, 2006. ACM Press.

[GRRX01] Alessandro F. Garcia, Cecılia M. F. Rubira, Alexander Romanovsky,and Jie Xu. A comparative study of exception handling mechanismsfor building dependable object-oriented software. The Journal of Sys-tems and Software, 59(2):197–222, 2001.

[GS94] J. Guzman and A. Suarez. An Extended Type System for Exceptions.In Record of the fifth ACM SIGPLAN workshop on ML and its Ap-plications, June 1994. Also appears as Research Report 2265, INRIA,BP 105 - 78153 Le Chesnay Cedex, France.

[GSSS02] Kevin Glynn, Peter J. Stuckey, Martin Sulzmann, and HaraldSøndergaard. Exception analysis for non-strict languages. In ICFP’02: Proceedings of the seventh ACM SIGPLAN International confer-ence on Functional programming, pages 98–109, New York, NY, USA,2002. ACM Press.

[Gut76] John V. Guttag. Abstract data types and the development of datastructures. In Conference on Data: Abstraction, Definition and Struc-ture, page 72, 1976.

[Hej] Anders Hejlsberg. The trouble with checked exceptions.http://www.artima.com/intv/handcuffs.html.

[HFA06] Scott Hudson, Frank Flannery, and C. Scott Ananian. CUP LALRparser generator, 2006. http://www2.cs.tum.edu/projects/cup.

[HHG90] Richard Helm, Ian M. Holland, and Dipayan Gangopadhyay. Con-tracts: specifying behavioral compositions in object-oriented systems.In Proceedings of the European conference on object-oriented program-ming on Object-oriented programming systems, languages, and appli-cations, pages 169–180. ACM Press, 1990.

[HHP92] III Harry H. Porter. Separating the subtype hierarchy from the inher-itance of implementation. J. Object Oriented Program., 4(6):20–29,1992.

[HJS92] Thorsten Hartmann, Ralf Jungclaus, and Gunter Saake. Aggregationin a behaviour oriented object model. In ECOOP ’92: Proceedings

162 BIBLIOGRAPHY

of the European Conference on Object-Oriented Programming, pages57–77, London, UK, 1992. Springer-Verlag.

[HLW+92] John Hogg, Doug Lea, Alan Wills, Dennis deChampeaux, and RichardHolt. The Geneva convention on the treatment of object aliasing.SIGPLAN OOPS Mess., 3(2):11–16, 1992.

[HM03] Gorel Hedin and Eva Magnusson. Jastadd: an aspect-oriented com-piler construction system. Sci. Comput. Program., 47(1):37–58, 2003.

[HW06] Joeri Hendrickx and Manuel Van Wesemael. Development of a univer-sal code editor with advanced features. Master’s thesis, K.U.Leuven,2006.

[IPW01] Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. Feather-weight Java: a minimal core calculus for Java and GJ. ACM Trans.Program. Lang. Syst., 23(3):396–450, 2001.

[Jac01] Bart Jacobs. A formalisation of Java’s exception mechanism. In ESOP’01: Proceedings of the 10th European Symposium on ProgrammingLanguages and Systems, pages 284–301. Springer-Verlag, 2001.

[JCYC04] Jangwoo Jo, Byeong-Mo Chang, Kwangkeun Yi, and Kwang-MooChoe. An uncaught exception analysis for Java. Journal of Systemsand Software, 72(1):59–69, 2004.

[Jet07] JetBrains. Intellij idea, 2000-2007.

[Joh86] Ralph E. Johnson. Type-checking smalltalk. In OOPSLA ’86: Confer-ence proceedings on Object-oriented programming systems, languagesand applications, pages 315–321, New York, NY, USA, 1986. ACMPress.

[JRH+99] Simon Peyton Jones, Alastair Reid, Fergus Henderson, Tony Hoare,and Simon Marlow. A semantics for imprecise exceptions. In PLDI’99: Proceedings of the ACM SIGPLAN 1999 conference on Program-ming language design and implementation, pages 25–36, New York,NY, USA, 1999. ACM Press.

[KHH+01] Gregor Kiczales, Erik Hilsdale, Jim Hugunin, Mik Kersten, JeffreyPalm, and William G. Griswold. An overview of aspectj. In ECOOP,pages 327–353, 2001.

[KHM04] James Leslie Keedy, Christian Heinlein, and Gisela Menger. Inheritingmultiple and repeated parts in Timor. Journal of Object Technology,3(10):99–120, 2004.

BIBLIOGRAPHY 163

[KLM+97] Gregor Kiczales, John Lamping, Anurag Menhdhekar, Chris Maeda,Cristina Lopes, Jean-Marc Loingtier, and John Irwin. Aspect-orientedprogramming. In Mehmet Aksit and Satoshi Matsuoka, editors, Pro-ceedings European Conference on Object-Oriented Programming, vol-ume 1241, pages 220–242. Springer-Verlag, Berlin, Heidelberg, andNew York, 1997.

[KM90] Tim Korson and John D. McGregor. Understanding object-oriented:a unifying paradigm. Commun. ACM, 33(9):40–60, 1990.

[KMH02] J. Leslie Keedy, Gisela Menger, and Christian Heinlein. Support forsubtyping and code re-use in timor. In CRPITS ’02: Proceedings ofthe Fortieth International Conference on Tools Pacific, pages 35–43,Darlinghurst, Australia, Australia, 2002. Australian Computer Soci-ety, Inc.

[Kni] Gunter Kniesel. Delegation for java: Api or language extension?

[Kni99a] Gunter Kniesel. Type-safe delegation for run-time componen adap-tation. In Rachid Guerraoui, editor, ECOOP ’99—Object-OrientedProgramming, volume 1628 of Lecture Notes in Computer Science,pages 351–366. Springer, jun 1999.

[Kni99b] Gunter Kniesel. Type-safe delegation for run-time component adap-tation. In ECOOP ’99: Proceedings of the 13th European Conferenceon Object-Oriented Programming, pages 351–366, London, UK, 1999.Springer-Verlag.

[Knu88] J. Lindskov Knudsen. Name collision in multiple classification hier-archies. In on ECOOP ’88 (European Conference on Object-OrientedProgramming), pages 93–109, London, UK, 1988. Springer-Verlag.

[Lae06] Tim Laeremans. Adding the object-oriented programming languageC# to Chameleon. Master’s thesis, K.U.Leuven, 2006.

[Lam93] John Lamping. Typing the specialization interface. In Proceedings ofthe eighth annual conference on Object-oriented programming systems,languages, and applications, pages 201–214. ACM Press, 1993.

[LBR00] Gary T. Leavens, Albert L. Baker, and Clyde Ruby. Preliminarydesign of JML: A behavioral interface specification language for Java.Technical Report 98-06i, 2000.

[Lei98] K. Rustan M. Leino. Data groups: specifying the modification ofextended state. In OOPSLA ’98: Proceedings of the 13th ACM

164 BIBLIOGRAPHY

SIGPLAN conference on Object-oriented programming, systems, lan-guages, and applications, pages 144–153, New York, NY, USA, 1998.ACM Press.

[LHR88] K. Lieberherr, I. Holland, and A. Riel. Object-oriented programming:an objective sense of style. In OOPSLA ’88: Conference proceedingson Object-oriented programming systems, languages and applications,pages 323–334, New York, NY, USA, 1988. ACM Press.

[Lis86] Barbara Liskov. Abstraction and specification in program develop-ment. MIT Press, 1986.

[LL99] Martin Lippert and Cristina Lopes. A study on exception detectionand handling using aspect-oriented programming. Technical report,Xerox PARC, 1999.

[LM96] Marc Van Limberghen and Tom Mens. Encapsulation and composi-tion as orthogonal operators on mixins: a solution to multiple inheri-tance problems. Object Oriented Systems, 3:1–30, 1996.

[LPHZ02] K. Rustan M. Leino, Arnd Poetzsch-Heffter, and Yunhong Zhou. Us-ing data groups to specify and check side effects. In PLDI ’02: Pro-ceedings of the ACM SIGPLAN 2002 Conference on Programminglanguage design and implementation, pages 246–257, New York, NY,USA, 2002. ACM Press.

[LS04] K. Rustan M. Leino and Wolfram Schulte. Exception safety for C#.In J. Cuellar and Z. Liu, editors, Proceedings, Software Engineeringand Formal Methods (SEFM), Beijing, China. IEEE Press, 2004.

[Luy06] Jonathan Luyckx. Programming by contract: static verification ofobject-oriented programs. Master’s thesis, K.U.Leuven, 2006.

[LW93a] B. Liskov and J. Wing. Family values: A behavioral notion of sub-typing. Technical Report MIT/LCS/TR-562b, 1993.

[LW93b] Barbara Liskov and Jeannette M. Wing. A new definition of thesubtype relation. In ECOOP ’93: Proceedings of the 7th EuropeanConference on Object-Oriented Programming, pages 118–141, London,UK, 1993. Springer-Verlag.

[LZ74] Barbara Liskov and Stephen Zilles. Programming with abstract datatypes. In Proceedings of the ACM SIGPLAN symposium on Very highlevel languages, pages 50–59, New York, NY, USA, 1974. ACM Press.

[Mey97] Bertrand Meyer. Object-oriented software construction (2nd ed.).Prentice-Hall, Inc., 1997.

BIBLIOGRAPHY 165

[Mey01] Bertrand Meyer. Overloading vs. object technology. Journal ofObject-Oriented Programming, October 2001.

[MMMP90] Ole Lehrmann Madsen, Boris Magnusson, and Birger Mølier Peder-sen. Strong typing of object-oriented languages revisited. In OOP-SLA/ECOOP ’90: Proceedings of the European conference on object-oriented programming on Object-oriented programming systems, lan-guages, and applications, pages 140–150, New York, NY, USA, 1990.ACM Press.

[MMP89] O. L. Madsen and B. Moller-Pedersen. Virtual classes: a powerfulmechanism in object-oriented programming. In OOPSLA ’89: Confer-ence proceedings on Object-oriented programming systems, languagesand applications, pages 397–406, New York, NY, USA, 1989. ACMPress.

[MN06] S. McPeak and G.C. Necula. Elkhound: A fast, practical glr parsergenerator. In Proc. 13th International Conference on Compiler Con-struction, pages 73–88. Springer, 2006.

[MO] Sean McDirmid and Martin Odersky. The Scala plugin for Eclipse.In Eclipse Technology Exchange 2006.

[MO02] Mira Mezini and Klaus Ostermann. Integrating independent compo-nents with on-demand remodularization. In Proceedings of OOPSLA’02, Sigplan Notices, 37 (11), pages 52–67, 2002.

[MR01] Anna Mikhailova and Alexander Romanovsky. Supporting evolutionof interface exceptions. In Advances in exception handling techniques,pages 94–110. Springer-Verlag New York, Inc., 2001.

[MT97] Robert Miller and Anand Tripathi. Issues with exception handlingin object-oriented systems. In Proceedings of the European Confer-ence on Object-Oriented Programming (ECOOP ’97), volume 1241 ofLNCS, page 85, Jyvaskyla, Finland, June 1997. Springer.

[NCM03] N. Nystrom, M. Clarkson, and A. Myers. Polyglot: An extensiblecompiler framework for java, 2003.

[NCM04] Nathaniel Nystrom, Stephen Chong, and Andrew C. Myers. Scalableextensibility via nested inheritance. In OOPSLA ’04: Proceedings ofthe 19th annual ACM SIGPLAN Conference on Object-oriented pro-gramming, systems, languages, and applications, pages 99–115, NewYork, NY, USA, 2004. ACM Press.

166 BIBLIOGRAPHY

[OH92] Harold Ossher and William Harrison. Combination of inheritance hi-erarchies. In OOPSLA ’92: conference proceedings on Object-orientedprogramming systems, languages, and applications, pages 25–40, NewYork, NY, USA, 1992. ACM Press.

[OMG06] OMG. Meta Object Facility (MOF) Core Specification 2.0, 2006.

[Ost02] Klaus Ostermann. Dynamically composable collaborations with del-egation layers. In ECOOP ’02: Proceedings of the 16th EuropeanConference on Object-Oriented Programming, pages 89–110, London,UK, 2002. Springer-Verlag.

[oTC05] Technical Group 4 of Technical Committee 39. ECMA-367 Standard:Eiffel Analysis, Design and Programming Language. ECMA Interna-tional, 2005.

[OZ05] Martin Odersky and Matthias Zenger. Scalable component abstrac-tions. In OOPSLA ’05: Proceedings of the 20th annual ACM SIG-PLAN conference on Object oriented programming systems languagesand applications, pages 41–57, New York, NY, USA, 2005. ACM Press.

[Par02] David L. Parnas. On the criteria to be used in decomposing sys-tems into modules. In Software pioneers: contributions to softwareengineering, pages 411–427. Springer-Verlag New York, Inc., 2002.

[Pie02] Benjamin C. Pierce. Types and programming languages. MIT Press,Cambridge, MA, USA, 2002.

[PL99] Francois Pessaux and Xavier Leroy. Type-based analysis of uncaughtexceptions. In Symposium on Principles of Programming Languages,pages 276–290, 1999.

[PN06] David J. Pearce and James Noble. Relationship aspects. In AOSD’06: Proceedings of the 5th international conference on Aspect-orientedsoftware development, pages 75–86, New York, NY, USA, 2006. ACMPress.

[PQ95] T. Parr and R. Quong. Antlr: A predicatedll (k) parser generator,1995.

[PS94] Jens Palsberg and Michael I. Schwartzbach. Static typing for object-oriented programming. Sci. Comput. Program., 23(1):19–53, 1994.

[PST91] Ben Potter, Jane Sinclair, and David Till. An introduction to formalspecification and Z. Prentice-Hall, Inc., 1991.

BIBLIOGRAPHY 167

[RFW96] Jonathan G. Rossie Jr., Daniel P. Friedman, and Mitchell Wand.Modeling subobject-based inheritance. Lecture Notes in ComputerScience, 1098:248–??, 1996.

[RM99] Martin P. Robillard and Gail C. Murphy. Analyzing exception flowin Java programs. In Software Engineering – ESEC/FSE’99, volume1687 of Lecture Notes in Computer Science, pages 322–337. Springer-Verlag, September 1999.

[RM00] Martin P. Robillard and Gail C. Murphy. Designing robust Java pro-grams with exceptions. In Proceedings of the 8th ACM SIGSOFT in-ternational symposium on Foundations of software engineering, pages2–10. ACM Press, 2000.

[RM03] Martin P. Robillard and Gail C. Murphy. Static analysis to supportthe evolution of exception structure in object-oriented systems. ACMTrans. Softw. Eng. Methodol., 12(2):191–221, 2003.

[Ros88] E. Rosch. Principles of categorization. In A. Collins and E. E. Smith,editors, Readings in Cognitive Science: A Perspective from Psychologyand Artificial Intelligence, pages 312–322. Kaufmann, San Mateo, CA,1988.

[RS01] Alexander Romanovsky and Bo Sanden. Except for exception han-dling . . . . Ada Lett., XXI(3):19–25, 2001.

[RSK+00] Barbara G. Ryder, Donald Smith, Ulrich Kremer, Michael Gordon,and Nirav Shah. A static study of Java exceptions using JESP. InComputational Complexity, pages 67–81, 2000.

[RT06] John Reppy and Aaron Turon. A foundation for trait-based metapro-gramming, 2006. International Workshops on Foundations of Object-Oriented Languages.

[Rum87] James Rumbaugh. Relations as semantic constructs in an object-oriented language. In OOPSLA ’87: Conference proceedings onObject-oriented programming systems, languages and applications,pages 466–481, New York, NY, USA, 1987. ACM Press.

[RY01] S. Ryu and K. Yi. Exception analysis for multithreaded Java pro-grams. In APAQS ’01: Proceedings of the Second Asia-Pacific Con-ference on Quality Software, page 23, Washington, DC, USA, 2001.IEEE Computer Society.

[Sak88] M. Sakkinen. On the darker side of C++. In on ECOOP ’88 (Eu-ropean Conference on Object-Oriented Programming), pages 162–176,London, UK, 1988. Springer-Verlag.

168 BIBLIOGRAPHY

[Sak89] Markku Sakkinen. Disciplined inheritance. In ECOOP, pages 39–56,1989.

[Sam92] Jean Sammet. Farewell to grace hopper, end of an era! Commun.ACM, 35(4):128–131, 1992.

[SB93] Carl F. Schaefer and Gary N. Bundy. Static analysis of exceptionhandling in Ada. Softw., Pract. Exper., 23(10):1157–1174, 1993.

[SB98] Yannis Smaragdakis and Don S. Batory. Implementing layered designswith mixin layers. In ECOOP ’98: Proceedings of the 12th EuropeanConference on Object-Oriented Programming, pages 550–570, London,UK, 1998. Springer-Verlag.

[SBD04] Nathanael Scharli, Andrew P. Black, and Stephane Ducasse. Object-oriented encapsulation for dynamically typed languages. In OOPSLA’04: Proceedings of the 19th annual ACM SIGPLAN Conference onObject-oriented programming, systems, languages, and applications,pages 130–149, New York, NY, USA, 2004. ACM Press.

[SC00] Joao Costa Seco and Luıs Caires. A basic model of typed compo-nents. In ECOOP ’00: Proceedings of the 14th European Conferenceon Object-Oriented Programming, pages 108–128, London, UK, 2000.Springer-Verlag.

[SCB+86] Craig Schaffert, Topher Cooper, Bruce Bullis, Mike Kilian, and CarrieWilpolt. An introduction to Trellis/Owl. SIGPLAN Not., 21(11):9–16, 1986.

[Sch06] Koen Schepers. Van test driven development naar specification drivendevelopment: praktisch bekeken. Master’s thesis, K.U.Leuven, 2006.

[SDN02] Nathanael Scharli, Stephane Ducasse, and Oscar Nierstrasz. Classes= traits + states + glue (beyond mixins and multiple inheritance).In Proceedings of the International Workshop on Inheritance, 2002.

[SDNB03] Nathanael Scharli, Stephane Ducasse, Oscar Nierstrasz, and AndrewBlack. Traits: Composable units of behavior. In Proceedings ECOOP2003 (European Conference on Object-Oriented Programming), vol-ume 2743 of LNCS, pages 248–274. Springer Verlag, July 2003.

[SG95] Raymie Stata and John V. Guttag. Modular reasoning in the pres-ence of subclassing. In OOPSLA ’95: Proceedings of the tenth annualconference on Object-oriented programming systems, languages, andapplications, pages 200–214, New York, NY, USA, 1995. ACM Press.

BIBLIOGRAPHY 169

[SG99] Peter F. Sweeney and Joseph (Yossi) Gil. Space and time-efficient memory layout for multiple inheritance. In OOPSLA ’99: Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 256–275, New York, NY, USA, 1999. ACM Press.

[SH98] S. Sinha and M. J. Harrold. Analysis of programs with exception-handling constructs. In ICSM ’98: Proceedings of the International Conference on Software Maintenance, page 348, Washington, DC, USA, 1998. IEEE Computer Society.

[SLMD96] Patrick Steyaert, Carine Lucas, Kim Mens, and Theo D’Hondt. Reuse contracts: managing the evolution of reusable assets. In Proceedings of the 11th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 268–285. ACM Press, 1996.

[SND+02] Nathanael Schärli, Oscar Nierstrasz, Stéphane Ducasse, Roel Wuyts, and Andrew Black. Traits: The formal model. Technical Report IAM-02-006, Institut für Informatik, Universität Bern, Switzerland, November 2002. Also available as Technical Report CSE-02-013, OGI School of Science & Engineering, Beaverton, Oregon, USA.

[Sny86] Alan Snyder. Encapsulation and inheritance in object-oriented programming languages. In OOPSLA ’86: Conference proceedings on Object-oriented programming systems, languages and applications, pages 38–45, New York, NY, USA, 1986. ACM Press.

[SOM93] Clemens Szyperski, Stephen Omohundro, and Stephan Murer. Engineering a programming language: The type and class system of Sather. Technical Report TR-93-064, Berkeley, CA, 1993.

[Str91] Bjarne Stroustrup. The C++ programming language (2nd ed.). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1991.

[SV04] Jan Smans and Guy Verachtert. Programming by contract: static verification of java programs via backward substitution. Master’s thesis, K.U.Leuven, 2004.

[SVD96] S. Sankar, S. Viswanadha, and R. Duncan. The java compiler compiler (javacc): The java parser generator, 1996.

[Szy92] Clemens A. Szyperski. Import is not inheritance - why we need both: Modules and classes. In ECOOP ’92: Proceedings of the European Conference on Object-Oriented Programming, pages 19–32, London, UK, 1992. Springer-Verlag.


[Szy96] Clemens Szyperski. Independently extensible systems – software engineering potential and challenges. In Proceedings of the 19th Australian Computer Science Conference, Melbourne, Australia, 1996.

[Tai96] Antero Taivalsaari. On the notion of inheritance. ACM Comput. Surv., 28(3):438–479, 1996.

[THA05] Sam Tobin-Hochstadt and Eric Allen. A core calculus of metaclasses, 2005. International Workshops on Foundations of Object-Oriented Languages.

[van95] Guido van Rossum. Python reference manual. Report CS-R9525, April 1995.

[Van06] Dries Vanoverberghe. Language features for implementing cross-cutting concerns in object-oriented languages. Master’s thesis, K.U.Leuven, 2006.

[Van07] Yves Vandewoude. Dynamically Updating Component-oriented Systems. PhD thesis, K.U.Leuven, 2007.

[VB01] Jan Vitek and Boris Bokowski. Confined types in java. Softw. Pract. Exper., 31(6):507–532, 2001.

[vD06] Marko van Dooren. Jnome, 2006. http://www.jnome.org.

[vDS05a] Marko van Dooren and Eric Steegmans. Combining the robustness of checked exceptions with the flexibility of unchecked exceptions using anchored exception declarations. In OOPSLA ’05: Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications, pages 455–471, New York, NY, USA, 2005. ACM Press.

[vDS05b] Marko van Dooren and Eric Steegmans. Combining the robustness of checked exceptions with the flexibility of unchecked exceptions using anchored exception declarations. Technical Report CW 407, Katholieke Universiteit Leuven, March 2005.

[vDS05c] Marko van Dooren and Eric Steegmans. Language constructs for improving reusability in object-oriented software. In OOPSLA ’05: Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 118–119, New York, NY, USA, 2005. ACM Press.

[vDS06] Marko van Dooren and Eric Steegmans. Abstract data type components. Technical Report CW 439, K.U.Leuven, March 2006.


[Vis01] Joost Visser. Visitor combination and traversal control. In OOPSLA ’01: Proceedings of the 16th ACM SIGPLAN conference on Object oriented programming, systems, languages, and applications, pages 270–282, New York, NY, USA, 2001. ACM Press.

[VRUB03] Yves Vandewoude, Peter Rigole, David Urting, and Yolande Berbers. Draco: An adaptive runtime environment for components. Report CW 372, Department of Computer Science, K.U.Leuven, Leuven, Belgium, 2003.

[WF94a] Christopher A. Welty and David A. Ferrucci. What’s in an instance? Technical report, Rochester Polytechnic Institute Computer Science Dept., 1994.

[WF94b] Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness. Inf. Comput., 115(1):38–94, 1994.

[Win03] Wayne Winston. Operations Research: Applications and Algorithms. Duxbury, 2003.

[WZ88] P. Wegner and S. B. Zdonik. Inheritance as an incremental modification mechanism or what Like is and isn’t Like. In ECOOP ’88 (European Conference on Object-Oriented Programming), pages 55–77, London, UK, 1988. Springer-Verlag.

[YB85] Shaula Yemini and Daniel M. Berry. A modular verifiable exception handling mechanism. ACM Trans. Program. Lang. Syst., 7(2):214–243, 1985.

[YB87] Shaula Yemini and Daniel M. Berry. An axiomatic treatment of exception handling in an expression-oriented language. ACM Trans. Program. Lang. Syst., 9(3):390–407, 1987.

[Yi94] Kwangkeun Yi. Compile-time detection of uncaught exceptions in standard ML programs. In The 1st International Static Analysis Symposium, volume 864 of Lecture Notes in Computer Science, pages 238–254, Namur, September 1994.

[Yi98] Kwangkeun Yi. An abstract interpretation for estimating uncaught exceptions in standard ML programs. Science of Computer Programming, 31(1):147–173, 1998. (invited paper).

[YR97] Kwangkeun Yi and Sukyoung Ryu. Towards a cost-effective estimation of uncaught exceptions in SML programs. In Proceedings of the Annual International Static Analysis Symposium, volume 1302 of Lecture Notes in Computer Science, pages 98–113, Paris, France, September 1997.


[ZO01a] M. Zenger and M. Odersky. Implementing extensible compilers, 2001.

[ZO01b] Matthias Zenger and Martin Odersky. Extensible algebraic datatypes with defaults. In Proceedings of the International Conference on Functional Programming, Firenze, Italy, September 2001.

List of Publications

International Conference Papers

[1] S. De Labey, M. van Dooren, and E. Steegmans. ServiceJ: A Java extension for programming web service interactions. In ICWS 2007: Proceedings of the 5th IEEE International Conference on Web Services, pages ?–?, 2007.

[2] M. van Dooren and E. Steegmans. Combining the robustness of checked exceptions with the flexibility of unchecked exceptions using anchored exception declarations. In OOPSLA ’05: Proceedings of the 20th annual ACM SIGPLAN conference on Object-Oriented Programming Systems Languages and Applications, pages 455–471, New York, NY, USA, 2005. ACM Press.

[3] M. van Dooren and E. Steegmans. A higher abstraction level using first-class inheritance relations. In ECOOP 2007: Proceedings of the 21st European Conference on Object-Oriented Programming, pages ?–?, Berlin, Germany, 2007. Springer.

International Workshop Papers

[1] S. De Labey, M. van Dooren, and E. Steegmans. Bridging the Gap Between Object-Oriented Programming and Service Oriented Computing. In Proceedings of the 1st International Workshop on Foundations of Service Oriented Architecture, 2007.

International Extended Abstracts

[1] M. van Dooren and E. Steegmans. Language constructs for improving reusability in object-oriented software. In OOPSLA ’05: Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, pages 118–119, New York, NY, USA, 2005. ACM Press.


National Journal Papers

[1] J. Boydens, M. van Dooren, and E. Steegmans. XP: eXtreme Programming, het softwareontwikkelproces XP in .Net-projecten. .NET Magazine for developers, (9):80–82, June 2005.

National Conference Papers

[1] E. Steegmans, P. Bekaert, F. Devos, G. Delanote, N. Smeets, M. van Dooren, and J. Boydens. Black & White Testing: Bridging Black Box Testing and White Box Testing. In Software Testing: Beheers Optimaal de Risico’s van IT in uw Business, pages 1–12. ps testware, Leuven, 2004.

Technical Reports

[1] S. De Labey, M. van Dooren, and E. Steegmans. ServiceJ: Service-Oriented Programming in Java. Report CW 451, K.U.Leuven, Department of Computer Science, June 2006.

[2] J. Dockx, M. van Dooren, and E. Steegmans. Different implementations in Java of a nested loop and their proofs. Report CW 324, Department of Computer Science, K.U.Leuven, Leuven, Belgium, Dec. 2001.

[3] J. Dockx, M. van Dooren, and E. Steegmans. Dijkstra’s Dream: Internal Iterators as Software Theorems. Report CW 340, Department of Computer Science, K.U.Leuven, Leuven, Belgium, June 2002.

[4] J. Dockx, M. van Dooren, and E. Steegmans. Jutil.org. Report CW 342, Department of Computer Science, K.U.Leuven, Leuven, Belgium, June 2002.

[5] K. Mertens, N. Smeets, M. van Dooren, J. Dockx, and E. Steegmans. A New Semantics for JML Signals Clauses. Report CW 343, Department of Computer Science, K.U.Leuven, Leuven, Belgium, June 2002.

[6] M. van Dooren and E. Steegmans. Combining the robustness of checked exceptions with the flexibility of unchecked exceptions using anchored exception declarations. Report CW 407, Department of Computer Science, K.U.Leuven, Leuven, Belgium, Mar. 2005.

[7] M. van Dooren and E. Steegmans. Abstract data type components. Report CW 439, Department of Computer Science, K.U.Leuven, Leuven, Belgium, Mar. 2006.

Part V

Appendices


Appendix A

Proof of Compile-time Safety

This section contains the compile-time safety proof of anchored exception declarations. For the proof, we limit expressions to this, references to formal parameters and class variables, and method invocations. Additionally, type names may be used as expressions in method expressions.

A.1 Notation

In addition to the formal notation presented in Section 2.5.1, we need some extra notation for the proof.

An actual argument that is used for substitution is represented by a pair containing the value as the first element, and the corresponding formal parameter as the second element.

\[ actual = (val, par) \]

For the substitution of parameters in other parameters that are to be substituted, we write:

\[
\begin{array}{l}
\Omega((val, par), pre, args) = (\Omega(val, pre, args), par) \\
\Omega((v_1, p_1) \ldots (v_n, p_n), pre, args) = (\Omega(v_1, pre, args), p_1) \ldots (\Omega(v_n, pre, args), p_n)
\end{array}
\]

A.2 Extension to the $\preceq$ relation

For arguments that are to be substituted, we extend the definition of the $\preceq$ relation.

\[ (val_a, param_a) \preceq (val_b, param_b) \Leftrightarrow val_a \preceq val_b \wedge param_a \preceq param_b \]


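\[
\begin{array}{c}
(v_{a,1}, p_{a,1}) \ldots (v_{a,n}, p_{a,n}) \preceq (v_{b,1}, p_{b,1}) \ldots (v_{b,n}, p_{b,n}) \\
\Leftrightarrow \\
(v_{a,1}, p_{a,1}) \preceq (v_{b,1}, p_{b,1}) \wedge \ldots \wedge (v_{a,n}, p_{a,n}) \preceq (v_{b,n}, p_{b,n})
\end{array}
\]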

A.3 Sets of types

We will need the following lemma for sets of types. The proof is analogous to the proof for mathematical sets.

Lemma A.3.1

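\[
\begin{array}{c}
(P_a - B_a) \sqsubseteq (P_b - B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Downarrow \\
((P_a \sqcap P_c) - (B_a \sqcup B_c)) \sqsubseteq ((P_b \sqcap P_d) - (B_b \sqcup B_d))
\end{array}
\]

Proof A.3.1.

\[
\begin{array}{c}
(P_a - B_a) \sqsubseteq (P_b - B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Updownarrow \quad \text{(definitions of $\sqsubseteq$ and $-$)} \\
\forall x : ((x \trianglelefteq P_a \wedge x \ntrianglelefteq B_a) \Rightarrow (x \trianglelefteq P_b \wedge x \ntrianglelefteq B_b)) \wedge ((x \trianglelefteq P_c \wedge x \ntrianglelefteq B_c) \Rightarrow (x \trianglelefteq P_d \wedge x \ntrianglelefteq B_d)) \\
\Downarrow \\
\forall x : (x \trianglelefteq P_a \wedge x \ntrianglelefteq B_a \wedge x \trianglelefteq P_c \wedge x \ntrianglelefteq B_c) \Rightarrow (x \trianglelefteq P_b \wedge x \ntrianglelefteq B_b \wedge x \trianglelefteq P_d \wedge x \ntrianglelefteq B_d) \\
\Updownarrow \quad \text{(definitions of $\sqcap$ and $\sqcup$)} \\
\forall x : (x \trianglelefteq (P_a \sqcap P_c) \wedge x \ntrianglelefteq (B_a \sqcup B_c)) \Rightarrow (x \trianglelefteq (P_b \sqcap P_d) \wedge x \ntrianglelefteq (B_b \sqcup B_d)) \\
\Updownarrow \quad \text{(definitions of $\sqsubseteq$ and $-$)} \\
((P_a \sqcap P_c) - (B_a \sqcup B_c)) \sqsubseteq ((P_b \sqcap P_d) - (B_b \sqcup B_d))
\end{array}
\]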
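Lemma A.3.1 can be sanity-checked by brute force on a small hierarchy. The sketch below is only an illustration and is not part of the proof. It assumes that a set of types can be modelled by the set of all types below it, so that ⊓ becomes intersection, ⊔ becomes union, − becomes set difference, and ⊑ becomes inclusion. The three-class hierarchy and all function names are hypothetical.

```python
from itertools import product

# Hypothetical three-level exception hierarchy (child -> parent).
PARENT = {
    "Exception": None,
    "IOException": "Exception",
    "FileNotFoundException": "IOException",
}

def is_subtype(x, t):
    """Reflexive, transitive subtype test on the toy hierarchy."""
    while x is not None:
        if x == t:
            return True
        x = PARENT[x]
    return False

def down(types):
    """All x that are a subtype of some member of `types` (assumed model of a type set)."""
    return frozenset(x for x in PARENT if any(is_subtype(x, t) for t in types))

# Every distinct downward-closed set that a set of types can denote here.
DOWN_SETS = sorted({down({t}) for t in PARENT} | {frozenset()}, key=len)

def lemma_a31_holds(pa, ba, pb, bb, pc, bc, pd, bd):
    """Premise implies conclusion for one choice of the eight sets."""
    premise = (pa - ba) <= (pb - bb) and (pc - bc) <= (pd - bd)
    conclusion = ((pa & pc) - (ba | bc)) <= ((pb & pd) - (bb | bd))
    return (not premise) or conclusion

counterexamples = [c for c in product(DOWN_SETS, repeat=8)
                   if not lemma_a31_holds(*c)]
print(len(DOWN_SETS) ** 8, "combinations checked,",
      len(counterexamples), "counterexamples")
```

An exhaustive check over such a small universe does not replace the proof, but it is a cheap way to catch a mistyped subscript in the statement.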

A.4 Properties of Φ and Ω

In this section we prove some properties about the $\Phi$ and $\Omega$ functions. Specifically, we will prove that under certain conditions $f \circ g$ is equivalent to $g \circ f$, possibly after modifying the arguments.

The first lemma states that $\Phi$ and $\Omega$ may always be swapped when the arguments of $\Omega$ are valid. The function $this(x)$ returns the implicit parameter this that is in the scope of the program element $x$.

Lemma A.4.1

\[
\begin{array}{c}
ok_\Omega(args, (pre, this(EC))) \\
\Downarrow \\
\Phi(\Omega(EC, pre, args), P, B) = \Omega(\Phi(EC, P, B), pre, args)
\end{array}
\]


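Proof A.4.1. Since an exception clause is a list of exception declarations, and the $\Phi$ and $\Omega$ functions respectively apply $\Phi$ and $\Omega$ to the exception declarations, it suffices to prove that:

\[ \Phi(\Omega(ED, pre, args), P, B) = \Omega(\Phi(ED, P, B), pre, args) \]

1. $(P_x, B_x)$

\[
\begin{array}{c}
\Phi(\Omega((P_x, B_x), pre, args), P, B) = \Omega(\Phi((P_x, B_x), P, B), pre, args) \\
\Updownarrow \quad \text{(definition of $\Omega$ and $\Phi$)} \\
\Phi((P_x, B_x), P, B) = \Omega((P_x \sqcap P, B_x \sqcup B), pre, args) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Phi((P_x, B_x), P, B) = (P_x \sqcap P, B_x \sqcup B) \\
\Updownarrow \quad \text{(definition of $\Phi$)} \\
true
\end{array}
\]

2. $\text{like } t_x.m_x(args_x) \trianglelefteq P_x \ntrianglelefteq B_x$

\[
\begin{array}{c}
\Phi(\Omega(\text{like } t_x.m_x(args_x) \trianglelefteq P_x \ntrianglelefteq B_x, pre, args), P, B) = \Omega(\Phi(\text{like } t_x.m_x(args_x) \trianglelefteq P_x \ntrianglelefteq B_x, P, B), pre, args) \\
\Updownarrow \quad \text{(definition of $\Omega$ and $\Phi$)} \\
\Phi(\text{like } \Omega(t_x.m_x(args_x), pre, args) \trianglelefteq P_x \ntrianglelefteq B_x, P, B) = \Omega(\text{like } t_x.m_x(args_x) \trianglelefteq (P_x \sqcap P) \ntrianglelefteq (B_x \sqcup B), pre, args) \\
\Updownarrow \quad \text{(definition of $\Omega$ and $\Phi$)} \\
\text{like } \Omega(t_x.m_x(args_x), pre, args) \trianglelefteq (P_x \sqcap P) \ntrianglelefteq (B_x \sqcup B) = \text{like } \Omega(t_x.m_x(args_x), pre, args) \trianglelefteq (P_x \sqcap P) \ntrianglelefteq (B_x \sqcup B)
\end{array}
\]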

The second lemma states that performing two consecutive substitutions on an expression is equivalent to performing the last substitution on the actual arguments of the first substitution, and then applying the first substitution.

Lemma A.4.2

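\[
\begin{array}{c}
ok_\Omega(args_a, (pre_a, this(expr))) \wedge ok_\Omega(args_b, (pre_b, this(pre_a))) \wedge \\
\forall formal \in expr : formal \in args_a \\
\Downarrow \\
\Omega(\Omega(expr, pre_a, args_a), pre_b, args_b) = \Omega(expr, \Omega(pre_a, pre_b, args_b), \Omega(args_a, pre_b, args_b))
\end{array}
\]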

Proof A.4.2. To shorten the formulas, we write $pre'$ for $\Omega(pre_a, pre_b, args_b)$ and $args'$ for $\Omega(args_a, pre_b, args_b)$.

1. this

\[
\begin{array}{c}
\Omega(\Omega(this, pre_a, args_a), pre_b, args_b) = \Omega(this, pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(pre_a, pre_b, args_b) = \Omega(pre_a, pre_b, args_b)
\end{array}
\]

2. typeName

\[
\begin{array}{c}
\Omega(\Omega(typeName, pre_a, args_a), pre_b, args_b) = \Omega(typeName, pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(typeName, pre_b, args_b) = typeName \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
typeName = typeName
\end{array}
\]

3. formal: because of the precondition, $formal = par_i$ for exactly one $(val_i, par_i)$ in $args_a$.

\[
\begin{array}{c}
\Omega(\Omega(formal, pre_a, args_a), pre_b, args_b) = \Omega(formal, pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(val_i, pre_b, args_b) = \Omega(val_i, pre_b, args_b)
\end{array}
\]

4. $\text{new } C(a_1, \ldots, a_n)$

\[
\begin{array}{c}
\Omega(\Omega(\text{new } C(a_1, \ldots, a_n), pre_a, args_a), pre_b, args_b) = \Omega(\text{new } C(a_1, \ldots, a_n), pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(\text{new } C(\Omega(a_1, pre_a, args_a), \ldots, \Omega(a_n, pre_a, args_a)), pre_b, args_b) = \text{new } C(\Omega(a_1, pre', args'), \ldots, \Omega(a_n, pre', args')) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\text{new } C(\Omega(\Omega(a_1, pre_a, args_a), pre_b, args_b), \ldots, \Omega(\Omega(a_n, pre_a, args_a), pre_b, args_b)) = \text{new } C(\Omega(a_1, pre', args'), \ldots, \Omega(a_n, pre', args')) \\
\Updownarrow \quad \text{(induction on finite expression tree)} \\
true
\end{array}
\]

5. $t.var$

\[
\begin{array}{c}
\Omega(\Omega((t.var, env_{var}), pre_a, args_a), pre_b, args_b) = \Omega((t.var, env_{var}), pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega((\Omega(t, pre_a, args_a).var, env_{var}), pre_b, args_b) = (\Omega(t, pre', args').var, env_{var}) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
(\Omega(\Omega(t, pre_a, args_a), pre_b, args_b).var, env_{var}) = (\Omega(t, pre', args').var, env_{var}) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(\Omega(t, pre_a, args_a), pre_b, args_b) = \Omega(t, pre', args') \\
\Updownarrow \quad \text{(induction on finite expression tree)} \\
true
\end{array}
\]

6. $t.m(a_1, \ldots, a_n)$

\[
\begin{array}{c}
\Omega(\Omega(t.m(a_1, \ldots, a_n), pre_a, args_a), pre_b, args_b) = \Omega(t.m(a_1, \ldots, a_n), pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(\Omega(t, pre_a, args_a).m(\Omega(a_1, pre_a, args_a), \ldots, \Omega(a_n, pre_a, args_a)), pre_b, args_b) = \Omega(t, pre', args').m(\Omega(a_1, pre', args'), \ldots, \Omega(a_n, pre', args')) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(\Omega(t, pre_a, args_a), pre_b, args_b).m(\Omega(\Omega(a_1, pre_a, args_a), pre_b, args_b), \ldots, \Omega(\Omega(a_n, pre_a, args_a), pre_b, args_b)) = \Omega(t, pre', args').m(\Omega(a_1, pre', args'), \ldots, \Omega(a_n, pre', args')) \\
\Updownarrow \quad \text{(induction on finite expression tree)} \\
true
\end{array}
\]
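The six cases above follow the shape of a recursive substitution function, so the lemma can be replayed on a concrete expression tree. The sketch below is an illustration only. It assumes expressions are nested tuples tagged "this", "type", "formal", "new", "var", or "call", and that `omega` is a single-pass substitution. The encoding and the example expression are hypothetical.

```python
THIS = ("this",)

def omega(expr, pre, args):
    """Single-pass substitution of `this` by pre and of formals by args[name]."""
    kind = expr[0]
    if kind == "this":
        return pre
    if kind == "type":
        return expr
    if kind == "formal":
        return args.get(expr[1], expr)
    if kind == "new":
        return ("new", expr[1], tuple(omega(a, pre, args) for a in expr[2]))
    if kind == "var":
        return ("var", omega(expr[1], pre, args), expr[2])
    if kind == "call":
        return ("call", omega(expr[1], pre, args), expr[2],
                tuple(omega(a, pre, args) for a in expr[3]))
    raise ValueError(f"unknown expression kind: {kind}")

def omega_args(args, pre, new_args):
    """Apply omega to the value of every (value, parameter) pair."""
    return {par: omega(val, pre, new_args) for par, val in args.items()}

def two_steps(expr, pre_a, args_a, pre_b, args_b):
    return omega(omega(expr, pre_a, args_a), pre_b, args_b)

def one_step(expr, pre_a, args_a, pre_b, args_b):
    return omega(expr, omega(pre_a, pre_b, args_b),
                 omega_args(args_a, pre_b, args_b))

# this.out.write(x, new Buffer(y)), seen from a caller whose own formal is z
expr = ("call", ("var", THIS, "out"), "write",
        (("formal", "x"), ("new", "Buffer", (("formal", "y"),))))
pre_a = ("var", ("formal", "z"), "stream")
args_a = {"x": ("call", THIS, "size", ()), "y": ("formal", "z")}
pre_b = ("new", "Client", ())
args_b = {"z": ("type", "Util")}

print(two_steps(expr, pre_a, args_a, pre_b, args_b)
      == one_step(expr, pre_a, args_a, pre_b, args_b))
```

The precondition that every formal of the expression occurs in the first argument list matters: for the bare expression `("formal", "z")`, which `args_a` does not cover, the two sides differ in this model.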

The same property holds for applying two consecutive substitutions on an exception clause.

Lemma A.4.3

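\[
\begin{array}{c}
ok_\Omega(args_a, (pre_a, this(EC))) \wedge ok_\Omega(args_b, (pre_b, this(pre_a))) \wedge \\
\forall formal \in EC : formal \in args_a \\
\Downarrow \\
\Omega(\Omega(EC, pre_a, args_a), pre_b, args_b) = \Omega(EC, \Omega(pre_a, pre_b, args_b), \Omega(args_a, pre_b, args_b))
\end{array}
\]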


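Proof A.4.3. Since an exception clause is a list of exception declarations, and the $\Omega$ function applies $\Omega$ to the exception declarations, it suffices to prove that:

\[ \Omega(\Omega(ED, pre_a, args_a), pre_b, args_b) = \Omega(ED, \Omega(pre_a, pre_b, args_b), \Omega(args_a, pre_b, args_b)) \]

To shorten the formulas, we write $pre'$ for $\Omega(pre_a, pre_b, args_b)$ and $args'$ for $\Omega(args_a, pre_b, args_b)$.

1. $(P, B)$

\[
\begin{array}{c}
\Omega(\Omega((P, B), pre_a, args_a), pre_b, args_b) = \Omega((P, B), pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega((P, B), pre_b, args_b) = (P, B) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
(P, B) = (P, B)
\end{array}
\]

2. $\text{like } t.m(args) \trianglelefteq P \ntrianglelefteq B$

\[
\begin{array}{c}
\Omega(\Omega(\text{like } t.m(args) \trianglelefteq P \ntrianglelefteq B, pre_a, args_a), pre_b, args_b) = \Omega(\text{like } t.m(args) \trianglelefteq P \ntrianglelefteq B, pre', args') \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Omega(\text{like } \Omega(t.m(args), pre_a, args_a) \trianglelefteq P \ntrianglelefteq B, pre_b, args_b) = \text{like } \Omega(t.m(args), pre', args') \trianglelefteq P \ntrianglelefteq B \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\text{like } \Omega(\Omega(t.m(args), pre_a, args_a), pre_b, args_b) \trianglelefteq P \ntrianglelefteq B = \text{like } \Omega(t.m(args), pre', args') \trianglelefteq P \ntrianglelefteq B \\
\Updownarrow \quad \text{(Lemma A.4.2)} \\
true
\end{array}
\]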

A.5 Properties of the ω Function

Filtering an exception declaration to only allow checked exceptions of type $E$ to pass has no effect on whether $E$ is allowed.

A.5.1 Exception Declarations

Lemma A.5.1

ω(ED, E, trace) ⇔ ω(Φ(ED, E, ∅), E, trace)

Proof A.5.1.


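1. $(P, B)$

\[
\begin{array}{c}
\omega((P, B), E, trace) \Leftrightarrow \omega(\Phi((P, B), E, \emptyset), E, trace) \\
\Updownarrow \quad \text{(definition of $\omega$)} \\
E \trianglelefteq (P - B) \Leftrightarrow E \trianglelefteq ((P \sqcap E) - B) \\
\Updownarrow \\
true
\end{array}
\]

2. anchor: proven by induction on the first derivation of Lemma A.5.2. This induction will end in absolute exception declarations, for which the proof is given in the first part of this lemma. As for the definition of the $\omega$ function, the trace will prevent the induction from getting stuck in an infinite loop.

\[
\begin{array}{c}
\omega(\Phi(anchor, E, \emptyset), E, trace) \\
\Updownarrow \quad \text{(definition of $\omega$)} \\
\Gamma(t).m(\Gamma((args))) \notin trace \Rightarrow \omega(\Upsilon(\Phi(anchor, E, \emptyset)), E, \Gamma(t).m(\Gamma((args))) \cup trace) \\
\Updownarrow \quad \text{($\Phi$ and $\Upsilon$ can be switched)} \\
\Gamma(t).m(\Gamma((args))) \notin trace \Rightarrow \omega(\Phi(\Upsilon(anchor), E, \emptyset), E, \Gamma(t).m(\Gamma((args))) \cup trace) \\
\Updownarrow \quad \text{(induction on Lemma A.5.2)} \\
\Gamma(t).m(\Gamma((args))) \notin trace \Rightarrow \omega(\Upsilon(anchor), E, \Gamma(t).m(\Gamma((args))) \cup trace) \\
\Updownarrow \quad \text{(definition of $\omega$)} \\
\omega(anchor, E, trace)
\end{array}
\]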
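The absolute case (case 1) of this proof reduces to a statement about type sets, which can be checked exhaustively on a small hierarchy. The sketch below covers only that case, not the anchored case. It assumes that a type set is modelled by the set of all types below it; the four-class hierarchy and the function names are hypothetical.

```python
from itertools import chain, combinations

# Hypothetical hierarchy (child -> parent) with one branch point.
PARENT = {
    "Throwable": None,
    "IOException": "Throwable",
    "SQLException": "Throwable",
    "EOFException": "IOException",
}

def is_subtype(x, t):
    """Reflexive, transitive subtype test on the toy hierarchy."""
    while x is not None:
        if x == t:
            return True
        x = PARENT[x]
    return False

def down(types):
    """All types that are a subtype of some member of `types`."""
    return frozenset(x for x in PARENT if any(is_subtype(x, t) for t in types))

def allowed(p, b, e):
    """E is below P and not below B."""
    return e in down(p) - down(b)

def allowed_after_filter(p, b, e):
    """The same test after filtering with (E, empty set): P is narrowed to types below E."""
    return e in (down(p) & down({e})) - down(b)

def all_subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

mismatches = [(p, b, e)
              for p in all_subsets(PARENT)
              for b in all_subsets(PARENT)
              for e in PARENT
              if allowed(p, b, e) != allowed_after_filter(p, b, e)]
print(len(mismatches), "mismatches")
```

The equivalence holds because every type is below itself, so narrowing the propagated set to the types below E never removes E.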

A.5.2 Exception Clauses

The same property holds for exception clauses:

Lemma A.5.2

ω(EC, E, trace) ⇔ ω(Φ(EC, E, ∅), E, trace)

Proof A.5.2.

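\[
\begin{array}{c}
\omega(\Phi(EC, E, \emptyset), E, trace) \\
\Updownarrow \\
\omega(\Phi(ED_1, E, \emptyset), E, trace) \vee \ldots \vee \omega(\Phi(ED_n, E, \emptyset), E, trace) \\
\Updownarrow \quad \text{(induction on first part of Lemma A.5.1)} \\
\omega(ED_1, E, trace) \vee \ldots \vee \omega(ED_n, E, trace) \\
\Updownarrow \\
\omega(EC, E, trace)
\end{array}
\]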


A.6 Properties of the $\preceq$ relation

Lemma A.6.1 If $expr_a$ conforms to $expr_b$, the type of $expr_a$ conforms to the type of $expr_b$.

\[ expr_a \preceq expr_b \Rightarrow \Gamma(expr_a) <: \Gamma(expr_b) \]

Proof A.6.1. For this, constructor invocations, and type names, the lemma follows directly from the definition. For formal parameters, it follows from the definition because we only allow invariant formal parameters. We now prove the fifth and the sixth cases.

(5) $t.var$

\[
\begin{array}{c}
target_a.var_a \preceq target_b.var_b \Leftrightarrow target_a \preceq target_b \wedge var_a = var_b \\
\Downarrow \\
\Gamma(target_a.var_a) = \Gamma(target_b.var_b)
\end{array}
\]

(6) $t.m(args)$

\[
\begin{array}{c}
\Gamma(t_a.m_a(a_1, \ldots, a_n)) <: \Gamma(t_b.m_b(b_1, \ldots, b_n)) \\
\Updownarrow \\
returnType(method(t_a.m_a(a_1, \ldots, a_n))) <: returnType(method(t_b.m_b(b_1, \ldots, b_n))) \\
\Updownarrow \quad \text{(covariant return types)} \\
method(t_a.m_a(a_1, \ldots, a_n)) <: method(t_b.m_b(b_1, \ldots, b_n)) \\
\Updownarrow \quad \text{(dynamic binding and invariant argument types)} \\
\Gamma(t_a) <: \Gamma(t_b) \wedge \Gamma(a_1) <: \Gamma(b_1) \wedge \ldots \wedge \Gamma(a_n) <: \Gamma(b_n) \\
\Uparrow \\
t_a \preceq t_b \wedge a_1 \preceq b_1 \wedge \ldots \wedge a_n \preceq b_n
\end{array}
\]

This last case is the induction step of the proof for method invocations. Because a Typeable is a finite tree and a method invocation always has a target, as required by the assumptions, the other cases serve as base cases, which have been proven.

Lemma A.6.2 If anchored exception declaration $anchor_a$ conforms to $anchor_b$, then the method referenced by $anchor_a$ will always conform to the method referenced by $anchor_b$.

\[ anchor_a \preceq anchor_b \Rightarrow method(anchor_a) <: method(anchor_b) \]

Proof A.6.2. Because of Lemma A.6.1, the types of the target and the actual arguments of $anchor_a$ will always conform to the corresponding types of $anchor_b$. Consequently, because we do not allow syntactic overloading and require parameter types to be invariant, $anchor_a$ will always reference a method that conforms to the method referenced by $anchor_b$.

[Figure: dependency graph with nodes A.8.1, A.8.3, A.8.6, A.8.7, A.8.8, A.9.1 to A.9.5, A.10.1, and A.10.3 to A.10.6]

Figure A.1: Dependency graph for Theorems A.8.8, A.9.5, and A.10.6.

A.7 Overview of Dependencies

This section gives an overview of the dependencies in the proofs of Theorems A.8.8, A.9.5, and A.10.6, and explains why the inductions that are used in their proofs will always end. This is also explained in the proofs themselves. This section merely serves to clarify the reasoning.

Each arrow represents a dependency. The solid arrows represent dependencies that apply the target lemma or theorem directly to the current exception clause or a part of it. The dotted arrows represent dependencies for which the target lemma or theorem is applied after following an anchored exception declaration. In every lemma or theorem, the anchor that is followed is the one with index a, and it will always be handed to the next theorem or lemma with index a. Because no loop can be made in the dependency graph without using a dotted arrow, the induction process follows a path in the expansion graph of an exception clause. Because we keep a trace and put guard conditions whenever an expansion is done, the induction will always end.
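The termination argument rests on one mechanism: an anchor is expanded only if its target is not yet in the trace, and the trace grows with every expansion. The sketch below shows that guard in isolation. It does not implement the ω function or the conformance analysis of this thesis; the clause encoding, the method names, and the function `expand` are hypothetical.

```python
# Hypothetical exception clauses: each method maps to a list of declarations.
# ("abs", T) is an absolute declaration, ("like", m) an anchor to method m.
CLAUSES = {
    "a": [("abs", "IOException"), ("like", "b")],
    "b": [("like", "a"), ("abs", "SQLException")],   # a and b anchor to each other
    "c": [("like", "c")],                            # directly recursive anchor
}

def expand(method, clauses, trace=frozenset()):
    """Collect the absolute declarations reachable from `method`.

    The guard stops the expansion when `method` is already in the trace,
    so every recursive call works with a strictly larger trace and the
    recursion depth is bounded by the number of methods.
    """
    if method in trace:
        return set()
    trace = trace | {method}
    result = set()
    for kind, target in clauses[method]:
        if kind == "abs":
            result.add(target)
        else:
            result |= expand(target, clauses, trace)
    return result

print(sorted(expand("a", CLAUSES)))
print(sorted(expand("c", CLAUSES)))
```

Passing the trace as an immutable set keeps it local to one path of the expansion graph, which matches the description of the induction following a path rather than sharing state between branches.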


A.8 The $\preceq$ relation is transitive

A.8.1 Absolute Exception Declarations

Lemma A.8.1

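\[
\begin{array}{c}
\Phi((P_a, B_a), E, \emptyset) \preceq (P_b, B_b) \wedge \Phi((P_b, B_b), E, \emptyset) \preceq (P_c, B_c) \\
\Downarrow \\
\Phi((P_a, B_a), E, \emptyset) \preceq (P_c, B_c)
\end{array}
\]

Proof A.8.1.

\[
\begin{array}{c}
\Phi((P_a, B_a), E, \emptyset) \preceq (P_b, B_b) \\
\Downarrow \\
((P_a \sqcap E) - B_a) \sqsubseteq (P_b - B_b) \\
\Downarrow \\
((P_a \sqcap E) - B_a) \sqsubseteq ((P_b \sqcap E) - B_b) \\
\Downarrow \quad \text{(using $((P_b \sqcap E) - B_b) \sqsubseteq (P_c - B_c)$)} \\
((P_a \sqcap E) - B_a) \sqsubseteq (P_c - B_c)
\end{array}
\]

The transitivity of the $\sqsubseteq$ relation follows straightforwardly from its definition.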

A.8.2 Method expressions

Lemma A.8.2

\[
\begin{array}{c}
expr_a \preceq expr_b \wedge expr_b \preceq expr_c \\
\Downarrow \\
expr_a \preceq expr_c
\end{array}
\]

Proof A.8.2. From the definition of $\preceq$ for expressions, it follows that the form of $expr_c$ dictates the form of $expr_a$ and $expr_b$. Only a type name allows a and b to be of a different form.

1. $this_a$, $this_b$, $this_c$: follows directly from the transitivity of the subtyping ($<:$) relation.

2. $expr_a$, $expr_b$, $type_c$: follows from Lemma A.6.1 and the transitivity of the subtyping ($<:$) relation.

3. $formal_a$, $formal_b$, $formal_c$: follows directly from the definition.

4. $\text{new } C(args)$: follows from the definition and induction on this lemma.

5. $t_a.var_a$, $t_b.var_b$, $t_c.var_c$: follows from the definition and induction on this lemma.

6. $t_a.m(args_a)$, $t_b.m(args_b)$, $t_c.m(args_c)$: follows from the definition and induction on this lemma.


A.8.3 Anchored Exception Declarations

Direct conformance

Lemma A.8.3

\[
\begin{array}{c}
\Phi(anchor_a, E, \emptyset) \preceq anchor_b \wedge \Phi(anchor_b, E, \emptyset) \preceq anchor_c \\
\Downarrow \\
\Phi(anchor_a, E, \emptyset) \preceq anchor_c
\end{array}
\]

Proof A.8.3. This lemma follows directly from Lemma A.8.2, which proves transitivity for the condition on the method expressions, and the transitivity of the $\sqsubseteq$ relation, which proves transitivity for the filter clauses.

Both direct conformance and conformance after expansion

Lemma A.8.4

\[
\begin{array}{c}
trace_1 \vdash EC_a \preceq EC_b \wedge trace_1 \subseteq trace_2 \\
\Downarrow \\
trace_2 \vdash EC_a \preceq EC_b
\end{array}
\]

Proof A.8.4. If $trace_2$ is a superset of $trace_1$, its analysis will return true while the analysis for $trace_1$ might give false. The reverse can never happen.

Lemma A.8.5

\[
\begin{array}{c}
trace \vdash \Phi(\Upsilon(anchor), E, \emptyset) \preceq EC \\
\Updownarrow \\
\kappa(anchor, EC) \cup trace \vdash \Phi(\Upsilon(anchor), E, \emptyset) \preceq EC
\end{array}
\]

Proof A.8.5. Because the method referenced by the anchored exception declaration is analyzed anyway in both cases, it does not matter that in the relation without a trace, the analysis is done a second time (after which it is in the trace and will not be analyzed again).

Lemma A.8.6

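\[
\begin{array}{c}
\Phi(anchor_a, E, \emptyset) \preceq anchor_b \wedge \kappa(anchor_b, EC_c) \vdash \Phi(\Upsilon(anchor_b), E, \emptyset) \preceq EC_c \\
\Downarrow \\
\kappa(anchor_a, EC_c) \notin trace \Rightarrow \kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_c
\end{array}
\]

Proof A.8.6. Let $anchor_a = \text{like } t_a.m(args_a) \trianglelefteq P_a \ntrianglelefteq B_a$ and $anchor_b = \text{like } t_b.m(args_b) \trianglelefteq P_b \ntrianglelefteq B_b$.

If $\kappa(anchor_a, EC_c) \in trace$, the lemma is trivially true. We now prove the lemma for the case $\kappa(anchor_a, EC_c) \notin trace$.

\[
\begin{array}{c}
\Phi(anchor_a, E, \emptyset) \preceq anchor_b \\
\Downarrow \quad \text{(Lemma A.9.2)} \\
\Phi(anchor_a, E, \emptyset) \preceq \Phi(anchor_b, E, \emptyset) \\
\Downarrow \quad \text{(Lemma A.6.2 and rule 2)} \\
\varepsilon(\Phi(anchor_a, E, \emptyset)) \preceq \varepsilon(\Phi(anchor_b, E, \emptyset)) \\
\Downarrow \quad \text{(Lemma A.8.4)} \\
\kappa(anchor_a, EC_c) \cup trace \vdash \varepsilon(\Phi(anchor_a, E, \emptyset)) \preceq \varepsilon(\Phi(anchor_b, E, \emptyset)) \\
\Downarrow \quad \text{(induction on Lemma A.9.4)} \\
\kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\varepsilon(\Phi(anchor_a, E, \emptyset)), P_a, B_a) \preceq \Phi(\varepsilon(\Phi(anchor_b, E, \emptyset)), P_b, B_b) \\
\Downarrow \quad \text{(induction on Lemma A.10.5)} \\
\kappa(anchor_a, EC_c) \cup trace \vdash \Omega(\Phi(\varepsilon(\Phi(anchor_a, E, \emptyset)), P_a, B_a), t_a, args_a) \preceq \Omega(\Phi(\varepsilon(\Phi(anchor_b, E, \emptyset)), P_b, B_b), t_b, args_b) \\
\Downarrow \quad \text{(definition of $\Upsilon$ and Lemma A.4.1)} \\
\kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq \Phi(\Upsilon(anchor_b), E, \emptyset)
\end{array}
\]

Applying Lemma A.8.5 to the second part of the precondition of this lemma, we get $\Phi(\Upsilon(anchor_b), E, \emptyset) \preceq EC_c$. This completes the precondition for induction on Lemma A.8.7. The induction to A.8.7 ends either in Lemma A.8.1 or A.8.3. It cannot get stuck in an infinite loop because we add $\kappa(anchor_a, EC_c)$ to the trace, and we stop if $anchor_a$ is in the trace.

\[
\begin{array}{c}
\kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_c \\
\Downarrow \quad \text{(assumption)} \\
\kappa(anchor_a, EC_c) \notin trace \Rightarrow \kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_c
\end{array}
\]

The inductions on Lemmas A.9.4, A.10.5, and A.8.7 will end because they all go back to Lemma A.8.7 after performing an expansion and keeping a trace. Therefore, every branch eventually ends up in Lemmas A.8.1 or A.8.3, or at the stopping condition of this lemma.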


Lemma A.8.7

\[ trace \vdash EC_a \preceq EC_b \wedge EC_b \preceq EC_c \Rightarrow trace \vdash EC_a \preceq EC_c \]


Proof A.8.7. We must prove that:

\[
\begin{array}{c}
\forall (P_a, B_a) \in EC_a, \forall E, \omega((P_a, B_a), E) : \exists (P_c, B_c) \in EC_c : \Phi((P_a, B_a), E, \emptyset) \preceq (P_c, B_c) \\
\wedge \\
\forall anchor_a \in EC_a, \forall E : \omega(anchor_a, E) : (\exists anchor_c \in EC_c : \\
(\Phi(anchor_a, E, \emptyset) \preceq anchor_c \vee \kappa(anchor_a, EC_c) \notin trace \Rightarrow \\
\kappa(anchor_a, EC_c) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_c))
\end{array}
\]

The case for absolute exception declarations is proven by Lemma A.8.1. For anchored exception declarations of $EC_a$ that directly conform to an anchored exception declaration of $EC_b$, the proof is given by Lemmas A.8.3 and A.8.6. For the case where $\Upsilon(AED_a) \preceq EC_b$, we apply induction on this lemma.

Theorem A.8.8 The $\preceq$ relation for exception clauses is transitive.

\[ EC_a \preceq EC_b \wedge EC_b \preceq EC_c \Rightarrow EC_a \preceq EC_c \]

Proof A.8.8. The proof follows directly from Lemma A.8.7.

A.9 Φ is monotone

The $\Phi$ function maintains the order between two exception clauses or exception declarations when the same types are filtered from both. It also maintains the order when the smaller clause or declaration is filtered with stronger arguments (allowing fewer types to be propagated and blocking more types).


Lemma A.9.1

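\[
\begin{array}{c}
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge \Phi((P_a, B_a), E, \emptyset) \preceq (P_b, B_b) \\
\Updownarrow \\
\Phi(\Phi((P_a, B_a), P_c, B_c), E, \emptyset) \preceq \Phi((P_b, B_b), P_d, B_d)
\end{array}
\]

Proof A.9.1.

\[
\begin{array}{c}
\Phi(\Phi((P_a, B_a), P_c, B_c), E, \emptyset) \preceq \Phi((P_b, B_b), P_d, B_d) \\
\Updownarrow \quad \text{(definition of $\Phi$)} \\
(P_a \sqcap (P_c \sqcap E), B_a \sqcup B_c) \preceq (P_b \sqcap P_d, B_b \sqcup B_d) \\
\Updownarrow \quad \text{(definition of $\preceq$)} \\
(((P_a \sqcap E) \sqcap P_c) - (B_a \sqcup B_c)) \sqsubseteq ((P_b \sqcap P_d) - (B_b \sqcup B_d)) \\
\Uparrow \quad \text{(Lemma A.3.1)} \\
((P_a \sqcap E) - B_a) \sqsubseteq (P_b - B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Updownarrow \quad \text{(definition of $\preceq$)} \\
(P_a \sqcap E, B_a) \preceq (P_b, B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d)
\end{array}
\]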


Direct compatibility

Lemma A.9.2

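\[
\begin{array}{c}
\Phi(anchor_a, E, \emptyset) \preceq anchor_b \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Downarrow \\
\Phi(\Phi(anchor_a, P_c, B_c), E, \emptyset) \preceq \Phi(anchor_b, P_d, B_d)
\end{array}
\]

Proof A.9.2. Let $anchor_a = \text{like } t_a.m_a(args_a) \trianglelefteq P_a \ntrianglelefteq B_a$ and $anchor_b = \text{like } t_b.m_b(args_b) \trianglelefteq P_b \ntrianglelefteq B_b$.

\[
\begin{array}{c}
(\text{like } t_a.m_a(args_a) \trianglelefteq (P_a \sqcap E) \ntrianglelefteq B_a \preceq \text{like } t_b.m_b(args_b) \trianglelefteq P_b \ntrianglelefteq B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Updownarrow \quad \text{(definition of $\preceq$)} \\
t_a.m_a(args_a) \preceq t_b.m_b(args_b) \wedge ((P_a \sqcap E) - B_a) \sqsubseteq (P_b - B_b) \wedge (P_c - B_c) \sqsubseteq (P_d - B_d) \\
\Downarrow \quad \text{(Lemma A.3.1)} \\
t_a.m_a(args_a) \preceq t_b.m_b(args_b) \wedge (((P_a \sqcap E) \sqcap P_c) - (B_a \sqcup B_c)) \sqsubseteq ((P_b \sqcap P_d) - (B_b \sqcup B_d)) \\
\Updownarrow \quad \text{(definitions of $\preceq$ and $\sqcup$)} \\
\text{like } t_a.m_a(args_a) \trianglelefteq (P_a \sqcap E \sqcap P_c) \ntrianglelefteq (B_a \sqcup \emptyset \sqcup B_c) \preceq \text{like } t_b.m_b(args_b) \trianglelefteq (P_b \sqcap P_d) \ntrianglelefteq (B_b \sqcup B_d) \\
\Updownarrow \quad \text{(definition of $\Phi$)} \\
\Phi(\Phi(anchor_a, P_c, B_c), E, \emptyset) \preceq \Phi(anchor_b, P_d, B_d)
\end{array}
\]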


Compatibility After Expansion

Lemma A.9.3

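\[
\begin{array}{c}
(P_a - B_a) \sqsubseteq (P_b - B_b) \wedge \\
\kappa(anchor_a, EC_b) \notin trace \Rightarrow \kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_b \\
\Downarrow \\
\kappa(anchor_a, EC_b) \notin trace \Rightarrow \kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_a, B_a)), E, \emptyset) \preceq \Phi(EC_b, P_b, B_b)
\end{array}
\]

Proof A.9.3. We prove this using induction on Lemma A.9.4. We expand the anchored exception declaration one level and assume that Lemma A.9.4 holds for the resulting exception clause and $EC_b$. The exception clause resulting from the expansion is the exception clause of the method referenced by $anchor_a$, or one of its submethods, with context information inserted. Because we keep a trace, the recursion must end in methods of which the exception clauses contain no anchored exception declarations, or when $\kappa(anchor_a, EC_b) \in trace$.

If $\kappa(anchor_a, EC_b) \in trace$, the lemma is trivially true. We now prove the lemma for $\kappa(anchor_a, EC_b) \notin trace$.

The preconditions of Lemma A.9.4 follow directly from the preconditions of this lemma.

\[
\begin{array}{c}
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_b \\
\Downarrow \quad \text{(induction on Lemma A.9.4)} \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Phi(\Upsilon(anchor_a), E, \emptyset), P_a, B_a) \preceq \Phi(EC_b, P_b, B_b) \\
\Downarrow \quad \text{(induction on Lemma A.8.7)} \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_a, B_a)), E, \emptyset) \preceq \Phi(\Phi(\Upsilon(anchor_a), E, \emptyset), P_a, B_a) \\
\Downarrow \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_a, B_a)), E, \emptyset) \preceq \Phi(EC_b, P_b, B_b)
\end{array}
\]

As explained in the proof of Lemma A.8.7, the transitivity property of $\preceq$ is indirectly based on this lemma. Because we add $anchor_a$ to the trace, and stop if we receive it again, the induction must end. On this side, it will end in either Lemma A.9.1 or A.9.2. Now we only need to prove the left-hand side of the last implication.

\[
\begin{array}{c}
anchor_a = \text{like } t.m(args) \trianglelefteq P \ntrianglelefteq B \\
\Downarrow \\
\Phi(anchor_a, P_a, B_a) = \text{like } t.m(args) \trianglelefteq (P \sqcap P_a) \ntrianglelefteq (B \sqcup B_a)
\end{array}
\]

\[
\begin{array}{c}
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_a, B_a)), E, \emptyset) \preceq \Phi(\Phi(\Upsilon(anchor_a), E, \emptyset), P_a, B_a) \\
\Updownarrow \quad \text{(definition of $\Upsilon$ and $\Phi$)} \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Omega(\Phi(\varepsilon(\Phi(anchor_a, P_a, B_a)), (P \sqcap P_a), (B \sqcup B_a)), t, args), E, \emptyset) \preceq \Phi(\Omega(\Phi(\varepsilon(anchor_a), P, B), t, args), (P_a \sqcap E), B_a)
\end{array}
\]

Because $\Phi$ does not alter the method expression, it does not have any effect on the $\varepsilon$ function.

\[
\begin{array}{c}
\Updownarrow \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Omega(\Phi(\varepsilon(anchor_a), (P \sqcap P_a), (B \sqcup B_a)), t, args), E, \emptyset) \preceq \Phi(\Omega(\Phi(\varepsilon(anchor_a), P, B), t, args), (P_a \sqcap E), B_a) \\
\Updownarrow \quad \text{(Lemma A.4.1)} \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Omega(\Phi(\Phi(\varepsilon(anchor_a), (P \sqcap P_a), (B \sqcup B_a)), E, \emptyset), t, args) \preceq \Omega(\Phi(\Phi(\varepsilon(anchor_a), P, B), (P_a \sqcap E), B_a), t, args) \\
\Updownarrow \quad \text{(definitions of $\Phi$, $\sqcup$ and $\sqcap$)} \\
\kappa(anchor_a, EC_b) \cup trace \vdash \Omega(\Phi(\varepsilon(anchor_a), (P \sqcap P_a \sqcap E), (B \sqcup B_a)), t, args) \preceq \Omega(\Phi(\varepsilon(anchor_a), (P \sqcap P_a \sqcap E), (B \sqcup B_a)), t, args) \\
\Updownarrow \quad \text{(definition of $\vdash$)} \\
true
\end{array}
\]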


Lemma A.9.4

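\[
\begin{array}{c}
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge trace \vdash EC_a \preceq EC_b \\
\Downarrow \\
trace \vdash \Phi(EC_a, P_c, B_c) \preceq \Phi(EC_b, P_d, B_d)
\end{array}
\]

Proof A.9.4.

\[
\begin{array}{c}
trace \vdash \Phi(EC_a, P_c, B_c) \preceq \Phi(EC_b, P_d, B_d) \\
\Updownarrow \quad \text{(definition of $\preceq$)} \\
\forall \Phi((P_a, B_a), P_c, B_c) \in \Phi(EC_a, P_c, B_c), \forall E, \omega(\Phi((P_a, B_a), P_c, B_c), E) : \\
\exists \Phi((P_b, B_b), P_d, B_d) \in \Phi(EC_b, P_d, B_d) : \Phi(\Phi((P_a, B_a), P_c, B_c), E, \emptyset) \preceq \Phi((P_b, B_b), P_d, B_d) \\
\wedge \\
\forall \Phi(anchor_a, P_c, B_c) \in \Phi(EC_a, P_c, B_c), \forall E : \omega(\Phi(anchor_a, P_c, B_c), E) : \\
(\exists \Phi(anchor_b, P_d, B_d) \in \Phi(EC_b, P_d, B_d) : (\Phi(\Phi(anchor_a, P_c, B_c), E, \emptyset) \preceq \Phi(anchor_b, P_d, B_d) \vee \\
\kappa(anchor_a, EC_b) \notin trace \Rightarrow \kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_c, B_c)), E, \emptyset) \preceq \Phi(EC_b, P_d, B_d))) \\
\Uparrow \quad \text{($trace \vdash EC_a \preceq EC_b$)} \\
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge \Phi(ABS_a, E, \emptyset) \preceq ABS_b \\
\Downarrow \\
\Phi(\Phi(ABS_a, P_c, B_c), E, \emptyset) \preceq \Phi(ABS_b, P_d, B_d) \\
\wedge \\
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge \Phi(anchor_a, E, \emptyset) \preceq anchor_b \\
\Downarrow \\
\Phi(\Phi(anchor_a, P_c, B_c), E, \emptyset) \preceq \Phi(anchor_b, P_d, B_d) \\
\wedge \\
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge \kappa(anchor_a, EC_b) \notin trace \Rightarrow \kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(anchor_a), E, \emptyset) \preceq EC_b \\
\Downarrow \\
\kappa(anchor_a, EC_b) \notin trace \Rightarrow \kappa(anchor_a, EC_b) \cup trace \vdash \Phi(\Upsilon(\Phi(anchor_a, P_c, B_c)), E, \emptyset) \preceq \Phi(EC_b, P_d, B_d) \\
\Updownarrow \\
\text{Lemma A.9.1} \wedge \text{Lemma A.9.2} \wedge \text{Lemma A.9.3}
\end{array}
\]

Theorem A.9.5

\[
\begin{array}{c}
(P_c - B_c) \sqsubseteq (P_d - B_d) \wedge EC_a \preceq EC_b \\
\Downarrow \\
\Phi(EC_a, P_c, B_c) \preceq \Phi(EC_b, P_d, B_d)
\end{array}
\]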



A.10 Ω is monotone

In this section we prove the same property for the Ω function.


Lemma A.10.1

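\[
\begin{array}{c}
\Phi((P_a, B_a), E, \emptyset) \preceq (P_b, B_b) \\
\Downarrow \\
\Phi(\Omega((P_a, B_a), pre_a, a_1 \ldots a_n), E, \emptyset) \preceq \Omega((P_b, B_b), pre_b, b_1 \ldots b_n)
\end{array}
\]

Proof A.10.1.

\[
\begin{array}{c}
\Phi(\Omega((P_a, B_a), pre_a, a_1 \ldots a_n), E, \emptyset) \preceq \Omega((P_b, B_b), pre_b, b_1 \ldots b_n) \\
\Updownarrow \quad \text{(definition of $\Omega$)} \\
\Phi((P_a, B_a), E, \emptyset) \preceq (P_b, B_b)
\end{array}
\]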

A.10.2 Method Expressions

Lemma A.10.2

expra exprb ∧ prea preb ∧ argsa argsb∧okΩ(argsa, (prea, this(expra))) ∧ okΩ(argsb, (preb, this(exprb)))

⇓Ω(expra, prea, argsa) Ω(exprb, preb, argsb)

Proof A.10.2.Let argsa = a1 . . . an and argsb = b1 . . . bn.

1. thisΩ(thisa, prea, a1 . . . an) Ω(thisb, preb, b1 . . . bn)

m (definition of Ω)prea preb

2. type

Ω(typea, prea, a1 . . . an) Ω(typeb, preb, b1 . . . bn)m (definition of Ω)

typea typeb

3. formal

Ω(formala, prea, (va,1, pa,1) . . . (va,n, pa,n)) Ω(formalb, preb, (vb,1, pb,1) . . . (vb,n, pb,n))

A.10 Ω is monotone 195

(a) formala = pa,i

Because of the definition of the relation and the given assumptions, formalb = pb,i.

va,i vb,i

(b) formala ≠ pa,i

Because of the definition of the relation and the given assumptions, formalb ≠ pb,i.

formala formalb

4. new C(args)

Ω(new C(argsa), prea, a1 . . . an) Ω(new C(argsb), preb, b1 . . . bn)m (definition of Ω)

Ω(argsa, prea, a1 . . . an) Ω(argsb, preb, b1 . . . bn)m (induction on finite expression tree)

true

5. t.var

Ω(ta.vara, prea, a1 . . . an) Ω(tb.varb, preb, b1 . . . bn)m (definition of Ω)

Ω(ta, prea, a1 . . . an).vara Ω(tb, preb, b1 . . . bn).varb

m (definition of )Ω(ta, prea, a1 . . . an) Ω(tb, preb, b1 . . . bn) ∧ vara varb

m (induction on finite expression tree)true

6. t.m(args)

Ω(ta.m(arga,1, . . . , arga,n), prea, a1 . . . an) Ω(tb.m(argb,1, . . . , argb,n), preb, b1 . . . bn)

m (definition of Ω)Ω(ta, prea, a1 . . . an).m(Ω(arga,1, prea, a1 . . . an), . . .

, Ω(arga,n, prea, a1 . . . an)) Ω(tb, preb, b1 . . . bn).m(Ω(argb,1, preb, b1 . . . bn), . . .

, Ω(argb,n, preb, b1 . . . bn))m (definition of )

Ω(ta, prea, a1 . . . an) Ω(tb, preb, b1 . . . bn)∧Ω(arga,1, prea, a1 . . . an) Ω(argb,1, preb, b1 . . . bn) ∧ . . .∧

Ω(arga,n, prea, a1 . . . an) Ω(argb,n, preb, b1 . . . bn)m (induction on finite expression tree)

true



Direct Compatibility

Lemma A.10.3

prea preb ∧ argsa argsb ∧ Φ(anchora, E, ∅) anchorb

okΩ(argsa, (prea, this(anchora))) ∧ okΩ(argsb, (preb, this(anchorb)))⇓

Φ(Ω(anchora, prea, argsa), E, ∅) Ω(anchorb, preb, argsb)

Proof A.10.3. Let anchora = like ta.ma(a1, . . . , an) ⊴ Pa ⋬ Ba, and let anchorb = like tb.mb(b1, . . . , bn) ⊴ Pb ⋬ Bb.

Φ(Ω(anchora, prea, argsa), E, ∅) Ω(anchorb, preb, argsb)m

like Ω(ta.ma(a1 . . . an), prea, argsa) E (Pa ⊓ E) 6E Ba like Ω(tb.ma(b1 . . . bn), preb, argsb) E Pb 6E Bb

m (definition of )Ω(ta.ma(a1 . . . an), prea, argsa) Ω(tb.mb(b1 . . . bn), preb, argsb) ∧

((Pa ⊓ E) − Ba) ⊑ (Pb − Bb)⇑ (Lemma A.10.2 and preconditions)

((Pa ⊓ E) − Ba) ⊑ (Pb − Bb)m (Φ(anchora, E, ∅) anchorb)

true

Compatibility After Expansion

Lemma A.10.4 Let anchor = like t.m(arg1, . . . , argm).

prea preb ∧ argsa argsb∧(

κ(anchor, ECb) 6∈ trace ⇒κ(anchor, ECb) ∪ trace ⊢ Φ(Υ(anchor), E, ∅) ECb

)

∧

okΩ(argsa, (prea, this(anchor))) ∧ okΩ(argsb, (preb, this(ECb)))

⇓

κ(anchor, ECb) 6∈ trace ⇒κ(anchor, ECb) ∪ trace ⊢

Φ(Υ(Ω(anchor, prea, argsa)), E, ∅) Ω(ECb, preb, argsb)

Proof A.10.4. We prove the lemma using induction on Lemmas A.9.4 and A.10.5. We expand the anchored exception declaration one level, or go to the exception clause of a submethod of the method referenced by anchor, and assume that the lemmas hold for the resulting exception clause and ECb. Because we add anchor to the trace and stop if we encounter it again, and because these other lemmas themselves only perform further expansions, this induction must end in methods whose exception clauses contain no anchored exception declarations or in the trivial case where anchor is already processed.

If κ(anchor, ECb) ∈ trace, the lemma is trivially true. We now prove the lemma for κ(anchor, ECb) ∉ trace.

Before we perform the induction on Υ(anchor) and ECb, we need to verify that the preconditions of Lemma A.10.5 are satisfied. The first three preconditions follow directly from the preconditions of this lemma. The fourth precondition is satisfied because the type of this in Υ(anchor) is the same as the type of this in anchor. The last preconditions follow directly from the preconditions of this lemma.

κ(anchor, ECb) ∪ trace ⊢ Φ(Υ(anchor), E, ∅) ECb

⇓ (induction on Theorem A.10.5)κ(anchor, ECb) ∪ trace ⊢

Ω(Φ(Υ(anchor), E, ∅), prea, argsa) Ω(ECb, preb, argsb)⇓ (induction on Theorem A.8.8)

κ(anchor, ECb) ∪ trace ⊢Φ(Υ(Ω(anchor, prea, argsa)), E, ∅) Ω(Φ(Υ(anchor), E, ∅), prea, argsa)

⇓κ(anchor, ECb) ∪ trace ⊢


As explained in the proof of Lemma A.8.7, the transitivity property of is indirectly based on this lemma. Because of the expansion done in this lemma and because we keep a trace, the induction must end. On this side, it will end in either Lemma A.10.1 or A.10.3. Now we only need to prove the left-hand side of the last implication. From the definitions of Ω and Ω, we know that:

Ω(anchor, prea, argsa) = like Ω(t, prea, argsa).m(Ω(arg1, prea, argsa), . . . , Ω(argm, prea, argsa)) ⊴ P ⋬ B

The actual arguments arg1, . . . , argm are bound respectively to formal parameters par1, . . . , parm. As a result, we can prove the induction step as follows:

κ(anchor, ECb) ∪ trace ⊢Φ(Υ(Ω(anchor, prea, argsa)), E, ∅) Ω(Φ(Υ(anchor), E, ∅), prea, argsa)

m (definition of Υ)κ(anchor, ECb) ∪ trace ⊢

Φ(Ω(Φ(ε(Ω(anchor, prea, argsa)), P, B), Ω(t, prea, argsa),Ω((arg1, par1), prea, argsa) . . .Ω((argm, parm), prea, argsa)), E, ∅)

Ω(Φ(Ω(Φ(ε(anchor), P, B), t,(arg1, par1) . . . (argm, parm)), E, ∅), prea, argsa)


Because ε(anchor) is the exception clause of a method of the program, it can only reference the formal parameters of its method, being par1, . . . , parm. As a result, Lemma A.4.3 may be applied. The filter operations may be merged due to Lemma A.4.1 and the definition of Φ.

m (Lemma A.4.3)κ(anchor, ECb) ∪ trace ⊢

Ω(Φ(ε(Ω(anchor, prea, argsa)), P ⊓ E, B), Ω(t, prea, argsa),Ω((arg1, par1), prea, argsa) . . . Ω((argm, parm), prea, argsa))

Ω(Φ(ε(anchor), P ⊓ E, B), Ω(t, prea, argsa),Ω((arg1, par1), prea, argsa) . . . Ω((argm, parm), prea, argsa))

Because of Lemma A.6.1, Lemma A.10.2, and the preconditions of this lemma, we know that:

method(Ω(anchor, prea, argsa)) <: method(anchor)

As a result, we know that according to rule 2:

ε(Ω(anchor, prea, argsa)) ε(anchor)⇓ (Lemma A.8.4)

κ(anchor, ECb) ∪ trace ⊢ε(Ω(anchor, prea, argsa)) ε(anchor)

Now we use induction on Lemmas A.9.4 and A.10.5 to prove the induction step. All that is left is proving that their preconditions are satisfied.

1. For the application of Φ, the preconditions of Theorem A.9.5 are met because both sides use the same sets of types and the relation above.

2. For the application of Ω, the first precondition of Lemma A.10.5 follows from the application of Lemma A.9.4. The second and third preconditions are satisfied because the prefixes and actual arguments are identical. The last preconditions are satisfied because of Lemmas A.6.1 and A.10.2.


Theorem A.10.5

trace ⊢ ECa ECb ∧ prea preb ∧ argsa argsb ∧ okΩ(argsa, (prea, this(ECa))) ∧ okΩ(argsb, (preb, this(ECb)))

⇓trace ⊢ Ω(ECa, prea, argsa) Ω(ECb, preb, argsb)


Proof A.10.5. The proof of this lemma is similar to that of Lemma A.9.4. After rewriting the expression trace ⊢ Ω(ECa, prea, argsa) Ω(ECb, preb, argsb), we obtain the following conditions for this lemma to be true:

Φ(ABSa, E, ∅) ABSb

⇓Φ(Ω(ABSa, prea, argsa), E, ∅) Ω(ABSb, preb, argsb)

∧

prea preb ∧ argsa argsb ∧ Φ(anchora, E, ∅) anchorb

∧ okΩ(argsa, (prea, this(anchora))) ∧ okΩ(argsb, (preb, this(anchorb))) ⇓

Φ(Ω(anchora, prea, argsa), E, ∅) Ω(anchorb, preb, argsb)

∧

prea preb ∧ argsa argsb∧(

κ(anchor, ECb) 6∈ trace ⇒κ(anchor, ECb) ∪ trace ⊢ Φ(Υ(anchor), E, ∅) ECb

)

∧

okΩ(argsa, (prea, this(anchor))) ∧ okΩ(argsb, (preb, this(ECb)))

⇓

κ(anchor, ECb) 6∈ trace ⇒κ(anchor, ECb) ∪ trace ⊢


mLemma A.10.1 ∧ Lemma A.10.3 ∧ Lemma A.10.4

Theorem A.10.6

ECa ECb ∧ prea preb ∧ argsa argsb ∧ okΩ(argsa, (prea, this(ECa))) ∧ okΩ(argsb, (preb, this(ECb)))

⇓Ω(ECa, prea, argsa) Ω(ECb, preb, argsb)


A.11 The Implementation Exception Clause is an Upper Bound

Theorem A.11.1 The implementation exception clause of a non-abstract method is an upper bound for the exceptional behaviour of the implementation of that method.

Proof A.11.1. This theorem follows directly from the definition of the implementation exception clause and the Java Language Specification.


A.12 Method Invocations Maintain Compatibility

Theorem A.12.1 Let t.m(arg1, . . . , argn) be a method invocation in a valid program, let ECb = ε(t.m(arg1, . . . , argn)) and let pari be the formal parameter corresponding to argi.

ECa ECb ∧ Γ(this(ECb)) = Γ(this(ECa))

⇓Ω(ECa, t, (arg1, par1) . . . (argn, parn)) Υ(t.m(args))

Proof A.12.1. The requirements for substitution in ECa are satisfied because the program is valid. Since ECa ECb, they are also valid if the type of this is the same. For formal parameters, the type must be invariant.

Ω(ECa, t, args) Υ(t.m(args))m

Ω(ECa, t, args) Ω(Φ(ECb,⊤, ∅), t, args)

mΩ(ECa, t, args) Ω(ECb, t, args)

Because ECa ECb, it suffices to prove that the preconditions of Theorem A.10.6 are satisfied. The preconditions all follow directly from the preconditions of this theorem and the fact that ECb = ε(t.m(arg1, . . . , argn)).

A.13 The relation implies the ω relation

In this section, we prove that when the relation holds between two exception clauses, the left-hand side cannot signal an exception that is not allowed by the right-hand side.

First, we define the τ function, which replaces the target and arguments of an anchored exception declaration by their static type. The definition of τ is shown in Figure A.2.
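As an informal illustration of τ, the following Python sketch replaces the target and the arguments of a method expression by their static types. The representation of an anchor as a (target, method, arguments) tuple, the table STATIC_TYPES standing in for Γ, and all variable and type names are assumptions made for this sketch; they are not part of the formal model.

```python
# Hypothetical typing environment standing in for Γ: expression -> static type.
STATIC_TYPES = {"in": "Reader", "backup": "Reader", "buf": "Buffer"}


def tau(anchor):
    """Replace the target and arguments of a method expression by their static types."""
    target, method, args = anchor
    return (STATIC_TYPES[target], method, tuple(STATIC_TYPES[a] for a in args))


# Two different method expressions with the same static types ...
first = ("in", "read", ("buf",))
second = ("backup", "read", ("buf",))
# ... are mapped to the same trace element.
trace = {tau(first), tau(second)}
```

In this sketch both anchors collapse to a single trace element. This is the intuition behind using τ in a trace: the number of distinct elements is bounded by the types and methods of the program rather than by the number of expressions.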


Lemma A.13.1

Φ((Pa, Ba), E, ∅) (Pb, Bb)⇓

ω((Pa, Ba), E) ⇒ ω((Pb, Bb), E, trace)


τ(like t.m(a1, . . . , an) ⊴ P ⋬ B) = Γ(t).m(Γ(a1), . . . , Γ(an))

τ(t1.m1(a1,1, . . . , a1,n1), . . . , τ(tn.mn(a1,n, . . . , a1,nn) = τ(Γ(t1).m1(Γ(a1,1), . . . , Γ(a1,n1)), . . . , τ(Γ(tn).mn(Γ(a1,n), . . . , Γ(a1,nn))

Figure A.2: Definition of τ .

Proof A.13.1.

Φ((Pa, Ba), E, ∅) (Pb, Bb)

⇓ (definition of Φ and )

((Pa ⊓ E) − Ba) ⊑ (Pb − Bb)

⇓ (definition of ⊆)

E ⊴ ((Pa ⊓ E) − Ba) ⇒ E ⊴ (Pb − Bb)

⇓ (definition of ⊴ and ⊓)

E ⊴ (Pa − Ba) ⇒ E ⊴ (Pb − Bb)

⇓ (definition of ω)

ω((Pa, Ba), E) ⇒ ω((Pb, Bb), E)

⇓ (definition of ω)

ω((Pa, Ba), E) ⇒ ω((Pb, Bb), E, trace)


Lemma A.13.2

ω(EC, E, trace1) ∧ trace2 ⊆ trace1

⇓ω(EC, E, trace2)

Proof A.13.2. If trace2 is a superset of trace1, its analysis will return false while the analysis for trace1 might give true. The reverse can never happen.

Lemma A.13.3

ω(Φ(Υ(anchor), E, ∅), trace)m

ω(Φ(Υ(anchor), E, ∅), τ(anchor) ∪ trace)

Proof A.13.3. Because the method referenced by the anchored exception declaration is analyzed anyway in both cases, it does not matter that in the relation without a trace, the analysis is done a second time (after which it is in the trace and will not be analyzed again).
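The role of the trace in Lemmas A.13.2 and A.13.3 can be illustrated with a small Python sketch. It is not the formal definition of ω: the encoding of exception clauses as lists of tagged pairs, the names HIERARCHY, CLAUSES, is_subtype, and omega, and the simplification that anchors carry neither filters nor arguments are all assumptions made for illustration.

```python
# Hypothetical exception hierarchy: type -> direct supertype (None marks the root).
HIERARCHY = {
    "Throwable": None,
    "IOException": "Throwable",
    "FileNotFoundException": "IOException",
    "SQLException": "Throwable",
}

# Hypothetical program: method name -> exception clause.
# ("abs", T) is an absolute declaration that allows T and its subtypes;
# ("like", m) is an anchored declaration that refers to method m.
CLAUSES = {
    "read": [("abs", "IOException")],
    "copy": [("like", "read"), ("like", "copy")],  # anchored on itself
    "ping": [("like", "pong")],
    "pong": [("like", "ping")],  # mutual recursion, nothing absolute
}


def is_subtype(sub, sup):
    """Walk up the hierarchy from sub until sup or the root is reached."""
    while sub is not None:
        if sub == sup:
            return True
        sub = HIERARCHY[sub]
    return False


def omega(clause, exc, trace=frozenset()):
    """Return True if the clause allows exc to be signalled.

    An anchor that is already in the trace contributes False, so every
    anchor is expanded at most once along a path and the recursion ends.
    """
    for kind, target in clause:
        if kind == "abs":
            if is_subtype(exc, target):
                return True
        elif target not in trace:
            if omega(CLAUSES[target], exc, trace | {target}):
                return True
    return False
```

In this sketch, omega(CLAUSES["copy"], "FileNotFoundException") holds with an empty trace, while the same query with "read" already in the trace fails. A larger trace can only turn a positive answer into a negative one, which is the direction stated in Lemma A.13.2.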


Lemma A.13.4


⇓(ω(anchora, E, τ(trace)) ⇒ ω(anchorb, E))

Proof A.13.4. If Γ(anchora) ∈ τ(trace), the lemma is trivially true because of the definition of ω. It makes the left-hand side of the bottom implication false, making the bottom implication true, which in turn makes the complete implication true. This will stop the induction loop.


⇓ (Lemma A.6.2)method(anchora) <: method(anchorb)

⇓ (Rule 2)ε(anchora) ε(anchorb)

We will now use Theorems A.9.5 and A.10.6.

Υ(Φ(anchora, E, ∅)) Υ(anchorb)m

Ω(Φ(ε(anchora), (Pa ⊓ E), Ba), ta, argsa) Ω(Φ(ε(anchorb), Pb, Bb), tb, argsb)

The preconditions of Theorems A.9.5 and A.10.6 are satisfied because anchora anchorb, and thus Υ(Φ(anchora, E, ∅)) Υ(anchorb).

Υ(Φ(anchora, E, ∅)) Υ(anchorb)⇓ (LemmaA.8.4)

trace ∪ τ(anchora) ⊢ Υ(Φ(anchora, E, ∅)) Υ(anchorb)⇓ (Induction on Lemma A.13.5)

ω(Υ(Φ(anchora, E, ∅)), E, τ(anchora) ∪ τ(trace)) ⇒ω(Υ(anchorb), E)⇓ (LemmaA.13.3)

ω(Υ(Φ(anchora, E, ∅)), E, τ(trace)) ⇒ ω(Υ(anchorb), E)⇓ (definition of ω)

ω(Φ(anchora, E, ∅), E, τ(trace)) ⇒ ω(anchorb, E)m (Lemma A.5.1)

ω(anchora, E, τ(trace)) ⇒ ω(anchorb, E)

For the induction, we perform a one-level expansion. Because we put τ(anchorb), which substitutes the target and arguments of the method expression by their types, in the trace, this induction will always end. The base cases are exception clauses that only contain absolute exception declarations. For such exception clauses, Theorem A.13.5 is proven by Lemma A.13.1.



Theorem A.13.5

trace ⊢ ECa ECb ⇒ (ω(ECa, E, τ(trace)) ⇒ ω(ECb, E))

Proof A.13.5.

trace ⊢ ECa ECb

m (definition of )

(∀(Pa, Ba)∈ ECa, ∀E, ω((Pa, Ba), E) :∃(Pb, Bb) ∈ ECb : Φ((Pa, Ba), E, ∅) (Pb, Bb)) ∧

(∀AEDa ∈ECa, ∀E, ω(AEDa, E) : ∃AEDb ∈ ECb :Φ(AEDa, E, ∅) AEDb ∨κ(AEDa, ECb) 6∈ trace ⇒

κ(AEDa, ECb) ∪ trace ⊢ (Φ(Υ(AEDa), E, ∅) ECb))

As a result, for every E, we can find ABSb,xi and AEDb,yi such that:

⇓ (Lemmas A.13.1 and A.13.4 and A.5.1)

ω(ABSa,1, E, τ(trace)) ⇒ ω(ABSb,x1, E) ∧

. . . ∧ω(ABSa,n, E, τ(trace)) ⇒ ω(ABSb,xn

, E) ∧

(ω(AEDa,1, τ(trace)) ⇒ ω(AEDb,y1, E) ∨

κ(AEDa,1, ECb) 6∈ trace ⇒κ(AEDa,1, ECb) ∪ trace ⊢ (Φ(Υ(AEDa,1), E, ∅) ECb))

∧

. . . ∧

(ω(AEDa,m, τ(trace)) ⇒ ω(AEDb,ym, E) ∨

κ(AEDa,m, ECb) 6∈ trace ⇒κ(AEDa,m, ECb) ∪ trace ⊢ (Φ(Υ(AEDa,m), E, ∅) ECb))

If κ(AEDa,i, ECb) ∈ trace, the proof ends for that anchored exception declaration. If this is the case, τ(AEDa,i) ∈ τ(trace), which means ω(AEDa,i, E, τ(trace)) = false. If that is not the case, we use induction. As such, we assume κ(AEDa,i, ECb) ∉ trace for the remainder of the proof.


⇓ (Lemma A.5.1 and induction on Theorem A.13.6)

ω(ABSa,1, E, τ(trace)) ⇒ ω(ABSb,x1, E) ∧

. . . ∧ω(ABSa,n, E, τ(trace)) ⇒ ω(ABSb,xn

, E) ∧

(ω(AEDa,1, E, τ(trace)) ⇒ ω(AEDb,y1, E) ∨

ω(Υ(AEDa,1), E, τ(AEDa,1) ∪ τ(trace)) ⇒ω(ECb, E))

∧

. . . ∧

(ω(AEDa,m, E, τ(trace)) ⇒ ω(AEDb,ym, E) ∨

ω(Υ(AEDa,m), E, τ(AEDa,1) ∪ τ(trace)) ⇒ω(ECb, E))

⇓ (Definition of ω and Lemma A.13.3)

ω(ABSa,1, E) ⇒ ω(ABSb,x1, E, τ(trace)) ∧

. . . ∧ω(ABSa,n, E) ⇒ ω(ABSb,xn

, E, τ(trace)) ∧(

(ω(AEDa,1, E) ⇒ ω(AEDb,y1, E, τ(trace)) ∨

ω(AEDa,1, E) ⇒ ω(ECb, E, τ(trace)))

)

∧

. . . ∧(

(ω(AEDa,m, E) ⇒ ω(AEDb,ym, E, τ(trace)) ∨

ω(AEDa,m, E) ⇒ ω(ECb, E, τ(trace)))

)

⇓ (definition of ω)ω(ECa, E) ⇒ ω(ECb, E, τ(trace))

Theorem A.13.6

ECa ECb ⇒ (ω(ECa, E) ⇒ ω(ECb, E))

Proof A.13.6. The proof follows directly from Theorem A.13.5.

A.14 Expansion Does Not Allow More Than the Exception Clause

In this section, we prove that the exception clause resulting from the expansion of a method invocation does not allow more exceptions to be signalled than the exception clause of the invoked method. This property is important from a methodological point of view. If it were allowed, a method invocation could be allowed to signal a checked exception that could not have been foreseen by looking only at the exception clause of the method. This is very confusing for a programmer. For example, the expansion function could simply return throws Throwable. This would not compromise compile-time safety, but it would make anchored exception declarations useless.

Lemma A.14.1

ω(Φ((P, B), Pn, Bn), E) ⇒ ω((P, B), E)

Proof A.14.1.

ω(Φ((P, B), Pn, Bn), E)

⇒ ω((P ⊓ Pn, B ⊔ Bn), E)

⇒ E ⊴ ((P ⊓ Pn) − (B ⊔ Bn))

⇒ E ⊴ P ∧ E ⊴ Pn ∧ E ⋬ B ∧ E ⋬ Bn

⇒ E ⊴ P ∧ E ⋬ B

⇒ ω((P, B), E)
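The chain of implications in Proof A.14.1 can be checked on a small example. The following Python sketch assumes that an absolute exception declaration (P, B) is encoded as a pair of sets: a set of upper bounds that the exception must all conform to (so that ⊓ becomes set union of bounds) and a set of blocked types (so that ⊔ is set union). This encoding, the hierarchy, and the names allows and filter_decl are assumptions for illustration only.

```python
# Hypothetical exception hierarchy: type -> direct supertype (None marks the root).
HIERARCHY = {
    "Throwable": None,
    "IOException": "Throwable",
    "FileNotFoundException": "IOException",
    "EOFException": "IOException",
    "SQLException": "Throwable",
}


def is_subtype(sub, sup):
    """Walk up the hierarchy from sub until sup or the root is reached."""
    while sub is not None:
        if sub == sup:
            return True
        sub = HIERARCHY[sub]
    return False


def allows(decl, exc):
    """Sketch of omega for (P, B): exc conforms to every bound in P and to no type in B."""
    propagated, blocked = decl
    return (all(is_subtype(exc, p) for p in propagated)
            and not any(is_subtype(exc, b) for b in blocked))


def filter_decl(decl, p_new, b_new):
    """Sketch of Phi((P, B), Pn, Bn): add Pn to the bounds and Bn to the blocked types."""
    propagated, blocked = decl
    return (propagated | p_new, blocked | b_new)


decl = (frozenset({"IOException"}), frozenset({"FileNotFoundException"}))
filtered = filter_decl(decl, frozenset({"Throwable"}), frozenset({"EOFException"}))
```

For every type in the hypothetical hierarchy, an exception allowed by filtered is also allowed by decl, while the converse fails for EOFException. Filtering can therefore only remove exceptions in this encoding, which mirrors the statement of Lemma A.14.1.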

Lemma A.14.2

ω(Φ(anchor, Pn, Bn), E) ⇒ ω(anchor, E)

Proof A.14.2. Let anchor = like t.m(args) ⊴ P ⋬ B.

ω(Φ(anchor, Pn, Bn), E) ⇒ ω(anchor, E)m

ω(Υ(Φ(anchor, Pn, Bn)), E) ⇒ ω(Υ(anchor), E)m (Φ does not affect the method expression)

ω(Ω(Φ(ε(anchor), P ⊓ Pn, B ⊔ Bn), t, args), E) ⇒ω(Ω(Φ(ε(anchor), P, B), t, args), E)

m (Lemma A.4.1 and definition of Φ)ω(Φ(Ω(Φ(ε(anchor), P, B), t, args), Pn , Bn), E) ⇒

ω(Φ(Ω(Φ(ε(anchor), P, B), t, args), ⊤, ∅), E)

We now use Theorems A.10.6 and A.9.5.

• The first three preconditions of A.10.6 are satisfied because the corresponding elements in the equation above are identical. The last preconditions follow from the fact that the program must be valid and the fact that Φ does not affect the method expression of anchor and thus does not affect the selected method either.

• The preconditions of Theorem A.9.5 are satisfied because (Pn − Bn) ⊑ (⊤ − ∅).


As a result, we know that:

Φ(Ω(Φ(ε(anchor), P, B), t, args), Pn, Bn) Φ(Ω(Φ(ε(anchor), P, B), t, args), ⊤, ∅)

Applying Theorem A.13.6 completes the proof.

Lemma A.14.3

ω(Φ(EC, P, B), E) ⇒ ω(EC, E)

Proof A.14.3.

ω(Φ(EC, P, B), E)

⇕

ω(Φ(ABS1, . . . , ABSn AED1, . . . , AEDm, P, B), E)

⇕ (definition of Φ)

ω(Φ(ABS1, P, B), . . . , Φ(ABSn, P, B) Φ(AED1, P, B), . . . , Φ(AEDm, P, B), E)

⇕ (definition of ω)

ω(Φ(ABS1, P, B), E) ∨ . . . ∨ ω(Φ(ABSn, P, B), E) ∨ ω(Φ(AED1, P, B), E) ∨ . . . ∨ ω(Φ(AEDm, P, B), E)

⇓ (Lemmas A.14.1 and A.14.2)

ω(ABS1, E) ∨ . . . ∨ ω(ABSn, E) ∨ ω(AED1, E) ∨ . . . ∨ ω(AEDm, E)

⇕

ω(EC, E)

Lemma A.14.4

okΩ(args, (pre, this(AED)))⇓

ω(Ω(AED, pre, args), E) ⇒ ω(AED, E)

Proof A.14.4. Let AED = like t.m(a1, . . . , an) ⊴ P ⋬ B.

ω(Ω(AED, pre, args), E) ⇒ ω(AED, E)m (definition of ω)

ω(Υ(Ω(AED, pre, args)), E) ⇒ ω(Υ(AED), E)m (definition of Υ)

ω(Ω(Φ(ε(Ω(AED, pre, args)), P, B), Ω(t, pre, args),Ω(a1, pre, args) . . . Ω(an, pre, args)), E) ⇒

ω(Ω(Φ(ε(AED), P, B), t, args), E)


Because okΩ(args, (pre, env(AED))), we know from Lemmas A.6.1 and A.10.2 that:

Γ(Ω(t, pre, args)) <: Γ(t)∧Γ(Ω(a1, pre, args)) <: Γ(a1) ∧ . . .Γ(Ω(an, pre, args)) <: Γ(an)

As a result, we know that the method selected by Ω(AED, pre, args) will override or be equal to the method selected by AED. This means that rule 2 applies.

ε(Ω(AED, pre, args)) ε(AED)

We now apply Theorems A.9.5 and A.10.6 to the arguments of ω in the implication above.

• The preconditions of Theorem A.9.5 are satisfied because the arguments of Φ are identical.

• The first precondition of Theorem A.10.6 follows from the application of Theorem A.9.5. The second and third preconditions follow from Lemma A.10.2. The last preconditions follow from the preconditions of this lemma, from Lemmas A.6.1 and A.10.2, from the fact that the type of the target of a method invocation must conform to the type of this in the invoked method, from the fact that the type of the actual arguments must conform to that of the invoked method, and from the requirement that types of formal parameters must be invariant.

As a result, we know that:

Ω(Φ(ε(Ω(AED, pre, args)), P, B), Ω(t, pre, args),Ω(a1, pre, args) . . . Ω(an, pre, args))

Ω(Φ(ε(AED), P, B), t, args)

Applying Theorem A.13.6 completes the proof.

Lemma A.14.5

okΩ(args, (pre, env(EC)))⇓

ω(Ω(EC, pre, args), E) ⇒ ω(EC, E)

Proof A.14.5. The proof of this lemma is nearly identical to that of Lemma A.14.3.

Theorem A.14.6

ω(Υ(AED), E) ⇒ ω(ε(AED), E)

Proof A.14.6. This theorem follows directly from Lemmas A.14.5 and A.14.3.


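[Diagram not recoverable from extraction. Left diagram nodes: EC, EC′, IEC, IEC′. Right diagram nodes: ω(EC, E), ω(EC′, E), ω(IEC, E), ω(IEC′, E). Two arrows are labelled Ω.]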

Figure A.3: Schema for final compile-time safety proof.

A.15 Compile-time safety

Now we can finally prove that anchored exception declarations are compile-time safe. For compile-time safety to be violated, there must be at least one method of which the implementation can signal a checked exception under a circumstance that could not have been predicted by the client when inspecting the exception clause of that method. We now show that this is not possible for a program satisfying all rules.

Figure A.3 illustrates the proof. The exception clause of the method is represented by EC, its implementation exception clause by IEC. We know from rule 3 that IEC EC, so Theorem A.12.1 ensures that after insertion of the context information of any call-site, resulting in EC′ and IEC′, IEC′ EC′ holds. Note that at run-time, the available context information is even more specific, but because the same information is inserted in both exception clauses, the relation between IEC′ and EC′ will still hold. Both relations are shown in the left diagram.

Using Theorem A.13.6 and Lemma A.14.5, we can transform the left diagram into the right diagram. Theorem A.13.6 ensures that ω(IEC, E) ⇒ ω(EC, E) and ω(IEC′, E) ⇒ ω(EC′, E). Lemma A.14.5 ensures that ω(EC′, E) ⇒ ω(EC, E) and ω(IEC′, E) ⇒ ω(IEC, E). Both relations are shown in the right diagram.

From these relations, we can conclude that no method invocation can result in a checked exception that was not declared by the exception clause of the invoked method.

Appendix B

Type System for the Component Relation

In this section, we present a part of our formal model. More details can be found in the technical report [vDS06]. Our model is based both on ClassicJava [FKF98] and Featherweight Java [IPW01]. Because our inheritance mechanism supports renaming, the static type of the target is required to determine the invoked method or accessed field. We use the type elaboration of ClassicJava to incorporate that information in the program. The rest of the model is based on Featherweight Java because of its simplicity.

To model the essence of our inheritance mechanism, we added multiple inheritance, separation of subtyping and code inheritance, named inheritance relations, component parameters, indirect inheritance, and simple renaming to the Featherweight Java model. Other elements have been omitted to keep the model simple. In case of a conflict, the conflicting elements must be overridden by a new definition. Because we do not model component classes, non-conformance and feature hiding are not allowed. We assume that all component relations have been given a name, and that classes with component parameters are abstract.

B.1 Syntax

The syntax of the language is shown in Figure B.1. The differences with Featherweight Java are the component parameters α, the two inheritance relations, and the expressions e.i for component references and e@α for invocations on component parameters. The subtyping relation cannot have a name. Variable δ ranges over both component parameters and inheritance names.


P ::= L e

L ::= class C (α) ST SC F K M

α ::= T → C

δ ::= α | i

ST ::= subtype C (δ) [n = o]

SC ::= component C (δ) i [n = o]

M ::= C m(C x) { return e; }

e ::= x | i | e.f | e.i | e@α | e.m(e) | new C(e) | (C)e

Figure B.1: Syntax.

B.2 Type Elaboration

Because methods can be renamed, we must perform type elaboration, as done in [FKF98] for static methods. They combine the elaboration rules and the well-formedness rules. The elaboration rules for our model are almost the same as those for ClassicJava. The only difference is that we also elaborate the static type of the target of a method invocation while in ClassicJava this is done only for instance variables. We do not repeat them here. The only effect is the insertion of the static type of the target of a method invocation or instance variable access.

e.m(args)⇒ e:Γ(e).m(args)

e.f ⇒ e:Γ(e).f

The typing of the non-elaborated program is almost identical to that of the elaborated program except that the actual types are used instead of the static types in rules T − field, T − Comp, T − Comp − Param, and T − invk.

The well-formedness rules are written separately from the elaboration.

B.3 Subtyping and Subclassing

The subtyping and subclassing rules are shown in Figures B.2 and B.3. The subtyping rules come straight from Featherweight Java [IPW01]. The subclassing rules are similar. Note that the second judgment declares that the subtyping relation implies the subclassing relation. The subtyping relation is represented by the <: relation, the subclassing relation by the < relation.
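The two relations can be sketched as recursive checks over a class table. The Python fragment below is only an illustration under stated assumptions: the class table CLASSES, its class names, and the functions is_subtype and is_subclass are hypothetical, and the table is assumed to be free of cycles.

```python
# Hypothetical class table: for each class, the classes named in its
# subtype relations and in its component relations.
CLASSES = {
    "Object": {"subtype": [], "component": []},
    "Shape": {"subtype": ["Object"], "component": []},
    "Circle": {"subtype": ["Shape"], "component": []},
    "Radio": {"subtype": ["Object"], "component": []},
    "Car": {"subtype": ["Object"], "component": ["Radio"]},
    "SportsCar": {"subtype": ["Car"], "component": []},
}


def is_subtype(c, d):
    """C <: D, following the reflexive, transitive, and declared-relation rules of Figure B.2."""
    if c == d:
        return True
    return any(is_subtype(s, d) for s in CLASSES[c]["subtype"])


def is_subclass(c, d):
    """C < D, following Figure B.3: subtyping implies subclassing, and
    component relations contribute additional subclassing edges."""
    if c == d or is_subtype(c, d):
        return True
    parents = CLASSES[c]["subtype"] + CLASSES[c]["component"]
    return any(is_subclass(p, d) for p in parents)
```

In this table Car is a subclass of Radio through its component relation, but not a subtype of it. SportsCar inherits that subclassing link through its subtype relation with Car.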

B.4 Class Well-formedness

------------------ (1)
C <: C

C <: D    D <: E
------------------ (2)
C <: E

class C . . . subtype D . . . . . .
------------------ (3)
C <: D

Figure B.2: Subtyping

------------------ (4)
C < C

C <: D
------------------ (5)
C < D

C < D    D < E
------------------ (6)
C < E

class C . . . component D . . . . . .
------------------ (7)
C < D

Figure B.3: Subclassing

Figure B.4 shows the class well-formedness rules. Rules 8 and 9 ensure that no cycles are present in the inheritance relations. Note that rule 9 has an extra judgement to forbid cycles mixing both kinds of relations. Rule 10 ensures that if component parameters are passed to the same type via different subtyping paths, the values that end up in that type are the same for all such paths. This is similar to the rule for generic parameters in Java, Eiffel, and SmartEiffel. Rule 11 puts all named elements of a class in the same namespace and demands that all names are unique within a class. Rule 13 demands that the actual component parameters of all inheritance relations conform to the corresponding formal component parameters. The conformance relation is defined in rules 14 and 15. Rule 14 defines the conformance relation if the actual parameter is the name of an inheritance relation. The name is valid if the containing type in the constraint of the formal component parameter contains a component with the given name, and that component has the correct type. Rule 15 defines the conformance relation if the actual parameter itself is a formal component parameter. In this case, the component relation passes on the value of a formal component parameter of the reusing class. The formal component parameter used as an actual parameter conforms to the actual component parameter if the containing type of the former is a supertype of the containing type of the latter and the component type of the former is a subtype of the component type of the latter.

These well-formedness rules must be combined with the rules for fields and methods, which are discussed further on.
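As an illustration of rule 15, the following Python sketch checks conformance between two formal component parameters. The class Param, the table SUPERTYPES, and the example types are hypothetical; only the two subtype checks are taken from the rule.

```python
from typing import NamedTuple

# Hypothetical subtype declarations: class -> direct supertypes.
SUPERTYPES = {
    "Object": [],
    "Vehicle": ["Object"],
    "Car": ["Vehicle"],
    "Engine": ["Object"],
    "TurboEngine": ["Engine"],
}


def is_subtype(c, d):
    """C <: D over the hypothetical declarations above."""
    return c == d or any(is_subtype(s, d) for s in SUPERTYPES[c])


class Param(NamedTuple):
    """A component parameter T → C."""
    containing: str  # T, the containing type
    component: str   # C, the component type


def conforms(actual, formal):
    """Rule 15: δ = S → D conforms to α = T → C if D <: C and T <: S."""
    return (is_subtype(actual.component, formal.component)
            and is_subtype(formal.containing, actual.containing))


formal = Param("Car", "Engine")
good = Param("Vehicle", "TurboEngine")  # wider containing type, narrower component type
bad = Param("Car", "Object")            # component type is not a subtype of Engine
```

The check is contravariant in the containing type and covariant in the component type, matching the description of rule 15 above.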

B.5 Components

Figure B.5 shows the lookup rules for components. The component function searches for a component with name i in class C. Rule 17 is trivial, and declares that if the class contains a component relation with the proper name, that relation is returned. In rule 18, the component is searched in the supertypes if it is not defined in the current class. Note that in the recursive call to component the possible renaming of the component relation is taken into account by applying the reverse rename mapping to the inheritance name. Rule 19 contains an additional definition that takes a subtyping relation as an argument. It is used to simplify the notation in Figure B.9.
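The lookup of rules 17 and 18 can be sketched as follows. This is a simplified illustration, not the formal definition: the class table, the encoding of a rename clause as a dictionary from the name in the supertype to the name in the subtype, and the function name component are assumptions, and the sketch returns only the declaring class and the component type.

```python
# Hypothetical class table. Each subtype relation carries a rename map
# {name in supertype: name in subtype}.
CLASSES = {
    "Radio": {"components": {}, "subtypes": []},
    "Car": {"components": {"radio": "Radio"}, "subtypes": []},
    "SportsCar": {"components": {}, "subtypes": [("Car", {"radio": "stereo"})]},
}


def component(name, cls):
    """Return (declaring class, component type) for name in cls, or None."""
    decl = CLASSES[cls]
    if name in decl["components"]:             # rule 17: declared locally
        return cls, decl["components"][name]
    for parent, renames in decl["subtypes"]:   # rule 18: search the supertypes
        # Apply the reverse rename mapping before searching the supertype.
        reverse = {new: old for old, new in renames.items()}
        found = component(reverse.get(name, name), parent)
        if found is not None:
            return found
    return None
```

Looking up stereo in SportsCar is translated back to radio before the search continues in Car, where the component relation is declared.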

Figure B.6 shows the definition of the params function. The params function is used to determine which actual component parameters are passed to a given type S by a given subtyping relation. In rule 20, the search is routed to the target class of the subtyping relation. The actual parameters are passed in the third argument. Before the actual arguments are passed, however, the σ function is used to prevent any change in the meaning of an inheritance name. Because the inheritance name is specified relative to the containing type of a formal component parameter, the meaning of that name might change if it is passed further on to a formal component parameter with a different containing type. This can happen if the component relation is renamed. Therefore, its name is prefixed immediately with the name of


E = class C(α) ST SC F K M
∀ subtype T(δ) [n=p] ∈ ST : ¬T <: C
------------------ (8)
E NO-ST-LOOPS OK

∀ component T(δ) i [n=p] ∈ ST : ¬T <: C
∀ component T(δ) i [n=p] ∈ ST : ¬T < C
------------------ (9)
E NO-CO-LOOPS OK

∀ T, ∀ STa, STb ∈ ST : STa <: T ∧ STb <: T ⇒ params(STa, T) = params(STb, T)
------------------ (10)
E SUBTYPE SAME PARAM OK

∀ X, Y ∈ componentNames(C) ∪ fields(C) ∪ methods(C) : X ≠ Y ⇒ name(X) ≠ name(Y)
------------------ (11)
E NAMESPACE OK

------------------ (12)
componentNames(C) = { i | component(i, C) = X CO }

∀ class T(γ) . . . : ∀ STi = subtype T(δ) . . . : δ ≤ γ ∧ ∀ SCi = component T(ε) . . . : ε ≤ γ
------------------ (13)
E COMP PARAMS OK

α = T → C
component(i, T) = X component Y (η)[u=v] x
δ = i
Y <: C
------------------ (14)
δ ≤ α

α = T → C    δ = S → D    D <: C    T <: S
------------------ (15)
δ ≤ α

component(i, T) overrides component(j, S)
------------------ (16)
T • i ≤ S • j

Figure B.4: Class Well-formedness.


E = class C(α) ST SC F K M
component T (δ) [n=p] i ∈ SC
------------------ (17)
component(i, C) = C component T (δ) [n=p] i

subtype T (ε) [n=p] s ∈ ST
component U (δ) [m=q] i ∉ SC
component([p/n]i, T) = X component Y (η)[u=v] x
------------------ (18)
component(i, C) = X component Y (η)[u=v] x

------------------ (19)
component(i, subtype T (α)[n = p] j) = component([p/n]i, T)

Figure B.5: Lookup of components.

the containing type in rule 24. A special symbol is used because inheritance namescan contain dots in case of nested components. Prefixed inheritance names andformal component parameters are passed on unchanged because of rule 25. In rule22 the search is routed to all the subtyping relation of the class, while replacingthe component parameters of the current class by the provided actual parameters.The search stops either at the root class, returning an empty set, or at rule 23 ifthe type has been found, returning the actual component parameters that havebeen passed to that class.

Figure B.7 shows the rules for overriding components. Rule 26 states that acomponent relation overrides inherited component relations with the same name.Again, renaming is taken into account by applying the reverse rename mappingto the inheritance name before search a component in the supertypes. Rule 27state that the overrides relation is transitive. Rule 28 states that a componentrelation can only override another if its component type is a subtype of the com-ponent type of the overridden component relation. In addition, parameters passedto the overriding component type must conform to the parameters passed to theoverridden component type by the overridden component relation.

Figure B.8 defined when two component relations are related to each other.Rule 29 states that two component relations are related if they both override a thirdcomponent relation. Rule 30 states that a component is related to a component itoverrides. The relation can be made symmetric and reflexive, but the that is notnecessary for the well-formedness rules for components.


T <: S    E = class T(α) ST SC F K M
------------------ (20)
params(subtype T(δ)[n=p], S) = params(E, S, σ(δ, α))

T <: S    E = class T(α) ST SC F K M
------------------ (21)
params(component T(δ)[n=p], S) = params(E, S, σ(δ, α))

S ≠ T
------------------ (22)
params(class T(α) ST . . . , S, δ) = params([δ/α]ST, S)

------------------ (23)
params(class T(α) ST . . . , T, δ) = δ

α = T → C    δ = i
------------------ (24)
σ(δ, α) = T•i

α = T → C    δ ≠ i
------------------ (25)
σ(δ, α) = δ

Figure B.6: Component parameters.


SCj = component c S<γ> [m=n]
STi = subtype T<δ> [o=p]
component([p/o]c, T) = U component V(ε) x [o=r]
------------------ (26)
C SCj overrides U component V(ε) x [q=r]

C F overrides D G    D G overrides E H
------------------ (27)
C F overrides E H

E = class C(α) ST SC F K M
SCi = component S(ε) [o=q] s
∀ X, SCT = component T(δ) [n=p] t : C SCi overrides X SCT ⇒ S <: T ∧ params(SCi, T) ≤ params(SCT, T)
------------------ (28)
C SCi OVERRIDE OK

Figure B.7: Component overriding.


∃ E O : (C M overrides E O) ∧ (D N overrides E O)
------------------ (29)
C M related to D N

C M overrides D N
------------------ (30)
C M related to D N

Figure B.8: Related component relations.


C SC OVERRIDE OK
------------------ (31)
E COMPONENT OVERRIDE OK

∀ n : |{ i | component(n, STi) = ... }| > 1 ⇒ ∃ j : SCj = component T (. . .) [...] n
------------------ (32)
E COMPONENT SELECT OK

∀ i, j, n1, n2 : component(n1, STi) related to component(n2, STj) ⇒ n1 = n2
------------------ (33)
E NO COMPONENT DUPLICATION OK

Figure B.9: Component well-formedness.

B.5.1 Component Well-formedness

Figure B.9 shows the well-formedness rules for component relations. Rule 31 states that a class is well-formed with respect to component overriding if all of its component relations are well-formed. Rule 32 states that if multiple components with the same name are inherited via subtyping relations, a new component must be defined with the same name. Rule 33 states that if two related components are inherited via subtyping relations, their names are identical. This triggers rule 32 to avoid duplication.
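Rule 32 amounts to a counting check. The following Python sketch assumes that the component names inherited through each subtyping relation have already been collected into sets; the function name select_ok and this input encoding are hypothetical.

```python
from collections import Counter


def select_ok(inherited, declared):
    """Sketch of rule 32.

    inherited: one set of component names per subtyping relation.
    declared: names of the component relations declared in the class itself.
    A name inherited through more than one subtyping relation must be
    redefined in the class.
    """
    counts = Counter(name for names in inherited for name in names)
    return all(name in declared for name, n in counts.items() if n > 1)
```

For example, a class that inherits radio through two subtyping relations passes the check only if it declares a component relation named radio itself.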


B.6 Fields

In this section, we discuss how fields are treated in the type system. Remember that fields are properties that can be renamed, overridden, and merged. A field is represented as P F, where F is the definition of the field, and P is its enclosing class. This is needed to determine the origin of a field in case the field is renamed somewhere in the inheritance path. Indirect inheritance is modeled by giving indirectly inherited fields the name inheritanceName.f.

B.6.1 Field Lookup

The fields function returns all fields of a class. Figure B.10 shows the definition of the fields function. Rules 34 and 35 are trivial. Rules 36 and 37 determine which fields are inherited by the subtyping and component relations respectively. The difference between both functions is that inhst incorporates the rule-of-dominance. It ignores definitions that are overridden by a definition inherited via another subtyping relation. In addition, if the same definition is inherited more than once via subtyping with different names, the type rules demand that all versions are given the same name. This makes them syntactically equal, after which the set definition merges them. This is not the case for the component relation, where duplication is the default policy. Rules 38 and 39 determine which fields can be inherited via a specific subtyping or component relation. They take the fields of the inherited class, and apply the renaming τst and τco. These functions are defined in rules 40-43 and 44-45 respectively. The τst function is divided in four parts. In rules 41 and 42, no renaming is done. Rule 40 deals with the renaming of an individual field, while rule 43 deals with renaming as a consequence of renaming a component relation. Note that the latter rules change the parent class of the method to the inheriting class. Rules 44 and 45 do the same for the component relation, but they also replace the this reference by this.inheritanceName.
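The case analysis of τst and τco can be sketched in Python. The sketch assumes that a field is encoded as a (parent class, type, name) triple and that a rename clause n = p is a dictionary from old to new names; these encodings and the function names tau_st and tau_co are assumptions, and the replacement of the this reference is not modeled.

```python
def tau_st(renames, inheriting, field):
    """Renaming of a field inherited via a subtyping relation (rules 40-43)."""
    parent, ftype, name = field
    if name in renames:                        # rule 40: the field itself is renamed
        return (inheriting, ftype, renames[name])
    if "." not in name:                        # rule 41: plain name, no renaming
        return field
    head, tail = name.split(".", 1)
    if head in renames:                        # rule 43: enclosing component renamed
        return (inheriting, ftype, renames[head] + "." + tail)
    return field                               # rule 42: nested name, no renaming


def tau_co(relation, renames, inheriting, field):
    """Renaming of a field inherited via a component relation (rules 44-45)."""
    parent, ftype, name = field
    if name in renames:                        # rule 44: explicit renaming
        return (inheriting, ftype, renames[name])
    return (inheriting, ftype, relation + "." + name)  # rule 45: prefix with the relation name
```

A field that is not renamed keeps its parent class under subtyping, whereas under a component relation it is always re-parented and, by default, prefixed with the name of the component relation.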

Figure B.11 shows the field function, which is used to find a field, given its name and the static (T) and actual (C) types of the target. Rule 46 covers the case where the name of the requested field is in fields(T). Rule 47 covers fields that are inherited directly, but are accessed indirectly. They are not present directly in fields(T), but there is a trail of overrides and same as relations between the requested field and a field in fields(T). Renaming of component relations is taken into account by looking up the actual relation, which may have a different name than head.

The rule for the lookup of the type of a field is trivial, and is shown in rule 48 in Figure B.12.

The overrides relation is shown in Figure B.13. Rule 49 describes the standard overriding relation for subtyping. Rule 50 for the component relation is similar, but it inserts the name of the component relation before the name of the overridden field to distinguish it from other inheritance instances of the same field. In addition,


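$$\dfrac{E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M}{\mathit{fields}(C) = \mathit{fields}(E)} \quad (34)$$

$$\dfrac{E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M}{\mathit{fields}(E) = C\ F \cup \mathit{inh}_{f,st}(E) \cup \mathit{inh}_{f,co}(E)} \quad (35)$$

$$\begin{array}{l} \mathit{inh}_{f,st}(E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M) = \\ \quad \{\, U\ G \mid \neg G \text{ overridden in } E \wedge U\ G \in \mathit{fields}(ST, C) \wedge \nexists\, V\ O \in \mathit{fields}(ST, C) : V\ O \neq U\ G \wedge V\ O \text{ overrides } U\ G \,\} \end{array} \quad (36)$$

$$\mathit{inh}_{f,co}(E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M) = \{\, U\ G \mid \neg G \text{ overridden in } E \wedge U\ G \in \mathit{fields}(SC, C) \,\} \quad (37)$$

$$\dfrac{\mathit{fields}(T) = C\ F}{\mathit{fields}(\mathtt{subtype}\ T(\delta)\ [n{=}p],\ D) = \tau_{st}(n = p,\ D,\ C\ F)} \quad (38)$$

$$\dfrac{\mathit{fields}(T) = C\ F}{\mathit{fields}(\mathtt{component}\ T(\delta)\ i\ [n{=}p],\ D) = \tau_{co}(i,\ n = p,\ D,\ C\ F)} \quad (39)$$

$$\dfrac{f \in n}{\tau_{st}(n = p,\ T,\ C\ D\ f) = T\ D\ [p/n]f} \quad (40)$$

$$\dfrac{f \notin n \qquad . \notin f}{\tau_{st}(n = p,\ T,\ C\ D\ f) = C\ D\ f} \quad (41)$$

$$\dfrac{f \notin n \qquad f = \mathit{head}.\mathit{tail} \qquad . \notin \mathit{head} \qquad \mathit{head} \notin n}{\tau_{st}(n = p,\ T,\ C\ D\ f) = C\ D\ f} \quad (42)$$

$$\dfrac{f \notin n \qquad f = \mathit{head}.\mathit{tail} \qquad . \notin \mathit{head} \qquad \mathit{head} \in n}{\tau_{st}(n = p,\ T,\ C\ D\ f) = T\ D\ ([p/n]\mathit{head}).\mathit{tail}} \quad (43)$$

$$\dfrac{f \in n}{\tau_{co}(i,\ n = p,\ T,\ C\ D\ f) = T\ D\ [p/n]f} \quad (44)$$

$$\dfrac{f \notin n}{\tau_{co}(i,\ n = p,\ T,\ C\ D\ f) = T\ D\ i.f} \quad (45)$$

Figure B.10: Fields of a class.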
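The difference between rules 36 and 37 can be sketched in Python as two set comprehensions. This is a simplified illustration: a definition is assumed to be a pair (origin class, name), the locally overridden names are assumed to be given as a set of names, and the overrides relation is assumed to be given as a set of pairs. None of these encodings come from the formal model.

```python
def inherit_via_subtyping(candidates, overridden_locally, overrides):
    """Rule 36 (sketch): drop locally overridden definitions and dominated ones."""
    return {
        d
        for d in candidates
        if d[1] not in overridden_locally
        # rule-of-dominance: no other inherited definition overrides d
        and not any(o != d and (o, d) in overrides for o in candidates)
    }


def inherit_via_component(candidates, overridden_locally):
    """Rule 37 (sketch): only locally overridden definitions are dropped."""
    return {d for d in candidates if d[1] not in overridden_locally}
```

With two candidates ("A", "f") and ("B", "f") where the second overrides the first, the subtyping variant keeps only ("B", "f"), while the component variant keeps both, which reflects duplication as the default policy for component relations.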


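$$\dfrac{\begin{array}{c} \mathit{fields}(T) = W\ U\ g \qquad \mathit{name} = g_i \qquad V\ F \in \mathit{fields}(C) \\ V\ F \text{ overrides } W_i\ U_i\ g_i \ \vee\ V\ F \text{ same as } W_i\ U_i\ g_i \end{array}}{\mathit{field}(\mathit{name},\ T,\ C) = F} \quad (46)$$

$$\dfrac{\begin{array}{c} \mathit{name} = \mathit{head}.\mathit{tail} \qquad . \notin \mathit{head} \\ \mathit{component}(\mathit{head},\ T) = X\ \mathtt{component}\ Y(\delta)\ i\ [n{=}p] \\ U\ S\ \mathit{tail} \in \mathit{fields}(Y) \qquad V\ F \in \mathit{fields}(C) \\ V\ F \text{ overrides } X\ S\ i.\mathit{tail} \ \vee\ V\ F \text{ same as } X\ S\ i.\mathit{tail} \end{array}}{\mathit{field}(\mathit{name},\ T,\ C) = F} \quad (47)$$

Figure B.11: Field lookup.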

$$\dfrac{\mathit{field}(f,\ T,\ C) = U\ D\ g}{\mathit{ftype}(f,\ T,\ C) = D} \quad (48)$$

Figure B.12: Field type lookup.

it introduces indirectly inherited fields in the inheriting class. If the field is inherited directly, the indirect version can still be used, and is dynamically bound because of the overrides relation. Rule 51 makes the overrides relation transitive, and rule 52 takes the same as relation into account. Conformance of fields is enforced by rule 53, and rule 54 determines if a field is overridden in a class.

The same as relation is shown in Figure B.14. Rules 55 and 56 state that the same as relation is reflexive and transitive. Rule 57 states that a renamed field is the same as the field with the previous name. Because the name of the component relation is added for fields inherited via a component relation, a same as relation is defined to link the field accessed via indirect inheritance (using the inheritance name) and the field accessed using the short name (in case it is renamed).
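The way rules 51 and 52 extend the overrides relation can be sketched as a fixpoint computation in Python. The sketch assumes that the direct overrides facts and the same as facts are given as sets of pairs of opaque definition identifiers; the function name and this encoding are illustrative assumptions only.

```python
def close_overrides(overrides, same_as):
    """Smallest superset of `overrides` that is closed under rules 51 and 52."""
    closed = set(overrides)
    while True:
        new = set()
        # rule 51: (a overrides b or a same as b) and b overrides c => a overrides c
        for a, b in closed | set(same_as):
            for b2, c in closed:
                if b == b2:
                    new.add((a, c))
        # rule 52: a overrides b and b same as c => a overrides c
        for a, b in closed:
            for b2, c in same_as:
                if b == b2:
                    new.add((a, c))
        if new <= closed:          # nothing new was derived: fixpoint reached
            return closed
        closed |= new
```

The loop terminates because the set of pairs over a finite set of definitions is finite and `closed` only grows.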

B.6.2 Field Well-formedness

Figure B.15 shows the field well-formedness rules. Rules 59, 60, and 62 are similar to those for component relations. Rule 61 states that for overriding components, fields must be renamed to the name of the field of the overridden component, which is inherited via a subtyping relation.


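$$\dfrac{\begin{array}{c} ST_k = \mathtt{subtype}\ T(\delta)\ [o{=}p] \qquad \tau_{st}(\delta,\ o{=}p,\ \beta,\ C,\ D\ A\ g) = C\ A\ f_j \\ \mathtt{class}\ T(\beta)\ \ldots \qquad D\ A\ g \in \mathit{fields}(T) \end{array}}{C\ F_j \text{ overrides } D\ A\ g} \quad (49)$$

$$\dfrac{\begin{array}{c} ST_k = \mathtt{component}\ T(\delta)\ i\ [o{=}p] \qquad \tau_{co}(T,\ \delta,\ i,\ o{=}p,\ \beta,\ C,\ D\ A\ g) = C\ A\ f_j \\ \mathtt{class}\ T(\beta)\ \ldots \qquad D\ A\ g \in \mathit{fields}(T) \end{array}}{C\ F_j \text{ overrides } D\ A\ i.g} \quad (50)$$

$$\dfrac{C\ F \text{ overrides } D\ G \ \vee\ C\ F \text{ same as } D\ G \qquad D\ G \text{ overrides } E\ H}{C\ F \text{ overrides } E\ H} \quad (51)$$

$$\dfrac{C\ F \text{ overrides } D\ G \qquad D\ G \text{ same as } E\ H}{C\ F \text{ overrides } E\ H} \quad (52)$$

$$\dfrac{B = T\ f \qquad \forall\, C\ D = C\ U\ g : A\ B \text{ overrides } C\ D \Rightarrow T = U}{A\ B\ \text{ OVERRIDE OK}} \quad (53)$$

$$\dfrac{E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M \qquad F_i = C\ A\ g}{g \text{ overridden in } E} \quad (54)$$

Figure B.13: Field overriding.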


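$$\dfrac{}{C\ F \text{ same as } C\ F} \quad (55)$$

$$\dfrac{\begin{array}{c} ST_i = \mathtt{subtype}\ T\ [o{=}p]\ i \qquad \tau_{st}([o{=}p],\ C,\ i,\ D\ G) = C\ A\ h \\ D\ A\ g \in \mathit{fields}(T) \qquad \neg h \text{ overridden in } E \end{array}}{C\ A\ h \text{ same as } D\ A\ g} \quad (57)$$

$$\dfrac{\begin{array}{c} ST_i = \mathtt{component}\ T\ [o{=}p]\ i \qquad \tau_{co}([o{=}p],\ i,\ C,\ D\ G) = C\ A\ h \\ D\ A\ g \in \mathit{fields}(T) \qquad \neg h \text{ overridden in } E \end{array}}{C\ A\ h \text{ same as } D\ A\ i.g} \quad (58)$$

Figure B.14: Field equivalence.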



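$$\dfrac{\forall\, D\ N,\ E\ O \in \mathit{fields}(ST, C) : D\ N \text{ related to } E\ O \Rightarrow n = o}{E\ \text{ NO FIELD SUBTYPING DUPLICATION}} \quad (59)$$

$$\dfrac{\forall o : \left( \left| \{\, N = P\ D\ g \mid g = o \wedge U\ N \in \mathit{inh}_{f,st}(E) \,\} \right| > 1 \Rightarrow \exists\, B\ o \in F \right)}{\mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M\ \text{ F-SELECT OK}} \quad (60)$$

$$\dfrac{\begin{array}{l} SC_i = \mathtt{component}\ S(\delta)\ j\ [y{=}z] \\ \forall X,\ CO = \mathtt{component}\ T(\varepsilon)\ [q{=}r]\ i : C\ SC_i \text{ overrides } X\ CO \Rightarrow \\ \quad \forall\, U\ A\ n \in \mathit{fields}(ST, C),\ V\ A\ o \in \mathit{fields}(T) : U\ A\ n \text{ related to } X\ A\ i.o \Rightarrow \\ \quad\quad \forall\, C\ A\ p \in \mathit{fields}(SC_i, C) : \left( C\ A\ p \text{ related to } Z\ A\ j.q \wedge \mathit{field}(o,\ T,\ S) = Z\ A\ q \right) \Rightarrow n = p \end{array}}{E\ \text{ COMPONENT FIELD OVERRIDE RENAME OK}} \quad (61)$$

$$\dfrac{C\ F\ \text{ OVERRIDE OK}}{\mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M\ \text{ FIELD OVERRIDE OK}} \quad (62)$$

Figure B.15: Field well-formedness.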


B.7 Methods

B.7.1 Method Lookup

A method is represented as P M, where M is the definition of the method, and P is its enclosing class. This is needed to determine the origin of a method. Indirect inheritance is modeled by giving indirectly inherited methods the name inheritanceName.m.

Figure B.16 shows the definition of the methods function. Rules 63 and 64 are trivial. Rules 65 and 66 determine which methods are inherited by the subtyping and component relations respectively. The difference between both functions is that inhst incorporates the rule-of-dominance: it ignores definitions that are overridden by a definition inherited via another subtyping relation. In addition, if the same definition is inherited more than once via subtyping with different names, the type rules demand that all versions are given the same name. This makes them syntactically equal, after which the set definition merges them. This is not the case for the component relation, where duplication is the default policy. Rules 67 and 68 determine which methods can be inherited via a specific subtyping or component relation. They take the methods of the inherited class, and apply the renaming and the substitution of component parameters by using τst and τco. These functions are defined in rules 69-72 and 73-74 respectively. The τst function is divided in four parts. In rules 70 and 71, no renaming is done. Rule 69 deals with the renaming of an individual method, while rule 72 deals with renaming as a consequence of renaming a component relation. Note that the latter rules change the parent class of the method to the inheriting class. Rules 73 and 74 do the same for the component relation, but they also replace the this reference by this.inheritanceName. Finally, the τ function in rules 75 and 76 substitutes the component parameters. If the replacement is a component parameter, the @ symbol must be kept. If the value is the name of an actual component relation, the @ symbol is replaced by a dot.
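The substitution of component parameters in rules 75 and 76 can be illustrated with a small Python sketch. It assumes that expressions are plain strings in which a component parameter α is referenced as @α, and that the caller states whether the replacement δ is itself a component parameter. The string encoding, the function name tau, and the flag delta_is_parameter are assumptions made for this illustration.

```python
import re


def tau(delta: str, alpha: str, expr: str, delta_is_parameter: bool) -> str:
    """Replace references to component parameter `alpha` in `expr` by `delta`."""
    # rule 75: delta is a component parameter, so the @ symbol is kept
    # rule 76: delta names an actual component relation, so @ becomes a dot
    replacement = ("@" if delta_is_parameter else ".") + delta
    pattern = r"@" + re.escape(alpha) + r"\b"
    # a function is used as replacement so that `delta` is inserted literally
    return re.sub(pattern, lambda _match: replacement, expr)
```

Under these assumptions, substituting the actual relation i for parameter a turns this@a.f into this.i.f, while substituting another parameter b yields this@b.f.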

Figure B.17 shows the method function, which is used to find a method, given its name and the static (T) and actual (C) types of the target. Rule 77 covers the case where the name of the requested method is in methods(T). Rule 78 covers methods that are inherited directly, but are accessed indirectly. They are not present directly in methods(T), but there is a trail of overrides and same as relations between the requested method and a method in methods(T). Renaming of component relations is taken into account by looking up the actual relation, which may have a different name than head.

Figure B.18 shows the method type lookup, and Figure B.19 shows the method body lookup. Both rules are trivial.

The overrides relation is shown in Figure B.20. Rule 81 describes the standard overriding relation for subtyping. Rule 83 for the component relation is similar, but it inserts the name of the component relation before the name of the overridden method to distinguish it from other inheritance instances of the same method. In addition, it introduces indirectly inherited methods in the inheriting class. If the method is inherited directly, the indirect version can still be used, and is dynamically bound because of the overrides relation. Rule 82 makes the overrides relation transitive, and rule 84 takes the same as relation into account. Conformance of methods is enforced by rule 85, and rule 86 determines if a method is overridden in a class.

The same as relation is shown in Figure B.21. Rules 87 and 88 state that the same as relation is reflexive and transitive. Rules 89 and 90 state that a renamed method is the same as the method with the previous name. As for overriding, the name of the component relation is added for methods renamed in a component relation.

B.7.2 Method Well-formedness

The well-formedness rules are shown in Figure B.23. Rules 94, 95, 96, and 97 are similar to those for field well-formedness. Rules 98 and 99 define when the implementation of a method is valid.

B.8 Auxiliary functions

B.8.1 Abstract

Since the formal model demands that inheritance parameters are filled in by a subclass, we make classes with inheritance parameters abstract, as shown in Figure B.24.

B.9 Expression Typing

The rules for expression typing are shown in Figure B.25. Because fields, methods, and inheritance relations can be renamed, we need to perform type elaboration as done in [FKF98].

Because there is no type for a.b if b is an inheritance relation, there can be no ambiguity for expr.f in case of e.g. a.b.c.d.e. If b and c are variables, the only valid match is expr=a.b.c and f=d.e.
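The disambiguation described above can be illustrated with a simplified Python sketch. It assumes that a single global set tells which names are inheritance relations, whereas the formal model resolves this per static type during elaboration. The function name split_target and the set-based lookup are hypothetical simplifications.

```python
from __future__ import annotations


def split_target(path: str, inheritance_relations: set[str]) -> tuple[str, str]:
    """Split a dotted access path into (expr, f)."""
    parts = path.split(".")
    # The member name f starts at the first inner segment that names an
    # inheritance relation, because a prefix ending there has no type.
    for k in range(1, len(parts) - 1):
        if parts[k] in inheritance_relations:
            return ".".join(parts[:k]), ".".join(parts[k:])
    # No inheritance relation involved: f is just the last segment.
    return ".".join(parts[:-1]), parts[-1]
```

For a.b.c.d.e with d as the only inheritance relation, the sketch returns expr=a.b.c and f=d.e, matching the example in the text.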

Note that there are no reduction rules for T-Comp-Field and T-Comp-Invk because they cannot occur in a running program. Types with inheritance parameters are abstract, and all parameters must be filled in by subclasses.

Because the formal model allows references to subcomponents, we must add a condition to T-Field and T-Invk to ensure that only a single rule can be applicable at a time. Otherwise, there would be two possibilities for e.i.f and



methods(C) = methods(E)


methods(E) = C F ∪ inhst(E) ∪ inhco(E)

inhst(E = class C(α) ST SC F K M) = { U N | ¬N overridden in E ∧ U N ∈ methods(ST, C) ∧ ∄ V O ∈ methods(ST, C) : V O ≠ U N ∧ V O overrides U N }   (65)

inhco(E = class C(α) ST SC F K M) = { U N | ¬N overridden in E ∧ U N ∈ methods(SC, C) }   (66)


methods(subtype T(δ) [n=o], C) = τst(δ, n = o, α, C, P M)


methods(component T(δ) i [n=o], C) = τco(T, δ, i, n = o, α, C, P M)

m ∈ n   (69)


C B [o/n]m(B x) return τ(δ, α, e);

m ∉ n    . ∉ m   (70)



m ∉ n    m = head.tail    . ∉ head    head ∉ n   (71)



m ∉ n    m = head.tail    . ∉ head    head ∈ n   (72)


C B ([o/n]head).tail(B x) return τ(δ, α, e);

m ∈ n   (73)


C B [o/n]m(B x) return [this:C.i/this:T]τ(δ, α, e);

m ∉ n   (74)


C B i.m(B x) return [this:C.i/this:T]τ(δ, α, e);

δ = T → C
τ(δ, α, e) = [@δ/@α]e   (75)

δ = i
τ(δ, α, e) = [.δ/@α]e   (76)

Figure B.16: Methods of a class.


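$$\dfrac{\begin{array}{c} \mathit{methods}(T) = V\ N \qquad \mathit{name} = n_i \qquad U\ M \in \mathit{methods}(C) \\ U\ M \text{ overrides } V_i\ N_i \ \vee\ U\ M \text{ same as } V_i\ N_i \end{array}}{\mathit{method}(\mathit{name},\ T,\ C) = M} \quad (77)$$

$$\dfrac{\begin{array}{c} \mathit{name} = \mathit{head}.\mathit{tail} \qquad . \notin \mathit{head} \\ \mathit{component}(\mathit{head},\ T) = X\ \mathtt{component}\ Y(\delta)\ i\ [n{=}o] \\ N = B\ i.\mathit{tail}(B\ x)\ \ldots \qquad U\ B\ \mathit{tail}(B\ x)\ \ldots \in \mathit{methods}(Y) \\ (U\ M \text{ overrides } X\ N \ \vee\ U\ M \text{ same as } X\ N) \qquad U\ M \in \mathit{methods}(C) \end{array}}{\mathit{method}(\mathit{name},\ T,\ C) = M} \quad (78)$$

Figure B.17: Method lookup.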

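$$\dfrac{\mathit{method}(m,\ T,\ C) = U\ B\ n(B\ x)\ \mathtt{return}\ e;}{\mathit{mtype}(m,\ T,\ C) = B \rightarrow B} \quad (79)$$

Figure B.18: Method type lookup.

$$\dfrac{\mathit{method}(m,\ T,\ C) = U\ B\ n(B\ x)\ \mathtt{return}\ e;}{\mathit{mbody}(m,\ T,\ C) = x.e} \quad (80)$$

Figure B.19: Method body lookup.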




τst(δ, o=p, β, C, D N) = C A ml...

class T<β> ...



C Ml overrides D N

δTi

BD N overrides E O

C M overrides D N ∨ C M same as D N (82)C M overrides E O



τco(T, δ, i, o=p, β, C, D N) = C A ml...

class T<β> ...



C Ml overrides D A i.n(A x)return e;

Ti

B

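$$\dfrac{C\ M \text{ overrides } D\ N \qquad D\ N \text{ same as } E\ O}{C\ M \text{ overrides } E\ O} \quad (84)$$

$$\dfrac{\forall\, D\ N = A\ n(A\ y)\ldots : C\ M \text{ overrides } D\ N \Rightarrow (B <: A \wedge B = A)}{C\ M\ \text{ OVERRIDE OK}} \quad (85)$$

$$\dfrac{E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M \qquad m \in M}{B\ m(B\ x)\ \mathtt{return}\ e; \text{ overridden in } E} \quad (86)$$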

Figure B.20: Method overriding.


X (87)C M same as C M




τst(δ, [o=p], β, C, D N) = C A h(A x)return e;

class T(β) ...


D N ∈ methods(T)

¬h overridden in E (89)C A h(A x)return e; same as D N



τco(T, δ, i, [o=p], β, C, D N) = C A h(A x)return g;

class T(β) ...


D N ∈ methods(T)

¬h overridden in E (90)C A h(A x)return g; same as D A i.n(A x)return e;

Figure B.21: Method equivalence.

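$$\dfrac{\exists\, E\ O : (C\ M \text{ overrides } E\ O) \wedge (D\ N \text{ overrides } E\ O)}{C\ M \text{ related to } D\ N} \quad (91)$$

$$\dfrac{C\ M \text{ overrides } D\ N}{C\ M \text{ related to } D\ N} \quad (92)$$

$$\dfrac{C\ M \text{ same as } D\ N}{C\ M \text{ related to } D\ N} \quad (93)$$

Figure B.22: Method relations.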



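$$\dfrac{\forall\, D\ N,\ E\ O \in \mathit{methods}(ST, C) : D\ N \text{ related to } E\ O \Rightarrow n = o}{E\ \text{ NO METHOD SUBTYPING DUPLICATION}} \quad (94)$$

$$\dfrac{\forall o : \left( \left| \{\, N = T\ A\ o(\ldots)\ldots \mid U\ N \in \mathit{inheritedMethods}(E) \,\} \right| > 1 \Rightarrow \exists\, B\ o(\ldots)\ldots \in M \right)}{\mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M\ \text{ METHOD SELECT OK}} \quad (95)$$

$$\dfrac{\begin{array}{l} SC_i = \mathtt{component}\ S(\delta)\ [y{=}z]\ j \\ \forall X,\ CO = \mathtt{component}\ T(\varepsilon)\ [q{=}r]\ i : C\ SC_i \text{ overrides } X\ CO \Rightarrow \\ \quad \forall\, U\ A\ n\ldots \in \mathit{methods}(ST, C),\ V\ A\ o\ldots \in \mathit{methods}(T) : U\ A\ n\ldots \text{ related to } X\ B\ i.o\ldots \Rightarrow \\ \quad\quad \forall\, C\ D\ p\ldots \in \mathit{methods}(SC_i, C) : \left( C\ D\ p\ldots \text{ related to } Z\ E\ q\ldots \wedge \mathit{method}(o,\ T,\ S) = Z\ E\ q\ldots \right) \Rightarrow n = p \end{array}}{E\ \text{ COMPONENT METHOD OVERRIDE RENAME OK}} \quad (96)$$

$$\dfrac{C\ M\ \text{ OVERRIDE OK}}{\mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M\ \text{ METHOD OVERRIDE OK}} \quad (97)$$

$$\dfrac{C\ M\ \text{ IMPLEMENTATION OK}}{\mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M\ \text{ IMPLEMENTATION OK}} \quad (98)$$

$$\dfrac{x : C,\ \mathtt{this} : C \vdash e_0 : E_0 \qquad E_0 <: C_0}{C\ C_0\ m(C\ x)\ \mathtt{return}\ e_0\ \text{ IMPLEMENTATION OK}} \quad (99)$$

Figure B.23: Method well-formedness.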

$$\dfrac{E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M \qquad \alpha \neq \emptyset}{C \text{ abstract}} \quad (100)$$

Figure B.24: The abstract judgement.


e.i.m(), being either (e.i).f and (e.i).m() or (e).i.f and (e).i.m(). The same goes for e@i.f and e@i.m().

B.10 Reduction Rules

The computation and congruence rules are shown in Figures B.26 and B.27. Note that the ∆ environment is not needed because it is only required for the compile-time type-check. Also note that there is no rule for e@i because the parameter will be substituted during method selection by the actual component name. This is proven in Lemma B.11.9.

Also note that there is not even a rule for e.i if i is an inheritance relation. The expression will remain unchanged until it is the target of a method invocation or field access. In both cases only one rule applies: (e).i.f and (e).i.m().


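$$\dfrac{}{\Delta;\ \Gamma \vdash x : \Gamma(x)} \quad \text{(T-Var)} \quad (101)$$

$$\dfrac{\Delta;\ \Gamma \vdash e_0 : C_0 \qquad \mathit{field}(f,\ T,\ C_0) = D\ g \qquad . \notin f}{\Delta;\ \Gamma \vdash e_0{:}T.f : D} \quad \text{(T-Field)} \quad (102)$$

$$\dfrac{\mathit{component}(i,\ T) = \mathtt{component}\ D\langle\beta\rangle\ [n{=}p]\ j}{\Delta;\ \Gamma \vdash e_0{:}T.i : D} \quad \text{(T-Comp)} \quad (103)$$

$$\dfrac{\Delta(\alpha) = T \rightarrow C}{\Delta;\ \Gamma \vdash e_0{:}T@\alpha : C} \quad \text{(T-Comp-Param)} \quad (104)$$

$$\dfrac{\begin{array}{c} \Delta;\ \Gamma \vdash e_0 : C_0 \qquad \Delta;\ \Gamma \vdash e : C \\ \mathit{mtype}(m,\ T,\ C_0) = D \rightarrow C \qquad C <: D \qquad . \notin m \end{array}}{\Delta;\ \Gamma \vdash e_0{:}T.m(e) : C} \quad \text{(T-Invk)} \quad (105)$$

$$\dfrac{\neg C \text{ abstract} \wedge \mathit{fields}(C) = T\ D\ f \qquad \Delta;\ \Gamma \vdash e : C \qquad C <: D}{\Delta;\ \Gamma \vdash \mathtt{new}\ C(e) : C} \quad \text{(T-New)} \quad (106)$$

$$\dfrac{\Delta;\ \Gamma \vdash e_0 : D \qquad D <: C}{\Delta;\ \Gamma \vdash (C)e_0 : C} \quad \text{(T-UCast)} \quad (107)$$

$$\dfrac{\Delta;\ \Gamma \vdash e_0 : D \qquad C <: D \qquad C \neq D}{\Delta;\ \Gamma \vdash (C)e_0 : C} \quad \text{(T-DCast)} \quad (108)$$

$$\dfrac{\Delta;\ \Gamma \vdash e_0 : D \qquad C \not<: D \qquad D \not<: C \qquad \textit{stupid warning}}{\Delta;\ \Gamma \vdash (C)e_0 : C} \quad \text{(T-SCast)} \quad (109)$$

Figure B.25: Expression typing.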


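$$\dfrac{\mathit{field}(f,\ T,\ C) = D\ g \qquad \mathit{fields}(C) = U\ D\ g \qquad g = g_i}{\mathtt{new}\ C(e){:}T.f \longrightarrow e_i} \quad \text{(R-Field)} \quad (110)$$

$$\dfrac{\mathit{mbody}(m,\ T,\ C) = x.e_0}{\mathtt{new}\ C(e){:}T.m(d) \longrightarrow [d/x,\ \mathtt{new}\ C(e)/\mathtt{this}]e_0} \quad \text{(R-Invk)} \quad (111)$$

$$\dfrac{C <: D}{(D)(\mathtt{new}\ C(e)) \longrightarrow \mathtt{new}\ C(e)} \quad \text{(R-Cast)} \quad (112)$$

Figure B.26: Computation Rules.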
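To give an impression of how the computation rules operate, the following Python sketch implements R-Cast and a heavily simplified R-Field. It assumes single inheritance and no renaming, so the field lookup reduces to a position in a field list. The class table (SUPER, FIELDS), the class names, and the helper names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class New:
    """A value `new C(e)`: a class name and its constructor arguments."""
    cls: str
    args: tuple


# Hypothetical class table: direct supertype and ordered field names per class.
SUPER = {"Object": None, "A": "Object", "B": "A"}
FIELDS = {"Object": (), "A": ("x",), "B": ("x", "y")}


def is_subtype(c, d):
    """Reflexive, transitive subtyping along the SUPER chain."""
    while c is not None:
        if c == d:
            return True
        c = SUPER[c]
    return False


def reduce_cast(target, value):
    """R-Cast: (D)(new C(e)) reduces to new C(e) if C <: D; None means stuck."""
    return value if is_subtype(value.cls, target) else None


def reduce_field(value, name):
    """Simplified R-Field: select the argument at the position of field `name`."""
    return value.args[FIELDS[value.cls].index(name)]
```

A failing downcast is modeled by returning None, which corresponds to the stuck expressions (D)new C(e) that are allowed by the type soundness theorem.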

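$$\dfrac{e_0 \longrightarrow e_0'}{e_0{:}T.f \longrightarrow e_0'{:}T.f} \quad \text{(RC-Field)} \quad (113)$$

$$\dfrac{e_0 \longrightarrow e_0'}{e_0{:}T.m(e) \longrightarrow e_0'{:}T.m(e)} \quad \text{(RC-Invk-Rcv)} \quad (114)$$

$$\dfrac{e_i \longrightarrow e_i'}{e_0{:}T.m(\ldots, e_i, \ldots) \longrightarrow e_0{:}T.m(\ldots, e_i', \ldots)} \quad \text{(RC-Invk-Arg)} \quad (115)$$

$$\dfrac{e_i \longrightarrow e_i'}{\mathtt{new}\ C(\ldots, e_i, \ldots) \longrightarrow \mathtt{new}\ C(\ldots, e_i', \ldots)} \quad \text{(RC-New-Arg)} \quad (116)$$

$$\dfrac{e_0 \longrightarrow e_0'}{(C)e_0 \longrightarrow (C)e_0'} \quad \text{(RC-Cast)} \quad (117)$$

Figure B.27: Congruence Rules.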


B.11 Proof of Type Soundness

We prove the type soundness of our model by isolating the assumptions made in the proof of Featherweight Java, and proving that our inheritance mechanism satisfies those assumptions.

B.11.1 Subject Reduction

In Lemma B.11.6, we must make a change because our mtype function has an extra argument, being the static type of the target. In the proof, the lemma is used to prove that the type of a method invocation is a subtype after substitution. In other words, it suffices to prove that for a given static type, a more specific actual type will result in a more specific return type.

Lemma B.11.1 The component judgement defines a function.

Proof B.11.1. This follows directly from rule T-Comp-Select.

Lemma B.11.2 The method judgement defines a function.

Proof B.11.2. From the definition of method, it follows that it is a function if methods is a function. Rule T-Namespace ensures that for methods directly defined in a class C, there can only be one match for a given name. The other methods in methods(C) come from either inhst or inhco. But because these functions remove methods with the same name as a method in C, there can never be more than one result.

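Lemma B.11.3 Every method m in a supertype or component of class C is either in the set of methods of C, or an overriding or equal method is present in that set.

$$\begin{array}{c} E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M \ \wedge\ U\ N \in \mathit{methods}(ST_i) \cup \mathit{methods}(SC_i) \\ \Downarrow \\ \exists\, V\ O \in \mathit{methods}(C) : V\ O \text{ overrides } U\ N \ \vee\ V\ O \text{ same as } U\ N \end{array}$$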

Proof B.11.3. The only reason a method m of a supertype or component of class C can be absent from the set of methods of C is that it has been removed by either the inhst or inhco function.

For inhst, this means that the method is either overridden in the class itself, or an overriding method has also been inherited. In both cases there is an overriding relation between the removed method and a method in methods(C) according to the definition of overrides. If the method is not removed, the same as relation holds.

The proof for inhco is similar.

Lemma B.11.4 The field judgement defines a function.


Proof B.11.4. Similar to that of Lemma B.11.2

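Lemma B.11.5 Every field f in a supertype or component of class C is either in the set of fields of C, or a field overriding it is present in that set.

$$\begin{array}{c} E = \mathtt{class}\ C(\alpha)\ ST\ SC\ F\ K\ M \ \wedge\ U\ D\ g \in \mathit{fields}(ST_i) \cup \mathit{fields}(SC_i) \\ \Downarrow \\ \exists\, V\ E\ h \in \mathit{fields}(C) : V\ E\ h \text{ overrides } U\ D\ g \ \vee\ V\ E\ h \text{ same as } U\ D\ g \end{array}$$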

Proof B.11.5. Similar to that of Lemma B.11.3.

Lemma B.11.6 If mtype(m, T, C) = B → B0 then mtype(m, T, D) = B → E0 with E0 <: B0 for all S <: T, D <: S

Proof B.11.6. From Lemma B.11.3, it follows that method(m, T, D) selects a method that either overrides or is the same as method(m, T, C). As a result, this lemma follows from the OVERRIDE OK judgement.

Because we also allow overriding of fields, we must have a similar lemma for the type of a field. The only difference is that the type of a field cannot change.

Lemma B.11.7 If ftype(f, T, C) = B then ftype(f, T, D) = B for all S <: T, D <: S

Proof B.11.7. Similar to the proof of Lemma B.11.6.

Lemma B.11.8 needs to be modified from the Featherweight Java version because we have two new type expressions.

Lemma B.11.8 If ∆; Γ, x : B ⊢ e : D, and Γ ⊢ d : A where A <: B, then Γ ⊢ [d/x]e : C for some C <: D.

Proof B.11.8. Cases T-Var, T-New, T-UCast, T-DCast, and T-SCast remain unchanged. Case T-Invk only requires a syntactic modification to pass the static type from the type elaboration to Lemma B.11.6. Case T-Field becomes nearly identical to T-Invk because of the possible overriding.

Case T-Comp. e=e0:T.i

The actual type of e0 is not used in the type of e0:T.i, hence the type of e will not change under substitution. Safety is guaranteed by induction. The component relation will exist.

Case T-Comp-Param. e=e0:T@α

Again, the actual type of e0 is not used in the type of e0:T@α, hence the type of e will not change under substitution. Safety is guaranteed by induction. The component relation will exist.


Lemma B.11.9 The expression e:T@α will not be encountered during evaluation of a program.

Proof B.11.9. Classes with component parameters are abstract and thus cannot be instantiated. From the fact that all parameters must be filled in by subtype and component clauses, and the substitution of component parameters in the methods function, it follows that such expressions cannot occur in methods of concrete classes. The proof follows from the fact that a program is of the form new C(e) and must be well-typed.

Lemma B.11.10 If ∆; Γ ⊢ e : C, then ∆; Γ, x : D ⊢ e : C

Proof B.11.10. Straightforward.

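Lemma B.11.11 The transformation of the method body of a method inherited via a subtyping relation is type safe.

$$\begin{array}{c} \mathtt{class}\ T(\alpha)\ \ldots \ \wedge\ \alpha : B \rightarrow D,\ x : X,\ \mathtt{this} : T \vdash e : E \ \wedge\ \delta \leq \alpha \ \wedge\ \mathtt{class}\ S(\beta)\ \ldots\ \mathtt{subtype}\ T(\delta)\ \ldots \\ \Downarrow \\ x : X,\ \mathtt{this} : S \vdash \tau(\delta,\ \alpha,\ e) : F \ \wedge\ F <: E \end{array}$$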

Proof B.11.11. The lemma follows directly from the definition of the τ function and Lemma B.11.8.

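Lemma B.11.12 The transformation of the method body of a method inherited via a component relation is type safe.

$$\begin{array}{c} \mathtt{class}\ T(\alpha)\ \ldots \ \wedge\ \gamma : G,\ x : X,\ \mathtt{this} : T \vdash e : E \ \wedge\ \delta \leq \alpha \ \wedge\ \mathtt{class}\ S(\beta)\ \ldots\ \mathtt{component}\ T(\delta)\ i\ \ldots \\ \Downarrow \\ x : X,\ \mathtt{this} : S \vdash [\mathtt{this}{:}S.i/\mathtt{this}{:}T]\tau(\delta,\ \alpha,\ e) : F \ \wedge\ F <: E \end{array}$$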

Proof B.11.12. Because of rule T-Comp, the type of this:C.i is T, so the lemma follows from Lemmas B.11.8 and B.11.11.

For Lemma 1.4 of the Featherweight Java proof, we provide a different lemma because the method body is altered when inherited instead of during the evaluation.

Lemma B.11.13 If mtype(m, T, C) = D → D, and mbody(m, T, C), then x : D, this : C0 ⊢ e : C with C <: D


Proof B.11.13. In case method(m, T, C) is defined in C, rule T-Implementation proves the lemma. In case method() is inherited either through a subtype relation or a component relation, we must prove that the altered method body maintains a valid type. This follows directly from Lemmas B.11.11 and B.11.12.

Theorem B.11.14 (Subject Reduction) For a well-typed expression e of an elaborated program: If Γ ⊢ e : C and e → e′, then Γ ⊢ e′ : C′ for some C′ <: C

Proof B.11.14. The proof is nearly identical to the proof of Theorem 2.4.1 of Featherweight Java.

B.11.2 Progress

Theorem B.11.15 (Progress) Suppose e is a well-typed expression in the evaluation of an elaborated program.

1. If e includes new C0(e).f as a subexpression, then fields(C0) = C T f and f ∈ f for some C, T, and f.

2. If e includes new C0(e).m(d) as a subexpression, then mbody(m, C0) = x.e0 and #(x) = #(d) for some x and e0.

Proof B.11.15. Because component parameters cannot occur in the evaluation of an elaborated program (Lemma B.11.9), and operations performed on component references are treated as method invocations and field accesses, the proof is the same as that of Theorem 2.4.2 of Featherweight Java except that it now uses Lemmas B.11.2, B.11.3, B.11.4, and B.11.5 to prove that m and f are present.

B.11.3 Type Soundness

The domain of values is the same as that of Featherweight Java.

v ::= new C(v)

Theorem B.11.16 (Type Soundness) Suppose e is an expression of an elaborated program. If ∅ ⊢ e : C and e →∗ e′ with e′ in normal form, then e′ is either a value v with ∅ ⊢ v : D and D <: C, or an expression containing (D)new C(e) where not C <: D.

Proof B.11.16. Immediate from Theorems B.11.14 and B.11.15.
