e. casais 63 - unigeasg.unige.ch/site/papers/casa90a.pdf · e. casais 61 classes to be reorganized,...


E. Casais 63

[36] B. Maher and D. H. Sleeman, “Automatic Program Improvement: Variable Usage Transformations,” ACM Transactions on Programming Languages and Systems, vol. 5, no. 2, pp. 236–264, ACM, April 1983.

[37] Bertrand Meyer, Object-Oriented Software Construction, Series in Computer Science, Prentice-Hall International, 1988.

[38] Bertrand Meyer, “The New Culture of Software Development: Reflections on the Practice of Object-Oriented Design,” in TOOLS 89 proceedings, pp. 13–23, Paris, 13–15 November 1989.

[39] David A. Moon, “Object-Oriented Programming with Flavors,” in Proceedings of the Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA ’86) Special Issue of SIGPLAN Notices, vol. 21, no. 11, pp. 1–8, Portland, Oregon, 29 September–2 October 1986.

[40] K. Narayanaswamy, “Static Analysis-based Program Evolution Support in the Common Lisp Framework,” in Proceedings of the 10th International Conference on Software Engineering, pp. 222–230, IEEE, Singapore, 11–15 April 1988.

[41] G. T. Nguyen and D. Rieu, “Schema Change Propagation in Object-Oriented Databases,” in IFIP 89 Proceedings, ed. G. X. Ritter, pp. 815–820, Elsevier Science Publisher B.V. (North-Holland), 1989.

[42] O. M. Nierstrasz, “Object-Oriented Concepts,” in Active Object Environments, ed. D. C. Tsichritzis, pp. 1–17, Centre Universitaire d’Informatique, Genève, 1988.

[43] D. Jason Penney and Jacob Stein, “Class Modification in the GemStone Object-Oriented DBMS,” in Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA ’87) Special Issue of SIGPLAN Notices, vol. 22, no. 12, pp. 111–117, Orlando, FL, 4–8 October 1987.

[44] Steve Putz, “Managing the Evolution of Smalltalk-80 Systems,” in Smalltalk-80: Bits of History, Words of Advice, ed. Glenn Krasner, pp. 273–286, Addison-Wesley, Reading, Massachusetts, 1984.

[45] Markku Sakkinen, “Comments on the Law of Demeter and C++,” SIGPLAN Notices, vol. 23, no. 12, pp. 34–44, ACM, 1988.

[46] Andrea H. Skarra and Stanley B. Zdonik, “The Management of Changing Types in an Object-Oriented Database,” in Research Directions in Object-Oriented Programming, eds. Bruce Shriver and Peter Wegner, pp. 393–415, The MIT Press, Cambridge, Massachusetts, 1987.

[47] Dave Thomas and Kent Johnson, “Orwell: a Configuration Management System for Team Programming,” in OOPSLA 88 Proceedings, pp. 135–141, 25–30 September 1988.

[48] Peter Wegner and Stanley B. Zdonik, “Inheritance as an Incremental Modification Mechanism or What Like Is and Isn’t Like,” in ECOOP 88 Proceedings, eds. Stein Gjessing and Kristen Nygaard, Lecture Notes in Computer Science, pp. 55–77, Springer Verlag, Oslo, 15–17 August 1988.

[49] André Weinand, Erich Gamma, and Rudolf Marty, “ET++: An Object-Oriented Application Framework in C++,” in Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA ’88) Special Issue of SIGPLAN Notices, vol. 23, no. 11, pp. 46–57, November 1988.

[50] Roberto Zicari, “Schema Updates in the O2 Object-Oriented Database System,” report 89-057, Politecnico di Milano, Dipartimento di Elettronica, Milano, October 1989.

62 Managing Class Evolution in Object Oriented Systems

[14] Nachum Dershowitz, “Programming by Analogy,” in Machine Learning: an Artificial Intelligence Approach (vol. II), eds. Ryszard S. Michalski, Jaime G. Carbonell and Tom M. Mitchell, pp. 395–423, Morgan Kaufmann Publishers, Los Altos (CA), 1986.

[15] V. Dhar and M. Jarke, “Dependency Directed Reasoning and Learning in Systems Maintenance Support,” IEEE Transactions on Software Engineering, vol. 14, no. 2, pp. 211–227, IEEE, February 1988.

[16] Jim Dietrich and Jack Milton, “Experimental Prototyping in Smalltalk,” IEEE Software, vol. 4, no. 3, pp. 50–64, IEEE, May 1987.

[17] Gerhard Fischer, Andreas C. Lemke and Christian Rathke, “From Design to Redesign,” in Proceedings of the 9th International Conference on Software Engineering, pp. 369–376, IEEE, Monterey CA, 30 March–2 April 1987.

[18] D. H. Fishman, J. Annevelink, D. Beech, E. Chow, T. Connors, J. W. Davis, W. Hasan, C. G. Hoch, W. Kent, S. Leichner, P. Lyngbaek, B. Mahbod, M. A. Neimat, T. Risch, M. C. Shan and W. K. Wilkinson, “Overview of the IRIS DBMS,” in Object-Oriented Concepts, Databases, and Applications, eds. Won Kim and Frederic H. Lochovsky, pp. 219–250, Addison-Wesley/ACM Press, 1989.

[19] Erich Gamma, André Weinand, and Rudolf Marty, “Integration of a Programming Environment into ET++: A Case Study,” in Proceedings of the Third European Conference on Object-Oriented Programming (ECOOP 89), pp. 283–297, Nottingham, 10–14 July 1989.

[20] S. Gibbs, D. Tsichritzis, E. Casais, O. Nierstrasz, and X. Pintado, “Class Management for Software Communities,” in Object Management, ed. D. C. Tsichritzis, Centre Universitaire d’Informatique, 1990.

[21] Adele Goldberg, The Smalltalk Programming Environment, Addison-Wesley, Reading, Massachusetts, 1984.

[22] Adele Goldberg and David Robson, Smalltalk-80: The Language and its Implementation, Addison-Wesley, Reading, Massachusetts, 1983.

[23] Daniel C. Halbert and Patrick D. O’Brien, “Using Types and Inheritance in Object-Oriented Programming,” IEEE Software, pp. 71–79, IEEE, September 1987.

[24] Matthias Jarke and Thomas Rose, “Managing Knowledge about Information System Evolution,” SIGMOD Records, vol. 17, no. 3, pp. 303–311, ACM, September 1988.

[25] Ralph E. Johnson and Brian Foote, “Designing Reusable Classes,” Journal of Object-Oriented Programming, pp. 22–35, June-July 1988.

[26] J. Karimi and B. R. Konsynski, “An Automated Software Design Assistant,” IEEE Transactions on Software Engineering, vol. 14, no. 2, pp. 194–210, IEEE, February 1988.

[27] Sonya E. Keene, Object-Oriented Programming in Common LISP: A Programmer’s Guide to CLOS, Addison-Wesley, Reading, Massachusetts, 1989.

[28] Glenn Krasner, Smalltalk-80: Bits of History, Words of Advice, Addison-Wesley, Reading, Massachusetts, 1984.

[29] Barbara Staudt Lerner and A. Nico Habermann, “Beyond Schema Evolution to Database Reorganization,” research paper on databases, Carnegie Mellon University, Pittsburgh, February 1990.

[30] Nicole Lévy, Outils d’aide à la construction et à la transformation de types abstraits algébriques, Thèse de 3ème cycle, Université de Nancy, 1984.

[31] Qing Li and Dennis McLeod, “Object Flavor Evolution through Learning in an Object-Oriented Database System,” in Proceedings of the 2nd International Conference on Expert Database Systems, ed. Larry Kerschberg, pp. 241–256, George Mason University, 25–27 April 1988.

[32] K. Lieberherr, I. Holland and A. Riel, “Object-Oriented Programming: an Objective Sense of Style,” in OOPSLA 88 Proceedings, pp. 323–334, 25–30 September 1988.

[33] Karl J. Lieberherr and Ian M. Holland, “Assuring Good Style for Object-Oriented Programming,” IEEE Software, pp. 38–48, IEEE, September 1989.

[34] Henry Lieberman, “Using Prototypical Objects to Implement Shared Behavior in Object-Oriented Systems,” in Proceedings of OOPSLA 86, pp. 214–223, 29 September–2 October 1986.

[35] Y. E. Lien, “Relational Database Design,” in Principles of Database Design, ed. S. Bing Yao, pp. 211–254, Prentice Hall, Englewood Cliffs, 1985.


classes to be reorganized, although they detect and decompose interface redefinitions into several consistent steps).

• We expect the tools and the compilers in an object-oriented environment to be capable of implementing change avoidance techniques whenever possible. It is our conviction that lazy conversion will prevail as the technique for propagating change to instances. Screening seems cumbersome to use and might entail unacceptable overhead when applied to its full extent.

We have demonstrated that class evolution is actually an aspect of the more general problem of class design. Techniques for managing class modification and reorganization would therefore benefit substantially from advances in design methodologies, the determination of sound programming styles, and the disciplining of object-oriented mechanisms, in particular inheritance.

References

[1] Serge Abiteboul and Richard Hull, “Restructuring Hierarchical Database Objects,” Theoretical Computer Science, no. 62, pp. 3–38, North-Holland, 1988.

[2] Matts Ahlsén, Anders Björnerstedt, Stefan Britts, Christer Hultén and Lars Söderlund, “Making Type Changes Transparent,” SYSLAB report 22, SYSLAB-S, University of Stockholm, Stockholm, 26 February 1984.

[3] Robert Balzer, “Evolution as a New Basis for Reusability,” in Proceedings of the Workshop on Reusability in Programming, pp. 80–82, Newport RI, September 1983.

[4] Jay Banerjee, Won Kim, Hyoung-Joo Kim and Henry F. Korth, “Semantics and Implementation of Schema Evolution in Object-Oriented Databases,” in Proceedings of the ACM-SIGMOD Conference on Management of Data, eds. Umeshwar Dayal and Irv Traiger, pp. 311–322, Association for Computing Machinery, San Francisco, 27–29 May 1987.

[5] David Beech and Brom Mahbod, “Generalized Version Control in an Object-Oriented Database,” in Proceedings 4th IEEE International Conference on Data Engineering, pp. 14–22, Los Angeles, February 1988.

[6] Anders Björnerstedt and Stefan Britts, “AVANCE: An Object Management System,” in OOPSLA 88 Proceedings, ed. Norman Meyrowitz, pp. 206–221, San Diego, 25–30 September 1988.

[7] Anders Björnerstedt and Christer Hultén, “Version Control in an Object-Oriented Architecture,” in Object-Oriented Concepts, Databases, and Applications, eds. Won Kim and Frederic H. Lochovsky, pp. 451–485, Addison-Wesley/ACM Press, 1989.

[8] Alexander Borgida, “Modelling Class Hierarchies with Contradictions,” SIGMOD Records, vol. 17, no. 3, pp. 434–443, ACM, September 1988.

[9] Alexander Borgida and Keith E. Williamson, “Accommodating Exceptions in Databases, and Refining the Schema by Learning from them,” in Proceedings VLDB 1985, eds. Alain Pirotte and Yannis Vassiliou, pp. 72–81, Stockholm, 21–23 August 1985.

[10] Eduardo Casais, “Reorganizing an Object System,” in Object Oriented Development, ed. D. C. Tsichritzis, pp. 161–189, Centre Universitaire d’Informatique, Genève, 1989.

[11] Hong-Tai Chou and Won Kim, “A Unifying Framework for Version Control in a CAD Environment,” in 12th VLDB Conference Proceedings, pp. 336–344, Kyoto, 25–28 August 1986.

[12] Brad J. Cox, Object-Oriented Programming: An Evolutionary Approach, Addison-Wesley, Reading, Massachusetts, 1986.

[13] L. Dami, E. Fiume, O. Nierstrasz and D. Tsichritzis, “Temporal Scripts for Objects,” in Active Object Environments, ed. D. C. Tsichritzis, pp. 144–161, Centre Universitaire d’Informatique, Genève, 1988.


tems to take advantage of a large spectrum of tools and techniques for managing class evolution. Even without comprehensive object-oriented CASE environments, software developers will certainly draw significant benefits from partial capabilities for mastering change in class hierarchies, in the same way as programmers rely on the Smalltalk browser to inspect and reuse class collections, although this specific tool does not seem to scale well for large hierarchies. What remains to be done, then, is to validate the existing approaches in the context of real industrial or commercial development and production environments.

It clearly appears that an object-oriented environment would draw maximum advantage from the integration of the various evolution management approaches described in this paper. For example, reorganization methodologies could be used in combination with versioning to try different class designs in a secure and disciplined fashion, without losing information on previous arrangements of the class hierarchy. Some techniques, however, do present significantly more potential for solving or alleviating the problems of class evolution than others.

• Tailoring will remain a standard approach for adjusting inherited properties to the needs of new applications: the concept of subclassing would lose most of its power in the absence of attribute redefinition. Because free tailoring can lead to chaotic inheritance structures, we expect this set of techniques to be severely constrained in the context of a class design methodology.

• Class surgery appears to be a prime candidate for integration with structured, language-sensitive development tools. Future object-oriented environments will likely provide language-sensitive tools. Thus, instead of using a standard Unix text editor to create and correct their programs, software engineers will probably rely on a browser or a graphical editor to review and modify their code, much as in the Smalltalk [21] or ET++ [19] environments. Attaching integrity constraints and invariant-preserving checks to such tools would greatly enhance their functionality and should not raise insurmountable difficulties.

• Versioning appears indispensable in the context of software communities, where very large class collections must be shared by groups of programmers over long periods of time [20].

• Coordinating the work of several programmers, taking into account the needs of customers using different hardware platforms and software environments, keeping track of design decisions and recording all information needed to build and debug a particular software release also require comprehensive capabilities for version management.

• Reorganization methods should play a prominent role during the design process. Their potential is currently limited by the lack of a rigorous object model and the generality of object-oriented mechanisms that prevents more application-specific (and more interesting) reorganizations. Further work is needed in this area to arrive at algorithms that capture the semantics of a class hierarchy and embody advanced design criteria. The current rules used by reorganization methods are either too general and vague (“split large classes into smaller components”) or too focused on specific details of class structures (thus, our algorithms are incapable of recognizing possible “frameworks” in a collection of


placed in different derivation paths in a version hierarchy. Furthermore, substitute functions are defined just in the newer versions; previous class definitions remain unchanged.
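The transitive use of substitute functions can be illustrated with a small Python sketch. It simplifies matters by assuming that each direct mapping is a single function from the attribute values of one version to those of another; the registry layout and the names `find_chain` and `convert` are hypothetical and serve only to show how a succession of mappings might be located and applied.

```python
from collections import deque


def find_chain(direct, source, target):
    """Breadth-first search for a succession of mappings from source to target.

    'direct' maps (from_version, to_version) pairs to conversion functions.
    Returns the list of functions to apply in order, or None if no chain exists.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        version, chain = queue.popleft()
        if version == target:
            return chain
        for (src, dst), fn in direct.items():
            if src == version and dst not in seen:
                seen.add(dst)
                queue.append((dst, chain + [fn]))
    return None


def convert(direct, source, target, state):
    """View a 'source'-version state as a 'target'-version state."""
    chain = find_chain(direct, source, target)
    if chain is None:
        raise LookupError(f"no substitute functions relate {source} to {target}")
    for fn in chain:
        state = fn(state)
    return state


# Hypothetical versions C1 -> C2 -> C3 with no direct C1 -> C3 mapping.
direct = {
    ("C1", "C2"): lambda s: {"age": 1990 - s["birth_year"]},
    ("C2", "C3"): lambda s: {"adult": s["age"] >= 18},
}
```

With this registry, `convert(direct, "C1", "C3", {"birth_year": 1960})` applies both mappings in turn, whereas a request from C3 back to C1 fails because no chain exists in that direction.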

When compatibility between versions cannot be achieved, one may install scope restrictions that isolate objects pertaining to different definitions from each other:

• A forward scope restriction makes instances from a new version inaccessible to objects from older versions.

• A backward scope restriction makes instances from older versions unreachable from objects of new versions.

By relying on scope restrictions and compatibility relationships, it is possible to partition the instance set of a class in such a way that operations may be applied to any object regardless of its version. Naturally, interoperability decreases with such a scheme, since the entities from different versions of the same class can no longer be referred to and accessed as members of one large pool of objects.
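As a rough illustration of the two scope restrictions, the following Python sketch assumes that versions are numbered with integers in order of creation (a simplification of the partial ordering of versions) and uses hypothetical function names. It shows how restrictions shrink the pool of instances that an object of a given version can reach.

```python
def can_access(client_version, instance_version,
               forward_restricted=False, backward_restricted=False):
    """Tell whether an object of client_version may reach the instance."""
    if forward_restricted and instance_version > client_version:
        return False  # newer instances are hidden from older clients
    if backward_restricted and instance_version < client_version:
        return False  # older instances are hidden from newer clients
    return True


def visible(client_version, instances, **restrictions):
    """Keep the (name, version) pairs that the client is allowed to see."""
    return [(name, version) for name, version in instances
            if can_access(client_version, version, **restrictions)]


pool = [("a", 1), ("b", 2), ("c", 3)]
```

For a client of version 2, `visible(2, pool, forward_restricted=True)` drops the version 3 instance, and enabling both restrictions leaves only the instance of the client's own version.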

10.5 Evaluation

From our discussion, it appears that filtering cannot really fulfil its objective of making class changes transparent without considerable complexity and overhead. The programmer must not only develop a series of special-purpose functions for mapping between the variants of a class, but must also accept a degradation of application performance as these handlers accumulate, replacing the originally simple and efficient accessors. In practice, this complexity does not appear fully warranted: with lazy conversion, for example, one also has to define ad-hoc procedures for transforming entities from one version to another, but these procedures are called only once for every object; their execution is therefore not as expensive as the systematic run-time checks and exception raising implied by screening techniques. On the positive side, filtering provides a rigorous framework for defining and dealing with compatibility issues in object-oriented systems.

Screening has been implemented in some systems, but there its application scope is notably reduced. ORION does not immediately convert instances affected by a class change so as to avoid reorganizing the database [4]; when an instance is fetched, and before its attributes are accessed, deleted variables are made inaccessible (after, if needed, the physical destruction of the objects they refer to). Default values are automatically supplied to account for the introduction of new properties. Rearrangements of inheritance patterns are reflected by hiding unwanted properties and supplying default values for newly inherited ones. In Eiffel, methods can be tagged as obsolete, thus effectively providing a two-level kind of versioning. Obsolete methods can still be invoked, but they no longer appear in the documentation produced by the system, and references to them generate warnings at compile-time.

11. Conclusion

Object-oriented development reveals its iterative nature as successive stages of subclassing, tailoring, class modification, version creation and reorganization allow software engineers to build increasingly general, reusable and robust classes. We therefore expect software information sys-


• A substitute read function $RC_{ij}A(I)$ is given an instance $I$ of version $i$ of class $C$. It maps the values of a group of attributes from this object to a valid value of attribute $A$ of version $j$ of $C$. In other words, it makes instances of class version $C_i$ appear as if they contained the attribute $A$ of class version $C_j$ for reading operations.

• A substitute write function $WC_{ij}A(I, V)$ is given an instance $I$ of version $i$ of class $C$, and a value $V$ for attribute $A$ of $C_j$. It maps the value $V$ into a set of values for a group of attributes defined in $C_i$. In other words, this function makes instances of class version $C_i$ appear as if they could store information in attribute $A$, although this information is actually recorded in other variables.

Needless to say, it may be quite difficult in practice to find strictly equivalent translations from one class definition to another.

A second approach favours the use of handlers to be invoked before or after a failed access to the attribute they are attached to, a technique that resembles LISP :before and :after daemons and that has been implemented in the ENCORE system [46]. Pre-handlers typically take over when attempting to access a non-existent attribute, or when trying to assign an illegal value to it. A pre-handler may perform a mapping like those carried out by the substitute functions, coerce its argument to a valid value, or simply abort the operation. A post-handler executes when an illegal value is returned to the invoking object; a common behaviour in this case consists in returning a default value.
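The handler-based approach might be sketched in Python as follows. This is not the ENCORE mechanism itself: the `Screened` class, its dictionaries of handlers and the `legal` predicates are hypothetical names introduced for illustration, and only reads of missing attributes, reads of illegal values and writes of illegal values are modelled.

```python
class Screened:
    """Object state guarded by pre- and post-handlers (illustrative only)."""

    def __init__(self, state, read_pre=None, write_pre=None,
                 read_post=None, legal=None):
        self.state = dict(state)
        self.read_pre = read_pre or {}    # attribute -> function(state), used when the attribute is missing
        self.write_pre = write_pre or {}  # attribute -> function(value) coercing an illegal value
        self.read_post = read_post or {}  # attribute -> default returned instead of an illegal value
        self.legal = legal or {}          # attribute -> predicate describing the expected domain

    def read(self, name):
        if name not in self.state:
            if name in self.read_pre:
                # Pre-handler takes over the access to a non-existent attribute.
                return self.read_pre[name](self.state)
            raise AttributeError(name)
        value = self.state[name]
        check = self.legal.get(name)
        if check is not None and not check(value) and name in self.read_post:
            # Post-handler: substitute a default for the illegal value.
            return self.read_post[name]
        return value

    def write(self, name, value):
        check = self.legal.get(name)
        if check is not None and not check(value):
            if name not in self.write_pre:
                # No handler attached: abort the operation.
                raise ValueError(f"illegal value for {name}: {value!r}")
            # Pre-handler coerces the argument to a valid value.
            value = self.write_pre[name](value)
        self.state[name] = value
```

A pre-handler for a missing `age` attribute can, for instance, compute it from a stored birth year, while a write pre-handler can clamp an out-of-range value into the legal domain.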

10.4 Making class changes transparent

Where should filters be defined? As originally stated, the technique based on handlers requires global modifications in all versions of the same class. More precisely,

• Whenever an attribute is added to a class, pre-handlers for the attribute must be introduced in all other versions of the class.

• Pre-handlers must be added to a version that suppresses attributes from a class definition.

• When a version extends the domain of an attribute, pre- and post-handlers must be introduced in all other versions.

• When the domain of an attribute is restricted, the class version redeclaring the attribute type must be wrapped with a pre-handler and a post-handler.

This solution is obviously inelegant: it requires that old class definitions be adjusted to reflect new developments, it entails a lot of cross-checking between version definitions and leads to a combinatorial explosion of handler complexity that is avoided only at the cost of introducing special kinds of inheritance links in the class hierarchy.

The model of substitute functions allows one to exploit the derivation history for mapping between versions that have no direct relationships. Thus, one can map a version $C_i$ to another version $C_j$ if there exist either substitute functions for them ($RC_{ij}X$, $WC_{ij}X$, where $X$ denotes an attribute of $C_j$), or a succession of substitute functions that transitively apply to them (i.e. there are substitute functions for mapping between $C_i$ and $C_k$, then $C_k$ and $C_l$, and eventually $C_l$ and $C_j$, for example). Depending on compatibility properties, one can even relate class definitions


Similarly, one must define two symmetrical operations to guarantee forwards compatibility:

• A read accessor for the birthday computes its value from the current time and the age stored in the object.

• Instead of directly storing the birthday, a write accessor records the age of the person, determined from the birthday given as argument to the accessor and the current date.
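The two forwards-compatibility accessors could look like the following Python sketch. The class `NewPerson` and the function names are hypothetical; because an age does not determine an exact day of birth, the read accessor assumes 1 January of the computed year as a placeholder.

```python
from datetime import date


class NewPerson:
    """New class version: stores an age and has no 'birthday' variable."""

    def __init__(self, age):
        self.age = age


def read_birthday(person, today):
    """Read accessor: compute a birthday from the current date and the stored age."""
    # Assumption: only the year is meaningful; 1 January is a placeholder day.
    return date(today.year - person.age, 1, 1)


def write_birthday(person, birthday, today):
    """Write accessor: record the age implied by the given birthday."""
    years = today.year - birthday.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birthday.month, birthday.day):
        years -= 1
    person.age = years
```

An old program that still believes in a `birthday` attribute can thus call these accessors on new instances, at the price of losing the day and month information.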

More generally, one can define so-called substitute functions for carrying out these mappings between objects with different structures as follows:

| Scope of change | Compatibility | Consequences |
|---|---|---|
| add a variable | backwards | undefined variable in old objects |
| delete a variable | forwards | undefined variable in new objects |
| extend the type | backwards | writing illegal values into old objects |
| | forwards | reading unknown data from new objects |
| restrict the type | forwards | writing illegal values into new objects |
| | backwards | reading unknown data from old objects |
| add a method | backwards | undefined method in old objects |
| delete a method | forwards | undefined method in new objects |
| extend argument type | backwards | passing illegal values to old objects |
| | forwards | getting unknown data from new objects |
| restrict argument type | forwards | passing illegal values to new objects |
| | backwards | getting unknown data from old objects |
| change argument list | backwards and forwards | similar to dropping and adding a method |

Table 6 Consequences of class changes on version compatibility. The middle column indicates which kind of compatibility is affected by a modification, the right column describes the exceptions raised when accessing an object from the old or the new class definition.


In the first case, applications can use old instances as if they originated from new definitions. With the second form of compatibility, old programs can manipulate entities created on the basis of later versions.

Each class $C$ is associated with a partial ordering of versions $\{ C_i \}$. We assume that, at any point in time, some $C_i$ is considered the valid version of class $C$. Very often, the valid version corresponds to the most recent version of the class. Building on these definitions, we say that a class version $C_i$ is consistent with respect to version $D_j$ of another class $D$ ($C \neq D$) if one of the following conditions is satisfied [2]:

• $D_j$ was the currently valid version of $D$ when $C_i$ was committed. This is the usual situation; $C_i$ references up-to-date, contemporaneous properties of $D$.

• $D_k$ was the currently valid version of $D$ when $C_i$ was committed, $D_j$ is a later version of $D$, and $D_k$ is forwards compatible with $D_j$. Here $C_i$ references an obsolete definition of $D$, but the forwards compatibility property allows it to work with instances created according to the new schema.

• $D_k$ was the currently valid version of $D$ when $C_i$ was committed, $D_j$ is an earlier version of $D$, and $D_k$ is backwards compatible with $D_j$. Here $C_i$ is supposed to manipulate an up-to-date representation of $D$; thanks to the backwards compatibility, it is nevertheless able to use instances generated from old versions.
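The three consistency conditions can be condensed into a small predicate. The Python sketch below assumes, for simplicity, that the versions of D are numbered with integers in order of creation (the text only requires a partial ordering) and that the compatibility relations are supplied as sets of (k, j) pairs; the function and parameter names are hypothetical.

```python
def is_consistent(valid_at_commit, j, forwards, backwards):
    """Check whether a class version committed against D_k is consistent with D_j.

    valid_at_commit -- index k of the version of D that was valid at commit time
    j               -- index of the version of D being considered
    forwards        -- set of (k, j) pairs such that D_k is forwards compatible with D_j
    backwards       -- set of (k, j) pairs such that D_k is backwards compatible with D_j
    """
    k = valid_at_commit
    if j == k:
        return True                 # first condition: contemporaneous version
    if j > k:
        return (k, j) in forwards   # second condition: D_j is a later version
    return (k, j) in backwards      # third condition: D_j is an earlier version
```

For example, a class committed against version 2 of D is consistent with version 3 only if the pair (2, 3) belongs to the forwards-compatibility relation.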

10.3 Filtering mechanisms

Most primitives for class evolution destroy compatibility between successive versions and require the development of filters to compensate for their effects. The operations that cause problems when invoked on a non-compatible object are fairly primitive, and can be classified in a limited number of categories. Adding or removing attributes generates access violations when an object attempts to invoke a deleted method, or to read from or write into a non-existent variable. Changing variable or parameter types causes exceptions when assigning or passing an illegal value to a variable or a method argument, when retrieving an unknown value from a variable, or when a method returns unexpected results. These effects are summarized in Table 6.

A simple way to deal with this problem is to replace each access primitive with a routine specifically programmed to perform the mapping between different class structures. Thus, for each variable that violates compatibility constraints, one provides a particular procedure for accessing it in reading mode, and another procedure for accessing it in writing mode. These procedures may perform various transformations, like mapping the variable to a set of other attributes [2]. For example, if the “birthday” attribute of a person class has been replaced with an “age” variable, one has to provide the following procedures to ensure backwards compatibility:

• A read accessor that determines the age of a person based on the time elapsed between the recorded birthday and the current date.

• A write accessor that stores the age of a person as a birthday, computed on the basis of the current date and the age given as argument to the accessor.
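A minimal Python sketch of these two backwards-compatibility accessors follows. The class `OldPerson` and the function names are hypothetical, and the write accessor necessarily guesses the day and month of birth, since an age alone does not determine them.

```python
from datetime import date


class OldPerson:
    """Old class version: stores a birthday and has no 'age' variable."""

    def __init__(self, birthday):
        self.birthday = birthday


def read_age(person, today):
    """Read accessor: derive the age from the recorded birthday."""
    years = today.year - person.birthday.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (person.birthday.month, person.birthday.day):
        years -= 1
    return years


def write_age(person, age, today):
    """Write accessor: store an age as a birthday computed from 'today'.

    Assumption: the birthday is taken to fall on today's month and day.
    """
    try:
        person.birthday = today.replace(year=today.year - age)
    except ValueError:
        # 29 February does not exist in the target year; fall back to 28 February.
        person.birthday = date(today.year - age, 2, 28)
```

New applications can then treat old instances as if they carried an `age` variable, with `read_age` and `write_age` standing in for direct access.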


9.4 Evaluation

Conversion, and in particular lazy conversion, seems like a very attractive technique for propagating changes in an object-oriented system. It requires the programming of transformation functions, even when the environment supports automatic conversion, but there appear to be no other alternatives for resolving intricate compatibility conflicts. When the conversion of instances is infeasible, scope restriction techniques borrowed from the filtering approach may prove helpful.

10. Filtering

10.1 Issues

Conversion enforces the consistency of instance representation by physically reorganizing the objects involved in a class modification. Whether it is immediate or deferred, this operation entails non-negligible processing costs and may sometimes be superfluous: under some circumstances, one may not need to convert instances, either because they have become obsolete due to class modification, or because they represent information that is not allowed to be modified for legal reasons, like accounting records. In these situations, it is preferable to ensure a partial compatibility between old and new object schemas, so that certain important applications may still use them, but without striving to make them perfectly interchangeable.

Filtering (or screening, as it is often called in the literature) is a general framework for dealing with this problem, most often used in combination with version management. It proceeds by wrapping a software layer around objects. The layer intercepts all messages sent to the enclosed object; these messages are then handled according to the object’s version, to make it conform to the current or to a previous class description, or to cause an exception to be raised when an application uses an object with an unsuitable definition. Three major issues must be examined with this approach:

• How does one characterize the degree of compatibility between different class versions?

• How can one map instances from a class version to another?

• How far can a filtering mechanism hide class changes from the users?
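The idea of a software layer that intercepts messages sent to the enclosed object can be sketched in Python with attribute interception. The `Filter` class, the `VersionMismatch` exception and the layout of the `adapters` table are hypothetical; the sketch covers attribute reads only and makes no claim about how any of the cited systems implement filtering.

```python
class VersionMismatch(Exception):
    """Raised when an object is used through an unsuitable class definition."""


class Filter:
    """Layer wrapped around an object; intercepts attribute reads."""

    def __init__(self, target, version, expected_version, adapters=None):
        self._target = target
        self._version = version            # version the object was created with
        self._expected = expected_version  # version the application expects
        # (object version, expected version, attribute) -> function(target)
        self._adapters = adapters or {}

    def __getattr__(self, name):
        # Python calls this only for names not found on the Filter itself.
        adapter = self._adapters.get((self._version, self._expected, name))
        if adapter is not None:
            return adapter(self._target)   # handled according to the object's version
        try:
            return getattr(self._target, name)
        except AttributeError:
            raise VersionMismatch(
                f"{name!r} is not available for version {self._version}"
            ) from None
```

Attributes shared by both versions pass straight through to the enclosed object, attributes known only to the expected version are served by an adapter, and anything else raises an exception.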

10.2 Version compatibility

Fundamentally, filtering is a mechanism for viewing entities of a certain class version as if they belonged to another version of the same class. From the predecessor-successor relationship between versions, we identify two types of compatibility [2]:

• A version $C_i$ is backwards compatible with an earlier version $C_j$ if all instances of $C_j$ can be used as if they belonged to $C_i$.

• A version $C_i$ is forwards compatible with a later version $C_j$ if all instances of $C_j$ can be used as if they belonged to $C_i$.


• Perform context-dependent changes. One may initialize variables based on previous information stored in the objects, or separate the instances from a class into two other categories based on the information they contain.

• Move information between classes, for example by shuffling variables among classes, without losing associated information.

• Introduce new objects for classes created while updating the hierarchy, and initialize their variables on the basis of information already stored in the database.

• Change local values to shared values.

Providing a framework to handle the most common transformations certainly eases the task of the programmer. It is however difficult to guarantee that such a pre-determined set of primitives effectively covers all possibilities for object conversion. When complex adaptations cannot be expressed with these operations, one is eventually forced to resort to special-purpose programs as in CLOS.

9.3 Immediate and delayed conversion

An important constraint with conversion concerns the time at which objects must be transformed.

Immediate conversion consists in transforming all objects at once, as soon as the corresponding class modifications are committed. This solution does not find much favour in practice, because it may entail the full unloading and reloading of the persistent object store and long service interruptions if a significant number of entities has to be converted. Furthermore, it raises major problems in distributed environments, where controlling the transformation of objects dispersed over several machines is far from straightforward. On the other hand, this technique provides ample opportunities for optimizing the storage and access paths to objects as part of the conversion process. Immediate conversion has been implemented in the GemStone object-oriented database system [43].

Lazy conversion consists in adapting instances on an individual basis, but only when they are accessed for the first time after a class modification. This method requires keeping track of the status of each object. When successive revisions are carried out on the same class, the system must record each associated conversion procedure, to be able to transform objects that are referenced after a long period of inactivity. Lazy conversion does not incur the drawbacks of system shutdown imposed by immediate conversion, at the price of degraded response time when instances are initially accessed after a class modification. This problem should be particularly apparent when a series of conversions must be sequentially applied to a dormant object, i.e. one that has not been used while its class was repeatedly being revised. Lazy conversion is nevertheless an appealing approach for applications with short-lived instances that are rapidly garbage-collected and therefore do not even need to be converted. Lazy conversion attracts a lot of interest and is already proposed as the standard mechanism for CLOS. In order to compare the respective merits of immediate and lazy conversion, the O2 system will eventually implement both techniques [50].
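The bookkeeping that lazy conversion requires can be illustrated with a short Python sketch. It is a toy model under stated assumptions: the class `LazyStore`, its method names, and the representation of an object as a dictionary tagged with a version number are all hypothetical and do not reflect the CLOS or O2 implementations.

```python
class LazyStore:
    """Toy object store that converts instances on first access after a revision."""

    def __init__(self):
        self.current_version = 0
        self.converters = []  # converters[i] maps a version-i state to version i + 1
        self.objects = {}     # object id -> (version, state)

    def add(self, oid, state):
        """Create an object according to the current class definition."""
        self.objects[oid] = (self.current_version, dict(state))

    def revise_class(self, converter):
        """Commit a class modification without touching any stored instance."""
        self.converters.append(converter)
        self.current_version += 1

    def fetch(self, oid):
        """Return the object's state, applying any pending conversions in sequence."""
        version, state = self.objects[oid]
        while version < self.current_version:
            state = self.converters[version](state)
            version += 1
        # Record the converted state so the work is done only once per object.
        self.objects[oid] = (version, state)
        return state
```

A dormant object that missed two revisions pays for both conversions on its first `fetch`, which is the degraded response time mentioned above; an object that is discarded before being fetched is never converted at all.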


• The list of names of attributes added to the class definition.

• The list of names of attributes discarded from the class.

• A list containing attribute names with their original value, for all attributes dropped from the class definition as well as those converted from local to shared.

• Optional initialization arguments.

This technique enables the programmer to take proper actions to correct and augment the default restructuring and reinitialization procedures provided by CLOS; it is thus possible to determine freely the mapping from an old to a new object schema.

The OTGen system provides a similar kind of functionality for transforming instances af-fected by a class modification, although this capability is presented to the user as a table-driveninterface rather than as a programming feature attached to the inheritance hierarchy [29]. A tablelists all class definitions whose instances have to be converted and suggests default transforma-tions to apply—which can of course be overridden or extended by the user. The transformationoperations possible with OTGen are as follows:

• Transfer objects belonging to the old class definition to the new database. Unchanged ob-jects are simply copied from a database to another without any conversions.

• Delete objects from the database if their class has been deleted.

• Recursively transform the variables of an object into new values, according to the trans-formation rules listed here.

• Initialize the variables of an object. When the previous and new types of a variable areincompatible, the default action taken by OTGen consists in assigning a special nil valueto it. The user can override the standard behaviour of the system by providing its owninitial value, or perhaps giving a formula to compute the new value (for example to con-vert a number to an equivalent string of characters).

                       New slot
Old slot     Shared         Local          None
Shared       preserved      preserved      discarded
Local        initialized    preserved      discarded
None         initialized    initialized    -

Table 5 Default conversions carried out by CLOS on objects after a class modification. A slot corresponds to a variable. Preserved slots have their values left untouched. Discarded slots are removed and their values are lost. Initialized slots are assigned a value determined by the class the instance belongs to.
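The rules of Table 5 can be restated as a small decision function. The following Python sketch is only an illustration of the table; the function name default_action and the string constants are hypothetical and are not part of CLOS.

```python
from typing import Optional

SHARED, LOCAL = "shared", "local"


def default_action(old: Optional[str], new: Optional[str]) -> str:
    """Return the Table 5 action for a slot; None means the slot is absent."""
    if new is None:
        # The slot no longer exists in the new class definition.
        return "discarded" if old is not None else "none"
    if old is None or (old == LOCAL and new == SHARED):
        # Newly introduced slots, and local slots that become shared.
        return "initialized"
    # Shared to shared, shared to local, and local to local.
    return "preserved"
```

For instance, default_action(LOCAL, SHARED) yields "initialized", which matches the second row of the table.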

52 Managing Class Evolution in Object Oriented Systems

• Because no arbitrary changes to the domain of variables and arguments are allowed, one can guarantee that the values stored within existing objects remain compatible with their new type.

Many of the reorganization algorithms described in the previous section do not destroy information present in the graph when the reorganization process is launched. Even when inheritance links and properties are rearranged, previous class definitions remain valid, so that we can in principle dispense with a conversion of their entities.

9. Conversion

9.1 Issues

Transforming all entities whose class has been modified seems like the most natural approach for dealing with change propagation, and it has in fact been adopted in several object-oriented systems. This technique implies that instances are physically updated so that their structure matches the description of the class they belong to. Two important requirements must be met:

• Because there is in general not a direct or a unique correspondence between old and new class definitions, care has to be taken to avoid losing information.

• The conversion process has to be organized in such a way that it interferes as little as possible with the normal system operations.

A consequence of the first requirement is that ad-hoc reconfiguration procedures have to be programmed to accompany automatic conversion processes, whose capabilities to preserve the semantics of an application domain are obviously limited. The second requirement forces all conversion procedures to behave as atomic transactions (i.e. transformations must be applied completely to the objects involved in the conversion) and puts strong restrictions on their duration.

9.2 Instance transformation

CLOS provides a good example of how automatic conversion can be enhanced by the programmer to take supplementary integrity constraints into account [27]. In CLOS, instances whose class definition has been modified are automatically transformed to conform to their new representation. These transformations are performed according to the rules listed in Table 5. CLOS deletes from the objects all attributes which have been dropped from their class, including their associated accessors, adds and initializes those attributes that have been introduced in the class definition, and adapts the attributes whose status has passed from shared to local (or vice versa; see footnote 1). This conversion is carried out by a standard function called update-instance-for-redefined-class, which is inherited by every class in a hierarchy and can be customized by the programmer, typically with an :after or a :before daemon. The arguments passed to this function, and available for further processing in the user-specified method, are the following:

1. A shared slot is a variable shared by all instances of the same class; it is equivalent to a class variable in Smalltalk. A local slot corresponds to a normal instance variable.


new class definition. Managing conformance relationships between successive class variants is the main problem to consider with the filtering approach.

There is not necessarily a one-to-one correspondence between these update propagation techniques and the class evolution methodologies we have described in the previous sections. Most methodologies try to reduce the costs associated with class evolution by drawing, when possible, on several approaches for adapting instances to class modifications.
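As an illustration of the filtering approach, the following Python sketch wraps an object stored under an old class definition and answers accesses expressed in terms of the new one. The class Filter and its renamed and defaults tables are hypothetical; they only stand for whatever conformance information a real system would maintain between class variants.

```python
from typing import Any, Dict


class Filter:
    """Present an object stored under an old schema through the new schema."""

    def __init__(self, stored: Dict[str, Any],
                 renamed: Dict[str, str],
                 defaults: Dict[str, Any]) -> None:
        self._stored = stored      # state laid out according to the old class
        self._renamed = renamed    # new attribute name -> old attribute name
        self._defaults = defaults  # values for attributes the old class lacks

    def get(self, name: str) -> Any:
        old_name = self._renamed.get(name, name)
        if old_name in self._stored:
            return self._stored[old_name]
        if name in self._defaults:
            return self._defaults[name]
        raise AttributeError(name)


old_object = {"lastname": "Casais"}
view = Filter(old_object, renamed={"surname": "lastname"}, defaults={"email": None})
```

The stored object is never modified: view.get("surname") reads the old lastname attribute, and view.get("email") falls back on the default supplied for an attribute that only exists in the new definition.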

8. Change Avoidance

Many class modifications do not actually require that instances be updated or enhanced with a compatibility preserving layer. Detecting when these situations arise is important, since one can then avoid the inconvenience of change propagation without giving up system consistency.

Class tailoring is the prime candidate for change avoidance: tailoring operations are carried out only for the purpose of defining additional subclasses; they do not at all impact previous superclasses. No matter how inherited properties are overridden, the modifications appear and take effect only at the level of the subclasses performing the redeclarations. New subclasses obviously have no associated instances, so there is no need to care about converting or filtering procedures. The language compiler or the interpreter enforces the changes and is able to bypass the characteristics acquired from superclasses efficiently, notably by using dynamic binding. Thus, object-oriented systems avoid updating instances at least when subclassing operations are considered.

Even when a class is directly updated, via class surgery, this does not always imply that instances created when an older schema was in effect must be transformed to comply with the new definition. Many evolution primitives exhibit no side-effects and can safely be applied without reorganizing running applications. Among the modification operations listed in section 3.3, the following bear no consequences on object structures:

• Adding a new class is an operation without any effects, since instances of the class do not exist. The restrictions imposed when inserting a class in the middle of an inheritance hierarchy (such as disallowing the introduction of additional variables) guarantee that subclass definitions, and hence their instances, are not affected by this operation.

• Renaming classes, methods and variables only affects the description of classes, not the structure of instances. This may not be true for programs that explicitly manipulate class or attribute names. Objects that pass method names as arguments to other methods, or that store and manage information on class or attribute names, for example by relying on functions like class-of (which, in CLOS, returns the class of an object, from which its name can be obtained), may become invalid after renaming operations.

• Changing the default value of a variable or a shared slot has no effect on instances, since these values pertain to the class definitions, not to the objects themselves.

• The implementation of a method can be changed freely: the code is associated and kept with a class definition, to be shared among all individual entities.


In fact, the rather ad-hoc nature of all class reorganization approaches discussed so far highlights three problems with object-oriented programming:

• The lack of a formal object model prevents the determination of interesting properties of a class hierarchy (for example redundancy-free inheritance relationships) and the application of transformations that preserve or improve these properties. Ideally, one would like to have the equivalent of the normal forms and the normalization algorithms available for the relational database model [35].

• We have always dealt with a very general description of classes. Obviously, finding a general way to reorganize any kind of class hierarchy is much more difficult than handling evolution in domain-specific object models, with more restricted structures and behaviours, and richer semantics. Applying reorganization to “scripts” [13], for example, may prove to be more fruitful than restructuring C++ code.

• Reorganization attacks the object modelling problem backwards, i.e. when imperfections are discovered in a class library. It could perhaps be more profitable to support the initial design phases with better tools and methodologies. A great part of our reorganization algorithms is spent trying to separate different modelling utilizations of inheritance; it would be more interesting to force software developers to respect some design guidelines and work with operations like specialization, subtyping, modularization, etc., instead of letting them resort to the general, powerful, but unwieldy and undisciplined inheritance mechanism.

CHANGE PROPAGATION

Modifications brought to class specifications must be propagated to objects instantiated on the basis of old definitions [41]. Many environments require aborting, recompiling and restarting whole applications when class definitions are modified; in this case, no special mechanisms for updating instances are needed or possible. On the other hand, change propagation is a crucial issue for systems that implement persistent objects (notably object-oriented databases) or that allow dynamic class redefinitions at run-time (like LISP-CLOS). Discarding existing instances is evidently not feasible, since they may be involved in running applications and may contain useful, and perhaps long-lived information. To maintain the overall consistency of the system, all entities must nevertheless conform to the representation determined by the class they belong to. We distinguish three approaches to achieve this result:

• The simplest way to deal with change propagation consists in making sure that class evolution does not affect existing instances in any way. As we show in the following section, change avoidance is in many cases not an altogether unrealistic assumption.

• Instances can be transformed to become conformant with their new class description. Conversion implies that the structure of old entities has to be mapped to a new schema; avoiding loss of information is a delicate issue with such a procedure.

• Rather than physically updating objects, one can wrap them with an interface that filters all accesses to them and takes appropriate actions to make them compatible with their


but rather to support the implementation of other classes. More importantly, the classes introduced during the decomposition, and also during the reorganization processes can serve as a rough estimate for the abstractions that are missing from the modelling of an application domain. Such defects are unavoidable; it is exceptional to achieve a stable, definitive class design without going through several iterations. New classes and inheritance links correspond to the places in the hierarchy warranting redesign.

Of course, the reorganization algorithms should be considered as an aid to design and maintenance, not as a compulsory procedure to apply blindly to a collection of software components. It seems evident that, because these algorithms perform strictly structural transformations on object descriptions, their results require user intervention to compensate for the lack of knowledge concerning the application domain and the concepts embodied in the class collection. It is up to the software developer to examine the outcome of the reorganization, to adjust it, and perhaps to embark on more comprehensive restructuring activities. When typical evolution and reorganization patterns emerge, they can be catalogued and help guide the design process [31].

However, it remains to be seen whether a reorganization algorithm like the one we propose is really effective as an automatic software engineering assistant, or if simpler and less ambitious tools provide a more adequate functionality for supporting class modelling.

7.6 Assessing the potential of class reorganization

Not all possibilities for reorganizing classes are taken into account by our approach. In particular, there is no facility for splitting methods, restructuring their code, or changing the client/server relationship between classes; for example, there is no automatic transformation of inheritance into delegation. We do not believe that these are shortcomings inherent to our methodology for the following reasons:

• Some of the reorganization issues are already addressed by other approaches. We have already described at some length the Law of Demeter and the dependency restructuring capabilities afforded by this methodology. There are also techniques for producing structured, goto-less programs from unstructured code [36], or for transforming abstract data structures [1][30]. These techniques are not intended for object-oriented environments, but they may find some use in this context.

• The results of our algorithm actually provide useful information for detecting opportunities for further reorganization. For example, modularization relationships hint at possibilities to use variables and delegation instead of inheritance, but transforming class definitions to adopt such a structure is extremely difficult, because of the sensitive issues associated with the binding of the self reference.

• Some of the problems facing software reorganization are simply not tractable. As an example, the attempts to extract the commonalities from the comparison of several procedures that constitute different instances of the same algorithm have met with very limited success [14]. The analysis of a table lookup module and of a root finding method to deduce a general binary search algorithm cannot be automated and requires considerable skill to be carried out manually.


        Var : B;
        --
        -- B is not a specialization or a subtype of A.
        --
end ;

results in the following structure, if A and B have no common superclasses:

class MN
    variables
        Var : GENERIC;
end ;

class M
    inherits MN;
    redefines Var;
    variables
        Var : A;
        --
        -- A is a binding of the generic type of Var.
        --
end ;

class N
    inherits MN;
    redefines Var;
    variables
        Var : B;
        --
        -- B is also a binding of the generic type of Var.
        --
end ;

MN is a purely generic class where the type of Var has to be bound in its descendents. If A and B have some superclasses in common, then it is possible to define a class called, say, AB-COMMON, that inherits from these common classes. Class MN then defines Var as being of type GENERIC[AB-COMMON]; this indicates a structure of constrained genericity, a technique available in some object-oriented languages, notably Eiffel.
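Constrained genericity has a rough counterpart in Python's typing module, where a type variable may be given an upper bound. The sketch below mirrors the MN, M and N classes of the example; the class ABCommon and its describe method are hypothetical stand-ins for AB-COMMON and for whatever properties A and B share.

```python
from typing import Generic, TypeVar


class ABCommon:
    """Hypothetical common ancestor of A and B (AB-COMMON in the text)."""

    def describe(self) -> str:
        return type(self).__name__


class A(ABCommon):
    pass


class B(ABCommon):
    pass


# The type variable plays the role of GENERIC[AB-COMMON]: any binding
# must be ABCommon or one of its descendents.
T = TypeVar("T", bound=ABCommon)


class MN(Generic[T]):
    def __init__(self, var: T) -> None:
        self.var: T = var


class M(MN[A]):  # binds the generic type of var to A
    pass


class N(MN[B]):  # binds the generic type of var to B
    pass
```

The bound is only checked by a static type checker, not at run time, so this is an approximation of the constraint rather than an enforcement of it.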

Concretization. The implementation of deferred methods is specified at this stage. In fact, this is done only for methods whose definition cannot be frozen sooner because their implementation depends on attributes that are refined in lower subclasses. The class library is reorganized so as to eliminate unwanted implementations from the current class definition, or to purge it of all rejected attributes, in a manner similar to the second example of section 7.3.

7.5 Evaluation

With our reorganization approach, an inheritance graph may have to be thoroughly modified in order to accommodate a new object definition. In particular, a number of intermediate classes are inserted between the original definitions and the new schema, attributes are shuffled among classes, inheritance links may be redirected, etc.

Additional classes frequently represent shared modules of functionality; they correspond to constructs, such as the “mixins” of LISP, whose main purpose is not to describe real-world entities,


        -- All methods are accessors for the variables.
        --
end ;

All manipulations of b, c and d (respectively e and f) in the original implementation of Do-Something in class M (respectively N) must be replaced with calls to the appropriate accessors exported by i.

Other simpler kinds of reorganization may be required for arguments that change categories, like input parameters that are redeclared as input-output arguments.

Subtyping. We consider now the redeclaration of argument and variable types. A new class is created for all attributes that are redefined according to a subtype relationship. Our notion of subtype is identical to the one that is generally proposed in the literature. Class A is a subtype of B if and only if:

• The interface of A defines at least the methods present in the interface of B;

• The return values of A’s methods are subtypes of those of B’s methods;

• The input arguments of B’s methods are subtypes of the input arguments of A’s methods;

• Input-output arguments for all methods common to A and B are identical.

Subtyping is an inheritance relationship with interesting properties: an instance from class A can replace an object from class B, since all messages sent to it will be understood and the results will correspond at least to what the client of the class is expecting. The behavioural compatibility between classes is assumed as long as the conformance at the level of signatures holds.
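The four conditions above can be checked mechanically on method signatures. The Python sketch below is one possible reading of them; it assumes that argument and result types are Python classes and approximates the subtype relation between those types with issubclass. The names Signature, Interface and conforms are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Signature:
    inputs: Tuple[type, ...]       # types of the input arguments
    returns: type                  # type of the return value
    inouts: Tuple[type, ...] = ()  # types of the input-output arguments


Interface = Dict[str, Signature]


def conforms(a: Interface, b: Interface) -> bool:
    """Return True if interface a satisfies the subtype conditions with respect to b."""
    for name, sig_b in b.items():
        sig_a = a.get(name)
        if sig_a is None:
            return False  # a must define every method present in b
        if not issubclass(sig_a.returns, sig_b.returns):
            return False  # results of a are subtypes of results of b
        if len(sig_a.inputs) != len(sig_b.inputs):
            return False
        if not all(issubclass(tb, ta)
                   for ta, tb in zip(sig_a.inputs, sig_b.inputs)):
            return False  # inputs of b are subtypes of inputs of a
        if sig_a.inouts != sig_b.inouts:
            return False  # input-output arguments must be identical
    return True


NoneType = type(None)
int_store: Interface = {"put": Signature((int,), NoneType),
                        "get": Signature((), object)}
any_store: Interface = {"put": Signature((object,), NoneType),
                        "get": Signature((), int),
                        "size": Signature((), int)}
```

Here any_store conforms to int_store, since it accepts more general inputs, returns more specific results and offers an additional method, whereas int_store does not conform to any_store.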

Specialization. Specialization is an inheritance relationship with much weaker properties than subtyping, but that conveys important semantic content. In the example of section 7.3, the specialization of STACK1, a class for managing stacks of objects of any kind, into STACK, a stack for managing real numbers, corresponds to our idea of specialization. Specialization is an important conceptual category among all utilizations of inheritance, and it should be distinguished as such.

Type redeclarations that are neither subtyping nor specializations entail a reorganization of the hierarchy. This reorganization creates intermediate nodes that are subsequently refined to give rise to the previously inherited attributes (whose definition had to be redeclared) on the one hand, and to the new class attributes on the other hand. If possible, new classes are added to the hierarchy to represent the common properties of the inherited and redeclared types. Thus,

class M
    variables
        Var : A;
end ;

class N
    inherits M;
    redefines Var;
    variables


class M
    methods
        Do-Something (a : X, b : Y, c : Z, d : W) ...
end ;

class N
    inherits M;
    redefines Do-Something;
    methods
        Do-Something (a : X, e : U, f : V) ...
end ;

Three new classes are introduced in the hierarchy. A first class, common to M and N, defines the Do-Something method and is inherited by both M and N:

class MN
    methods
        Do-Something (a : X, i : T) ...
end ;

class M
    inherits MN;
    redefines Do-Something;
    methods
        Do-Something (a : X, i : MT) ...
end ;

class N
    inherits MN;
    redefines Do-Something;
    methods
        Do-Something (a : X, i : NT) ...
end ;

Notice that now M and N specialize a generic parameter called i. The types NT and MT are defined as follows:

class MT
    exports
        Set-b, Get-b, Set-c, Get-c, Set-d, Get-d;
    variables
        b : Y;
        c : Z;
        d : W;
    methods
        --
        -- All methods are accessors for the variables.
        --
end ;

class NT
    exports
        Set-e, Get-e, Set-f, Get-f;
    variables
        e : U;
        f : V;
    methods
        --


    exports
        Read-Int, Write-Int, Read-Char, Write-Char, ..., Reset, Rewrite;
    ...
end ;

Signature simplification. When input arguments disappear from a method signature, this is an indication that the class requires less information to carry out the same service, perhaps because it assumes some default values for the now superfluous parameters. An additional subclass is created to make this protocol simplification explicit. If DEQUE had been defined as:

class DEQUE
    exports
        Put, Get;
    ...
    methods
        Put (item : OBJECT, location : 0..1)
        begin
            -- Put item at front or back of the deque according to “location”.
        end Put;
        Get (location : 0..1) returns OBJECT
        begin
            -- Retrieve item at front or back of the deque according to “location”.
        end Get;
end ;

then STACK would probably have been a simplification of a deque:

class STACK
    inherits
        DEQUE;
    exports
        Put, Get;
    redefines
        Put, Get;
    methods
        Put (item : OBJECT)
        begin
            super.Put (item, 0);
        end Put;
        Get returns OBJECT
        begin
            super.Get (0);
        end Get;
end ;
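The same simplification can be sketched in Python. The classes below are an illustrative transcription, not part of the original example; the constants FRONT and BACK and the use of collections.deque for storage are assumptions made for the sake of a runnable sketch.

```python
from collections import deque
from typing import Any

FRONT, BACK = 0, 1  # assumed encoding of the "location" argument


class Deque:
    def __init__(self) -> None:
        self._items: deque = deque()

    def put(self, item: Any, location: int) -> None:
        """Put item at the front or back of the deque according to location."""
        if location == FRONT:
            self._items.appendleft(item)
        else:
            self._items.append(item)

    def get(self, location: int) -> Any:
        """Retrieve the item at the front or back according to location."""
        if location == FRONT:
            return self._items.popleft()
        return self._items.pop()


class Stack(Deque):
    """Simplified protocol: the location argument is fixed to FRONT."""

    def put(self, item: Any) -> None:  # type: ignore[override]
        super().put(item, FRONT)

    def get(self) -> Any:  # type: ignore[override]
        return super().get(FRONT)
```

The narrowed signatures are deliberate here: a Stack can no longer be used wherever a Deque is expected, which is why the text treats signature simplification as a category of its own rather than as subtyping.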

Uncoupling. Suppressing input-output arguments from the signature of methods is an important abstraction step: it indicates that the new class encapsulates more state, and that it is no longer necessary to exchange information between this class and its clients. A new class is created where all method redefinitions corresponding to an uncoupling are introduced. These methods remain deferred; if there are other redeclarations to carry out on these methods (type of arguments, implementation), they are to be analysed and made explicit in another subclass.

At this stage, the hierarchy must be reorganized for all methods for which no simple redefinition of their argument list may be found. Let us take an example:


ed for both DEQUE and LARGE-DEQUE, and that the First and Last attributes present in the abstract class are specialized in its descendents.

7.4 Application

By now, the reader should have a good intuitive understanding of the principles behind our class restructuring approach. In this section, we highlight the essential decomposition and reorganization operations that can be applied to a class hierarchy upon the introduction of a new class definition. Our restructuring method proceeds in several distinct steps, considering only a limited range of class properties at a time. The redefinition of each property is often broken down into additional substeps, as explained below. We try to analyze class structures in a top-down fashion, from their most general to their most detailed characteristics. Thus, we first deal with classes as interrelated sets of named attributes; we then separate these attributes into public and private attributes. In the next step, we deal with the structure of method signatures, and then with the type of the variables and of the arguments. The last stage of the reorganization deals with method implementations.

Inheritance and renaming. This step only considers the inheritance links from the new class to existing ancestors in the hierarchy. Redefinitions of inherited properties (attributes or interfaces) may not be immediately applied, except for the renaming of attributes. STACK0 exemplifies this reorganization step.

Interface restriction. All inherited attributes that should not appear in the new class’s interface are excluded from it (as in the STACK1 class in the previous example), provided some of them remain visible. In this case, a supplementary subclass is created with the new interface definition; this subclass is a view of its ancestor. If all inherited attributes become private to the new class, no additional node is inserted. The hierarchy is reorganized so that all inherited attributes that belong to the interface are originally visible in the ancestors; the idea is that no inherited private attribute may be made public, although any inherited public attribute may become private (via a restriction operation). If necessary, the nodes higher up in the hierarchy are reorganized to comply with this constraint.

Class extension. Introduced attributes are inserted into the class definition and those that must be part of the class interface are made public. For example, if STACK defined two additional methods like Empty and Full to query the status of the stack, these would appear in the interface at this point. The implementation of introduced methods is deferred to a lower subclass. If only introduced attributes are part of the class interface, the relationship with its ancestor is called a modularization. This relationship captures typical inheritance patterns like the following:

class BYTE-FILE
    exports
        Read, Write, Seek, Open, Close;
    ...
end ;

class SEQUENTIAL-FILE
    inherits
        BYTE-FILE;


        First : NODE;
        Last : NODE;
    methods
        Put-front (item : OBJECT)
        begin
            -- Add the new item at the front of the deque.
        end Put-front;
        Put-back (item : OBJECT)
        begin
            -- Add the new item at the end of the deque.
        end Put-back;
        Get-front returns OBJECT
        begin
            -- Retrieve the item at the front of the deque.
        end Get-front;
        Get-back returns OBJECT
        begin
            -- Retrieve the item at the back of the deque.
        end Get-back;
end ;

class DEQUE
    --
    -- This class implements a fixed sized deque.
    --
    inherits
        DEQUE0;
    exports
        Put-front, Put-back, Get-front, Get-back;
    redefines
        First, Last, Put-front, Put-back, Get-front, Get-back;
    variables
        First : INTEGER;
        Last : INTEGER;
        Counter : INTEGER;
        Store : ARRAY[OBJECT];
    methods
        Put-front (item : OBJECT)
        begin
            -- Add the new item at the front of the deque.
        end Put-front;
        Put-back (item : OBJECT)
        begin
            -- Add the new item at the end of the deque.
        end Put-back;
        Get-front returns OBJECT
        begin
            -- Retrieve the item at the front of the deque.
        end Get-front;
        Get-back returns OBJECT
        begin
            -- Retrieve the item at the back of the deque.
        end Get-back;
end ;

DEQUE and LARGE-DEQUE are now concrete subclasses of the abstract DEQUE0 class. Notice that the signatures of the methods are identical in all classes, that the variable Counter is need-


        Get-front returns OBJECT
        begin
            -- Retrieve the item at the front of the deque.
        end Get-front;
        Get-back returns OBJECT
        begin
            -- Retrieve the item at the back of the deque.
        end Get-back;
end ;

LARGE-DEQUE provides services similar to those of DEQUE, except for their implementation. It is clear that there is an abstract class underlying the description of both DEQUE (which should perhaps be called FIXED-SIZE-DEQUE) and LARGE-DEQUE. The reorganization algorithm is able to infer from the redefinitions and rejections performed by the latter class that the subclassing operation cannot be decomposed into normal inheritance relationships, and that the original hierarchy must be fixed.

class DEQUE0
    --
    -- This class describes an abstract deque.
    --
    exports
        Put-front, Put-back, Get-front, Get-back;
    variables
        First : OBJECT;
        Last : OBJECT;
        Counter : INTEGER;
    methods
        Put-front (item : OBJECT)
        begin
            deferred ;
        end Put-front;
        Put-back (item : OBJECT)
        begin
            deferred ;
        end Put-back;
        Get-front returns OBJECT
        begin
            deferred ;
        end Get-front;
        Get-back returns OBJECT
        begin
            deferred ;
        end Get-back;
end ;

class LARGE-DEQUE
    --
    -- This class implements a variable sized deque.
    --
    inherits
        DEQUE0;
    exports
        Put-front, Put-back, Get-front, Get-back;
    redefines
        First, Last, Put-front, Put-back, Get-front, Get-back;
    variables


        begin
            super.Put-front (item);
        end Push;
        Pop returns REAL
        begin
            super.Get-front;
        end Pop;
end ;

Each one of the three abstraction levels corresponds to a consistent, typical utilization of inheritance:

• STACK0 identifies the origin of the inheritance path and the renaming of properties acquired from its ancestors. Renaming operations may hint at an inconsistent terminology for specifying classes in the library (contrary to the rules stated in section 5.2) and may justify additional adjustments to the class collection.

• STACK1 does not redefine any inherited attribute, but restricts the visible interface to operations Push and Pop, so that a deque is actually manipulated as a stack; STACK1 is a view of DEQUE. Notice that since STACK1 does not redeclare any inherited attribute, it actually corresponds to the general definition of a stack that can store any kind of object.

• The last class, STACK, is a specialization of the more general STACK1 definition: instances from STACK can only handle objects of type REAL.

As an example of subclassing that warrants a reorganization of the existing hierarchy, consider the following situation:

class LARGE-DEQUE
    inherits
        DEQUE;
    exports
        Put-front, Put-back, Get-front, Get-back;
    redefines
        First, Last, Put-front, Put-back, Get-front, Get-back;
    rejects
        Store; -- There is no longer any need for this variable;
               -- LARGE-DEQUE uses a linked list instead of an array.
    variables
        First : NODE;
        Last : NODE;
    methods
        --
        -- The code in every method, rather than dealing with a fixed width array,
        -- takes advantage of the possibility to have nodes of the deque allocated
        -- dynamically.
        --
        Put-front (item : OBJECT)
        begin
            -- Add the new item at the front of the deque.
        end Put-front;
        Put-back (item : OBJECT)
        begin
            -- Add the new item at the end of the deque.
        end Put-back;


Let us suppose that a programmer inherits from this class to build a description for stacks of real numbers:

class STACK
    inherits
        DEQUE;
    exports
        Push, Pop;
    redefines
        Put-front, Get-front, Store;
    renames
        Put-front as Push, Get-front as Pop;
    variables
        Store : ARRAY[REAL];
    methods
        Push (item : REAL)
        begin
            super.Put-front (item);
        end Push;
        Pop returns REAL
        begin
            super.Get-front;
        end Pop;
end ;

This subclass actually embodies several abstraction steps: a change in the terminology for the data structure, a change in the interface, and a specialization. To make these various utilizations of inheritance explicit, the decomposition procedure adds two intermediate classes arranged in the following hierarchy:

class STACK0
    inherits
        DEQUE;
    exports
        Push, Pop, Get-back, Put-back;
    renames
        Put-front as Push, Get-front as Pop;
end ;

class STACK1
    inherits
        STACK0;
    exports
        Push, Pop;
end ;

class STACK
    inherits
        STACK1;
    exports
        Push, Pop;
    redefines
        Push, Pop, Store;
    variables
        Store : ARRAY[REAL];
    methods
        Push (item : REAL)


The algorithm is built around three major components that operate on a class definition newly installed as a subclass in an existing hierarchy. This new class may contain any kind of attribute redefinition; it may also reject inherited properties and add its own set of characteristics to those acquired from its ancestors. The algorithm isolates the various abstraction steps that compose the set of redefinitions; for each of these steps, a new class is created, thus making explicit the succession of logical subclassing operations needed to arrive at the new class schema.

When the redefinitions are actually caused by a deficiency in the structure of inherited superclasses, the algorithm reorganizes the hierarchy to suppress these defects and ensure that the inheritance graph is only composed of legitimate subclassing relationships. This step is carried out by a separate procedure that traces unwanted properties (i.e. those that are redeclared because they can neither be inherited nor refined in their present form) up the hierarchy until reaching the nodes where they are introduced. For each of these classes, a node is created that contains properties common to both the superclasses of the initial hierarchy and to the newly introduced subclass. The inheritance links are then redirected to point from the subclass to these additional nodes rather than to the previous ancestors. These reorganization operations also result in the creation of supplementary classes.

A third and final pass is required to suppress redundant nodes from the graph. Basically, all new classes that do not introduce any additional properties relative to their ancestors, or which have only one descendent, are merged with other class definitions.
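One part of this final pass can be sketched as follows. The Python function below only handles the first criterion (new classes that introduce no property of their own) and interprets "merging" as splicing the node out of the graph, so that its descendents inherit directly from its ancestors. The representation of the hierarchy as two dictionaries and the name remove_empty_nodes are assumptions made for this illustration, not the algorithm's actual data structures.

```python
from typing import Dict, List, Set


def remove_empty_nodes(parents: Dict[str, List[str]],
                       props: Dict[str, Set[str]],
                       new_nodes: Set[str]) -> None:
    """Splice out, in place, every new node that introduces no property."""
    for node in sorted(new_nodes):
        if props[node]:
            continue  # the node contributes something; keep it
        for child, child_parents in parents.items():
            if node in child_parents:
                position = child_parents.index(node)
                # Link the child to the ancestors of the removed node,
                # skipping ancestors it already inherits from.
                inherited = [p for p in parents[node] if p not in child_parents]
                child_parents[position:position + 1] = inherited
        del parents[node]
        del props[node]


hierarchy = {"DEQUE": [], "TMP": ["DEQUE"], "STACK": ["TMP"]}
properties = {"DEQUE": {"Put-front"}, "TMP": set(), "STACK": {"Push"}}
remove_empty_nodes(hierarchy, properties, new_nodes={"TMP"})
```

After the call, STACK inherits directly from DEQUE and the empty intermediate class TMP (a hypothetical name) has disappeared from both dictionaries.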

As an example of the first restructuring step (called the decomposition step), consider the following definition:

class DEQUE
    exports
        Put-front, Put-back, Get-front, Get-back;
    variables
        First : INTEGER;
        Last : INTEGER;
        Counter : INTEGER;
        Store : ARRAY[OBJECT];
    methods
        Put-front (item : OBJECT)
        begin
            -- Add the new item at the front of the deque.
        end Put-front;

        Put-back (item : OBJECT)
        begin
            -- Add the new item at the end of the deque.
        end Put-back;

        Get-front returns OBJECT
        begin
            -- Retrieve the item at the front of the deque.
        end Get-front;

        Get-back returns OBJECT
        begin
            -- Retrieve the item at the back of the deque.
        end Get-back;
end;


this case, the class only defines the method signature (name and arguments) and leaves the code unspecified, to be determined in a subclass. Constructing so-called abstract or deferred classes is a technique widely used in object-oriented programming.
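As an illustration only, deferred methods can be sketched in Python with the standard `abc` module. The class names `Stack` and `ArrayStack` and their methods are hypothetical and are not part of the model described here; the sketch merely shows a class that fixes signatures and a subclass that supplies the code.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Deferred (abstract) class: it declares signatures but no code."""

    @abstractmethod
    def push(self, item):
        """Add item on top of the stack."""

    @abstractmethod
    def pop(self):
        """Remove and return the top item."""


class ArrayStack(Stack):
    """Subclass that determines the implementation left unspecified above."""

    def __init__(self):
        self._store = []

    def push(self, item):
        self._store.append(item)

    def pop(self):
        return self._store.pop()
```

In this sketch, instantiating `Stack` directly raises a `TypeError`, which mirrors the idea that a deferred class is only meant to be completed by its descendants.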

The attributes of a class are either inherited from other classes or introduced by the class itself. Because of multiple inheritance, naming conflicts may occur between attributes bearing identical identifiers but acquired through different subclassing paths. We suppose that some disambiguating mechanism allows one to distinguish correctly between attributes with similar names. It is reasonable to assume that attributes introduced in a class take precedence over those with the same name defined in the class’s ancestors. The structure deduced from inheritance links between classes is restricted to form a directed graph without circuits, so that no class can directly or indirectly inherit from itself.
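The restriction to a graph without circuits can be checked mechanically. The following Python sketch is one possible way of doing so, using a depth-first traversal; the function name `has_circuit` and the dictionary representation of inheritance links are assumptions made for this illustration.

```python
def has_circuit(parents):
    """Return True if some class directly or indirectly inherits from itself.

    `parents` maps each class name to the list of its direct superclasses.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, being visited, fully visited
    colour = {}

    def visit(cls):
        colour[cls] = GREY
        for sup in parents.get(cls, []):
            state = colour.get(sup, WHITE)
            if state == GREY:
                # We came back to a class that is still being explored.
                return True
            if state == WHITE and visit(sup):
                return True
        colour[cls] = BLACK
        return False

    return any(colour.get(c, WHITE) == WHITE and visit(c)
               for c in list(parents))
```

A diamond-shaped hierarchy obtained through multiple inheritance is accepted by this check, since it contains no circuit; only a chain of links leading back to its starting class is rejected.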

A class may redefine the attributes inherited from its superclasses. For each ancestor, and for each redefined attribute pertaining to this ancestor, there is a specification of how its characteristics are overridden in the subclass. For example, it is possible to redeclare the type of a method argument and leave the remaining properties of the attribute unchanged. As with new, introduced attributes, a redefinition supersedes the inherited characteristic in the class definition.

Our model also allows inherited attributes to be rejected when a new subclass is added to a hierarchy, in which case they should not appear at all in the subclass definition. Rejecting attributes is not really a subclassing operation, but rather a modification of the inheritance schema. No current object-oriented language allows one to reject or to selectively inherit attributes from a superclass; to achieve this result, the hierarchy must be reorganized, by separating rejected and accepted attributes into different classes. Our approach assumes that the need for such modifications becomes apparent when extending the class hierarchy; it is therefore interesting to integrate this evolution pattern with our reorganization algorithm.

The attributes defined by a class are either public, i.e. accessible from the outside of an object, or private, inaccessible from other objects. Variables are in principle private to an object, but, if needed, they can be manipulated via special accessor methods. The interface of a class determines which of its methods are made publicly visible. The interface is not inherited and may be freely overridden in subclasses.

We use the general term of property to designate any of the characteristics belonging to a class definition, that is, its attributes, the redefinition performed by the class, its inheritance links and the specification of its interface. A property may exhibit several facets; for example, a method may be considered from the point of view of its name, its signature, the type of its arguments, its implementation, etc.

7.3 Principle of the algorithm

In order to avoid lengthy developments, we present only a general outline of our algorithm. Some details of its anatomy can be fairly easily inferred from the examples given in this and the next section; more complete information on its reorganization and simplification components, described below, can be found in [10].


non-legitimate uses of inheritance, i.e. those that do not strictly correspond to one of the aforementioned categories.

The reorganization process may create intermediate nodes in the inheritance graph, shuffle attributes among them and rearrange inheritance paths. It is carried out by an algorithm that we describe informally in the next pages.

7.2 Model and terminology

Our approach is based on a very simple model that encompasses the major characteristics of most object-oriented databases and programming languages. We have deliberately avoided the inclusion of language-specific constructs closely matching those of actual systems in order to avoid compromising the generality of our methodology.

A class defines a set of attributes, which can be either variables or methods. Variables are repositories for information hidden inside objects. A variable has a name and a type, which corresponds to the name of a class. The idea is that only objects from the specified class (or compatible with that class) may be assigned to the variable. We assume that the typing mechanism is integrated with class relationships, although it may not necessarily be grounded on it¹. Primitive types like INTEGER or BOOLEAN have therefore an associated class.

Methods form the executable part of an object; they correspond to functions and procedures in traditional programming languages. A method is identified by its name; it may define a list of arguments that must be supplied to the method when it is invoked. Arguments are identified by their name and have an associated type that, as with variables, corresponds to a class in the system. Arguments may be classified into input parameters (which only serve to provide some information to the method), output parameters (which serve to return results to the invoker of the method) and input-output parameters (objects that are exchanged between the calling and the called method, and that can be modified freely by the latter). Since operations on objects are performed by sending messages, it is often not possible to determine whether an argument belongs to the input, output or input-output parameters in the absence of any explicit declaration in method headers; this would require an analysis of the side-effects of all messages that can be sent to an argument. Current object-oriented programming languages frequently make a distinction between the arguments supplied to a method and the result returned by it. In this case, it is natural to presume that arguments are all input-output objects, while the result is just an output value.

The code that a method executes upon invocation is called its implementation. A method may call other methods or refer to the variables defined in its class. With each implementation, we associate the list of methods and the list of variables it depends on for its proper execution. By default, the names appearing in these lists refer to the attributes defined in the class the method and its implementation belong to. If it is necessary to override scoping constraints to refer directly to a method or a variable defined in a superclass, we assume that the attribute identifier can be prefixed with the superclass name. The implementation of a method may be deferred; in

1. CLOS constitutes a good example of this possibility: any traditional LISP object, like a list or an integer, can be tested for membership to a class, thanks to a common typing scheme for fundamental LISP types and user-defined classes. However, the definition and manipulation of objects from both categories are carried out by totally distinct mechanisms.


From this point of view, the traditional separation between development and maintenance activities loses much of its relevance. The goal then is to deal with the imperfections of a class hierarchy. These imperfections may stem directly from the coding activities of a software developer who is producing new unstructured programs from a well-designed class library. Alternatively, the defects may be already present in the hierarchy, but are only revealed when attempts at class reuse fail or meet with considerable obstacles in spite of a careful modelling of the new application. A good reorganization methodology should detect the places in a class hierarchy warranting redesign and propose better ways to structure dubious class definitions.

We propose a new approach for automating precisely these tasks. We emphasize inheritance and the structuring of classes and class attributes as the essential abstraction mechanisms that drive the modelling of applications and the constitution of class hierarchies. This is in line with the general idea of object-oriented programming, which assumes that a great part of software development rests on the reuse capabilities afforded by inheritance. In contrast, the Law of Demeter focuses on the calling and dependency patterns between methods.

Our fundamental hypothesis is that design flaws can be uncovered at the time an object description is added to a hierarchy. The new class may refine, override or eliminate altogether the properties inherited from its ancestors, which were found in the existing library and supposedly encapsulate some interesting functionality in a reusable form. The redefinitions may correspond to:

• Inadequate application of object-oriented mechanisms for deriving the new subclass.

• Defective structures in the library of reusable components, which hinder the incorporation of properties inherited from superclasses into new applications.

As Johnson and Foote note [25], a frequent problem lies in the fact that programmers often overlook intermediate abstractions needed for establishing clean inheritance dependencies, and develop components too specialised to be effectively reusable. The consequences of early class specialization are often manifest when modules are created, and become acute when other people try to reuse these same modules. By extracting as much information as possible from the inheritance and redefinition patterns of classes, one should be able to improve libraries of software components and gain insight into the modelling of class hierarchies.

The analysis of the tailoring operations proceeds in several stages:

• All redefinitions of inherited properties are decomposed into more primitive operations, which apply only to a limited class characteristic. For example, the redefinition of a method signature can be broken down into the renaming of the arguments and into the redeclaration of their type.

• Typical redefinition patterns are assigned to individual categories. Such categories represent typical ways for deriving new classes from existing ones; as such, they correspond to various semantics of the general inheritance mechanism.

• According to each of these categories, the new class definition is adjusted to make all abstraction steps explicit. The hierarchy is also reorganized so as to eliminate the need for


effective for untyped languages: since objects manipulated by the language expressions are not declared to belong to a certain class, it is not possible to determine the conformity of programs to the law by a static examination of their source code. Violations of the law can only be monitored and detected during program execution. This is why in practice the “class form” of the law is more tractable as a framework for checking expressions at compile-time, while the “object form” appears more intuitive for understanding the intent of the law and for correcting the shortcomings of the “class form”.

As far as typed languages are concerned, applying the Demeter principles is not always straightforward either. First, there are some pathological cases where the spirit of the law is violated, although all dependencies and message-passing patterns formally respect all Demeter rules stated in section 6.2. Fortunately, such anomalies are rare and occur only in very contrived situations. More importantly, the law requires significant enhancements and reformulation to handle language peculiarities correctly. Smalltalk allows classes and metaclasses to be targets for message-passing instructions, so these entities could presumably be considered to be preferred-suppliers of all methods referring to them. Eiffel implements a form of genericity where classes can bind type arguments inherited from their ancestors. Thus, for the generic class LIST[T], LIST[BOOK] is a descendent that binds the generic argument T to the class BOOK. It would be perfectly legitimate to view BOOK as a preferred-acquaintance or even a preferred-supplier of LIST[BOOK] methods. In the case of C++, the law of Demeter has to deal with mechanisms like friend functions or type casts. It must also take into account the fact that the objects and the constructs of the language do not exhibit equivalent properties. Basic data types like integers and characters, or structures and arrays, do not behave like user-defined classes and may not be manipulated with the same primitives. For example, arrays are not explicitly created by invoking a constructor operation. Although appropriate adaptations have been proposed for C++, translating the law of Demeter into equivalent terms for a specific object-oriented language is not always a trivial task.

It must be noted that some of the issues considered by the law are already handled by the mechanisms of existing programming languages. In CLOS, generic reader and writer functions for accessing variables can be automatically generated when a new class is declared. With C++, the methods and variables of a class are separated into three categories with different scoping constraints. Public attributes are visible to all clients of a class; protected attributes are only accessible to its subclasses; private attributes are not visible outside a class definition. These constructs greatly help in enforcing the encapsulation rules advocated by the Demeter approach, but they are insufficient for coping with the other dependency problems or disciplining the use of object-oriented mechanisms; this is precisely why programming style guidelines and design methodologies like the law of Demeter are useful.

7. Advanced Class Reorganization Techniques

7.1 Issues

Several authors point out that improper class modelling is a common phenomenon and consider the recasting of class hierarchies as an unavoidable aspect of the design process [17][23][38].


            return (self.Mn ());
        end New-Mn;
end;

These classes, except for C0 and Cn, do not yet conform to the law of Demeter. For each of them, we must determine whether the return value of the Mi method is an instance of a preferred-supplier class of Ci. We must consider two cases:

• Mi returns a variable pertaining to Ci. Since x_{i+1} identifies a component of Ci, the method New-Mi conforms naturally to the law of Demeter.

• Mi does not return a variable belonging to Ci. New-Mi must then be recoded so that X_{i+1} becomes a preferred-supplier object of Ci. This is achieved by introducing an auxiliary method, as was done in the example of the previous section.

class Ci
    methods
        New-Mi returns Cn
        begin
            return (self.Suppl-Mi (self.Mi ()));
        end New-Mi;

        Suppl-Mi (X_{i+1} : C_{i+1}) returns Cn
        begin
            return (X_{i+1}.New-M_{i+1} ());
        end Suppl-Mi;
end;

This description omits a number of elements that must be taken into account in more realistic situations. In particular, arguments passed with a message must also be checked for conformance to the law, since they might be the result of evaluating nested expressions involving non-preferred-supplier objects.
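To make the forwarding scheme more concrete, we give a small Python sketch for a chain of length n = 3. It is only an illustration under simplifying assumptions: the Python names (`m0_nested`, `new_m1`, `new_m2`, `_part`) are invented for this example, each method is assumed to return a variable of its own class, and the last forwarding step simply returns the final object.

```python
class C3:
    """End of the chain; plays the role of Cn."""


class C2:
    def __init__(self):
        self._part = C3()

    def m2(self):
        return self._part

    def new_m2(self):
        # Last forwarding step: M2 already yields the final object.
        return self.m2()


class C1:
    def __init__(self):
        self._part = C2()

    def m1(self):
        return self._part

    def new_m1(self):
        x2 = self.m1()          # x2 is a component of C1
        return x2.new_m2()      # the rest of the chain is delegated to C2


class C0:
    def __init__(self):
        self._x1 = C1()

    def m0_nested(self):
        # Original form: messages are sent to objects returned by methods.
        return self._x1.m1().m2()

    def m0(self):
        # Reorganized form: one message to a variable of C0.
        return self._x1.new_m1()
```

Both `m0_nested` and `m0` return the same object; the difference is that `m0` only sends a message to a variable of its own class, while each class in the chain takes responsibility for one step.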

The other reorganization techniques available in the Demeter framework, “pushing” and “lifting”, produce interesting class structures that are generally more intuitive than the systematic replacement of dubious dependencies with forwarding calls to auxiliary methods. Unfortunately, these techniques cannot be easily automated.

6.5 Evaluation

In spite of the fact that it can be only partly automated, the Law of Demeter provides a sound and useful conceptual framework for disciplining object-oriented programming and improving class design. Its application has proved beneficial for guiding the design of modular object-oriented libraries. But, as Sakkinen points out, putting the law into practice raises several difficulties [45].

The Demeter rules cannot be completely enforced at compile-time with languages that allow message-passing expressions to be generated dynamically (for example, with LISP macros) or to be given as arguments (i.e. the method to invoke on an object is not known a priori), or that allow the introduction or redefinition of methods at run-time. Languages like LISP-CLOS, Smalltalk and Objective-C fall, to varying extents, into this category. Object aliasing (i.e. referring to the same entity through different names) also complicates the checking of expressions for conformance to the law. In general, the “class form” law of Demeter does not seem to be fully


        begin
            x2 : C2;
            x3 : C3;
            ...
            xn : Cn;
            x2 := self.M1 ();
            x3 := x2.M2 ();
            ...
            xn := x_{n-1}.M_{n-1} ();
            return (xn);
        end New-M1;
end;

By an analogous transformation, we eliminate all method invocations on objects x3 to xn from C1’s definition:

class C1
    methods
        New-M1 returns Cn
        begin
            x2 : C2;
            x2 := self.M1 ();
            return (x2.New-M2 ());
        end New-M1;
end;

class C2
    methods
        New-M2 returns Cn
        begin
            x3 : C3;
            ...
            xn : Cn;
            x3 := self.M2 ();
            x4 := x3.M3 ();
            ...
            xn := x_{n-1}.M_{n-1} ();
            return (xn);
        end New-M2;
end;

We repeat this process until the classes Ci, for i = 1..n-1, are of the form:

class Ci
    methods
        New-Mi returns Cn
        begin
            x_{i+1} : C_{i+1};
            x_{i+1} := self.Mi ();
            return (x_{i+1}.New-M_{i+1} ());
        end New-Mi;
end;

and class Cn is:

class Cn
    methods
        New-Mn returns Cn
        begin


        ...
    methods
        Get-Title returns STRING
        begin
            return (Title);
        end Get-Title;
        ...
end;

The Demeter approach provides still another reorganization technique, the “lifting” method, but it can be applied only in rarer circumstances. The reader should refer to [33] for more details about this technique.

6.4 Mechanical reorganization

In [32], the authors sketch an algorithm for mechanically transforming programs that do not conform to the law into an equivalent legal form. A first trivial adaptation consists in replacing all direct references to inherited variables and to subcomponents of a variable by calls to specialized accessors in charge of retrieving or setting their value. All other violations are assumed to occur within nested expressions, when objects returned by a method become the target for further messages, as in the class definition below:

class C0
    methods
        M0 returns Cn
        begin
            x2 : C2;
            x3 : C3;
            ...
            xn : Cn;
            --
            -- X1 of class C1 is an argument of M0 or a variable of C0.
            --
            x2 := X1.M1 ();
            x3 := x2.M2 ();
            ...
            xn := x_{n-1}.M_{n-1} ();
            return (xn);
        end M0;
end;

In a first step, the responsibility for carrying out the M1 method is delegated to class C1. Thus, C0 now satisfies the law, since it no longer invokes a method on the object returned by M1; remember that this object is not a preferred supplier of M0.

class C0
    methods
        M0 returns Cn
        begin
            return (X1.New-M1 ());
        end M0;
end;

class C1
    methods
        New-M1 returns Cn


            return (microfiche-refs.Merge (cd-rom-refs));
        end Merge-refs;
end;

The definition of CATALOG now fully complies with the law. Search-book no longer manipulates objects of type LIST[BOOK], but instead passes them to Merge-refs for processing. Because the latter method declares microfiche-refs and cd-rom-refs as arguments, it may invoke methods on them freely; LIST[BOOK] is effectively a preferred-supplier class of Merge-refs. The introduction of such a method may appear artificial, but it makes explicit the dependency between the CATALOG and LIST[BOOK] classes. The programmer is informed about these dependencies at the level of method signatures, while in the previous situation he had to delve into the code of Search-book to detect them.
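The same arrangement can be transcribed into Python as a rough sketch. The class `Source` is a hypothetical stand-in for MICROFICHE and CD-ROM, books are reduced to plain title strings, and Python lists take the place of LIST[BOOK]; none of these simplifications comes from the original example.

```python
class Source:
    """Hypothetical stand-in for MICROFICHE or CD-ROM."""

    def __init__(self, titles):
        self._titles = list(titles)

    def search_book(self, title):
        return [t for t in self._titles if t == title]


class Catalog:
    def __init__(self, microfiches, optical_disk):
        self._microfiches = microfiches
        self._optical_disk = optical_disk

    def search_book(self, title):
        # Messages are sent only to self and to variables of Catalog;
        # the returned lists are passed on without being manipulated.
        return self._merge_refs(self._microfiches.search_book(title),
                                self._optical_disk.search_book(title))

    def _merge_refs(self, microfiche_refs, cd_rom_refs):
        # Both lists are arguments here, so this method may use them freely.
        merged = list(microfiche_refs)
        merged.extend(cd_rom_refs)
        return merged
```

For example, a catalog built from two sources that both hold a reference titled "CLOS" returns both references when searched for that title. As in the pseudo-code, the dependency of the catalog on lists of books is now visible in the signature of the auxiliary method.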

The last transformation concerns the implementation of Search-book in CD-ROM. Ensuring the proper encapsulation of BOOK objects requires that their variables which are publicly visible be manipulated through special-purpose procedures like the slot accessors of CLOS. It is easy to derive the appropriate adjustments to the definitions of CD-ROM and BOOK that enforce this rule. According to the strictest interpretation of the law, BOOK objects cannot be considered preferred-suppliers of CD-ROM. A technique similar to the one described above for CATALOG suppresses this inconsistency, although it produces a somewhat cumbersome program structure. Conversely, it is perfectly legal to let Search-book manipulate the object books-found, of type LIST[BOOK], since this object is actually created by the method itself.

class CD-ROM
    variables
        Book-References : FILE[BOOK];
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            books-found : LIST[BOOK];
            books-found.New ();
            Book-References.First ();
            loop
                exit when Book-References.End ();
                if title.Equal (self.RefTitle (Book-References.Current ())) then
                    books-found.Add (Book-References.Current ());
                end-if;
                Book-References.Next ();
            end loop;
            return (books-found);
        end Search-book;

        RefTitle (reference : BOOK) returns STRING
        begin
            return (reference.Get-Title ());
        end RefTitle;
        ...
end;

class BOOK
    variables
        Title : STRING;
        SubTitle : STRING;
        Authors : LIST[STRING];


    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            return (Catalog.Search-book (title));
        end Search-book;
        ...
end;
--
-- The definition of LIBRARY is now correct.
--

class CATALOG
    variables
        Microfiches : MICROFICHE;
        Optical-Disk : CD-ROM;
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            books-found : LIST[BOOK];
            books-found := Microfiches.Search-book (title);
            books-found.Merge (Optical-Disk.Search-book (title));
            return (books-found);
        end Search-book;
        ...
end;
--
-- The definitions of CD-ROM, MICROFICHE and BOOK remain unchanged.
--

This organization is still not adequate: the CATALOG class manipulates objects, returned by the Search-book methods of Microfiches and Optical-Disk, that do not belong to it. Moreover, the type of these objects does not correspond to a preferred-supplier class of CATALOG. In this case, it is rather difficult to delegate further the processing of searches for references to the components of CATALOG: merging the results of several sub-queries should remain the responsibility of the catalog object. On the other hand, this cannot be done without accessing LIST[BOOK] entities, which is in principle incorrect.

The Law of Demeter proposes to deal with such situations by making use of an additional method:

class CATALOG
    variables
        Microfiches : MICROFICHE;
        Optical-Disk : CD-ROM;
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            return (self.Merge-refs (Microfiches.Search-book (title),
                                     Optical-Disk.Search-book (title)));
        end Search-book;
        ...
        Merge-refs (microfiche-refs : LIST[BOOK]; cd-rom-refs : LIST[BOOK])
            returns LIST[BOOK]
        begin


            Book-References.First ();
            loop
                exit when Book-References.End ();
                book := Book-References.Current ();
                if title.Equal (book.Title) then
                    books-found.Add (book);
                end-if;
                Book-References.Next ();
            end loop;
            return (books-found);
        end Search-book;
        ...
end;

class MICROFICHE
    variables
        Book-References : FICHES[BOOK];
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            ...
        end Search-book;
        ...
end;

class BOOK
    variables
        Title : STRING;
        SubTitle : STRING;
        Authors : LIST[STRING];
        ...
    methods
        ...
end;

The definition of class LIBRARY obviously does not conform to the law: method Search-book accesses internal components of Catalog (the variables Microfiches and Optical-Disk); it sends messages to these variables and receives as result objects that are not subparts of LIBRARY instances and whose type, LIST[BOOK], is not a preferred-supplier class of LIBRARY. We also note that the algorithm for retrieving all references stored on the optical disk violates elementary encapsulation constraints: it manipulates the internal structure of books to find whether their title matches a specific search criterion.

It is clear that the details of scanning microfiche and CD-ROM files to find a particular reference should be delegated to the CATALOG class. This would make the querying methods of LIBRARY immune to alterations in the internal structure of the catalog, for example the replacement of the microfiches with an additional CD-ROM file. A first transformation (called “pushing” by Lieberherr et al.) allows us to get rid of this problem.

class LIBRARY
    variables
        Catalog : CATALOG;
        Loans : LOAN;
        ...


sponding “object form” forbids accessing variables that are not introduced by the class where the method making the reference is itself introduced.

In all versions of the law, the basic goal remains the same: a method should only access attributes which are closely related to it or to the class it belongs to. Typically, a method should not traverse the structure of an object to manipulate its internal variables. It should not access variables pertaining to its class definition when they are actually inherited from a superclass. It should not directly send messages to foreign objects created by another method. The elimination of these dependencies promotes information hiding and restricts the visible properties of classes to narrow interfaces. As a consequence, the degree of coupling between classes and the size of the methods diminish, thus facilitating the application of program validation techniques, increasing the chances for reuse, and simplifying software maintenance.

6.3 Application and examples

We illustrate the main reorganization aspects dealt with by the Demeter approach for a group of simple object descriptions. The example is an adaptation of the case discussed in [32].

Let us assume the following definitions:

class LIBRARY
    variables
        Catalog : CATALOG;
        Loans : LOAN;
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            books-found : LIST[BOOK];
            books-found := Catalog.Microfiches.Search-book (title);
            books-found.Merge (Catalog.Optical-Disk.Search-book (title));
            return (books-found);
        end Search-book;
        ...
end;

class CATALOG
    variables
        Microfiches : MICROFICHE;
        Optical-Disk : CD-ROM;
        ...
    methods
        ...
end;

class CD-ROM
    variables
        Book-References : FILE[BOOK];
        ...
    methods
        Search-book (title : STRING) returns LIST[BOOK]
        begin
            book : BOOK;
            books-found : LIST[BOOK];
            books-found.New ();


6. Restructuring Inter-Attribute Dependencies

6.1 Issues

The list of reorganization rules presented in the previous section clearly shows that avoiding unnecessary coupling between object definitions and reducing inter-attribute dependencies are two important prerequisites for well-designed classes. This is justified from a software engineering point of view, since tightly encapsulated classes are easier to reuse and to maintain. Two major issues have to be addressed:

• What are the inferior or “harmful” dependencies?

• How can unsafe expressions be automatically replaced with adequate constructs?

This problem has been studied by Lieberherr et al. [32][33]; the results of their investigations have been condensed in a short design principle known as the Law of Demeter, and in a small set of techniques for transforming object definitions so that they comply with this law.

6.2 The Law of Demeter

The Law of Demeter distinguishes three types of inter-attribute dependencies, and three corresponding categories of relationships between class definitions:

• A class C1 is an acquaintance class of method M in class C2, if M invokes a method defined in C1 and C1 does not correspond to the class of an argument of M, to the class of a variable of C2, to C2 itself, or to a superclass of the aforementioned classes.

• A class C1 is a preferred-acquaintance class of method M in C2, if C1 corresponds to the class of an object directly created in M or to the class of a global variable used in M.

• A class C1 is a preferred-supplier class of method M in C2, if M invokes a method defined in C1, and C1 corresponds to the class of a variable of C2, or to the class of an argument of M, to C2 itself, to a superclass of the aforementioned classes, or to a preferred-acquaintance class of M.

These definitions constitute the basis for what is called the “class form” of the law, which states that methods may only access entities belonging to their preferred-supplier classes. An alternative, more flexible formulation of the law simply aims at minimizing the number of acquaintance classes over all methods.

The so-called “object form” of the law does not consider the classes a method depends on, but rather the objects this method sends messages to. In this context, a preferred-supplier object is an instance that is either a variable introduced by the class where the method is defined, an argument passed to the method, an object created by the method, a global variable, or the pseudo-variable self (identifying the object executing the method). The “object form” of the law prohibits references to instances that are not preferred-suppliers of a method.
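The “object form” lends itself to a simple illustration. In the Python sketch below, every message send of a method is recorded together with the way the method obtained the receiver, and sends to receivers that are not preferred-suppliers are reported. The record type `Send`, the origin labels and the function `violations` are all assumptions introduced for this example; they are not taken from the Demeter system.

```python
from dataclasses import dataclass

# Origins that make a receiver a preferred-supplier object of a method.
PREFERRED = {"self", "own_variable", "argument", "created", "global"}


@dataclass(frozen=True)
class Send:
    receiver: str   # name of the object receiving the message
    origin: str     # how the sending method obtained the receiver
    message: str    # name of the invoked method


def violations(sends):
    """Return the message sends whose receiver is not a preferred supplier."""
    return [s for s in sends if s.origin not in PREFERRED]
```

A method that sends a message to a subpart of one of its variables, or to an object returned by another method, yields a non-empty list of violations; a method that only addresses its own variables, its arguments, the objects it creates, global variables or self yields an empty list.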

In its weak version, the law considers the classes of inherited variables (or the variables themselves, in the “object form” of the law) as legitimate preferred-suppliers. The strict version does not consider the classes of inherited variables as legitimate preferred-suppliers; the corre-


• Minimize the accesses to variables so as to reduce the dependency of methods on the internal class representation. This can be achieved for example by resorting to special accessors instead of referring directly to variables.

• Ensure that inheritance links match specialization relationships. If necessary, reverse class dependencies to make the hierarchy conform to this constraint. For example, ELLIPSE should be a superclass of CIRCLE, even if it is possible (and perhaps advantageous for implementation purposes) to install it as a descendent of CIRCLE in the hierarchy [23].

We end with the rules used to modularize the functionality of object definitions.

• Split large classes into smaller classes.

• Factor out implementation differences into subclasses, or via multiple inheritance, into “mixins”; if possible, delegate functionality to other objects.

• Separate groups of methods that do not communicate. Such sets of methods represent either totally independent behaviour or different views of the same object, which are perhaps better represented by distinct classes.

• Decouple methods from global attributes or internal class properties by sending messages to parameters instead of to self or to instance variables.

These design principles are obviously intertwined; for example, the methods derived from the decomposition of a large procedure most probably require fewer arguments than the original routine. Similarly, suppressing test cases diminishes the size of methods. Factoring out commonalities in superclasses results in smaller subclass descriptions, with fewer redundant declarations.

5.3 Evaluation

All these guidelines are of course very general. They provide no accurate criteria to drive the restructuring process: the software maintainer is left on his own to detect the situations warranting a reorganization of the hierarchy and the best way to accomplish it. An important requirement is that the reorganization of a class should affect its clients as little as possible; this imposes additional constraints on the kind of modifications that may be safely applied to object definitions and to the inheritance graph. Some authors propose using a shortened description of a class (excluding details about method implementation) as a starting point for generalization [38]. Another technique consists in associating pre- and post-conditions with each method and then analysing these invariants to discover abstraction opportunities.

Fortunately, some of the rules appearing in this informal reorganization framework are amenable to a rigorous formulation and a subsequent automation. Software developers can thus take advantage of both the informal but extensive framework, which gives them considerable freedom for choosing class evolution paths, and its detailed formalization, with its facilities for in-depth analysis of certain choices.


• Factor out behaviour common to several classes into a shared superclass. Introduce abstract classes (with deferred methods) if convenient to avoid attribute redefinitions.

Handling class differences by explicit type testing:

    class A
    variables
        V : OBJECT;
    methods
        M
        begin
            ...
            if V.Class () = B then
                V.Method-X ();    -- do something;
            elseif V.Class () = C then
                V.Method-Y ();    -- do something completely different;
            end if
            ...
        end M;
    end ;

Delegating class-specific code:

    class A
    variables
        V : OBJECT;
    methods
        M
        begin
            ...
            V.TestClass ();
            ...
        end M;
    end ;

    class B
    methods
        TestClass
        begin
            self.Method-X ();    -- do something;
        end TestClass;
    end ;

    class C
    methods
        TestClass
        begin
            self.Method-Y ();    -- do something completely different;
        end TestClass;
    end ;

Figure 1 Reorganizing classes to avoid explicit type testing.
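As an illustration only, the reorganization of Figure 1 can be sketched in Python. The class names A, B and C follow the figure; `test_class`, `method_x` and `method_y` are Python renderings of TestClass, Method-X and Method-Y, and the returned strings are placeholders assumed for this example.

```python
class B:
    def method_x(self):
        return "B: do something"

    def test_class(self):
        # B supplies its own class-specific behaviour.
        return self.method_x()


class C:
    def method_y(self):
        return "C: do something completely different"

    def test_class(self):
        # C supplies a different behaviour under the same message name.
        return self.method_y()


class A:
    def __init__(self, v):
        self.v = v

    def m_with_type_test(self):
        # Before: A inspects the class of V explicitly.
        if isinstance(self.v, B):
            return self.v.method_x()
        elif isinstance(self.v, C):
            return self.v.method_y()
        raise TypeError("unsupported class")

    def m(self):
        # After: A sends one standard message and lets V decide.
        return self.v.test_class()
```

With the second form, supporting a further class only requires that class to answer `test_class`; the code of A stays unchanged.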


ular, narrow application. Inheritance relationships consistently based on specialization facilitate the analysis of the hierarchy and the modelling in terms of existing classes, as opposed to the situation where inheritance is used for code sharing or implementation purposes.

• The modularization of functionality. Small, cohesive classes are more resilient to change, while large classes grouping loosely related functions are difficult to adapt and utilize in other programs.

Several practical rules corresponding to each of these categories have been proposed [25]. As far as interface standardization is concerned, they are as follows:

• Adopt a uniform terminology for classes belonging to a similar group of concepts.

• Standardize the interface of classes that communicate with each other.

• Decrease the number of arguments of a method, either by splitting the method into several simpler procedures, or by creating a class to represent a group of arguments that often appear together. A method with a reduced number of parameters is more likely to bear a signature similar to some other method in a different class; both methods may then be given the same name, thus increasing interface standardization.

• Eliminate code that explicitly checks the type of an object. Rather than introducing large case statements to execute some actions on the basis of an object’s class, one should invoke a standard message in the object and let it carry out the appropriate actions, as in the example in Figure 1.

The next set of rules is used to find more general classes and introduce higher-level abstractions into the hierarchy.

Interface        STACK     ARRAY     QUEUE            H-TABLE

Old version      push      enter     add              insert
                 pop       -         remove_oldest    delete
                 top       entry     oldest           value

New version      put       put       put              put
                 remove    -         remove           remove
                 item      item      item             item

Table 4 Class interfaces for some basic data structures of the Eiffel library. The conventions adopted in the second version of the language have resulted in greater naming consistency across object definitions.
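The benefit of the uniform names in Table 4 can be sketched in Python. This is not Eiffel library code: the two classes below are minimal stand-ins written for this example, and the helper `drain` is a hypothetical client that relies only on the shared `put`, `remove` and `item` messages.

```python
from collections import deque


class Stack:
    def __init__(self):
        self._items = []

    def put(self, x):
        self._items.append(x)

    def remove(self):
        self._items.pop()

    def item(self):
        return self._items[-1]   # most recently added element


class Queue:
    def __init__(self):
        self._items = deque()

    def put(self, x):
        self._items.append(x)

    def remove(self):
        self._items.popleft()

    def item(self):
        return self._items[0]    # oldest element


def drain(container):
    """Empty any container that answers put/remove/item, returning its elements."""
    out = []
    while True:
        try:
            out.append(container.item())
        except IndexError:       # raised by both stand-ins when empty
            return out
        container.remove()
```

Because both classes answer the same messages, `drain` works unchanged on either; only the order of the returned elements differs.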


Tools that automatically restructure a class collection and suggest alternative designs can reduce the efforts required to improve a class hierarchy. The available methods range from informal guidelines for detecting and then correcting class deficiencies, to algorithmic approaches that adjust object definitions on the basis of structural criteria. As we show in the following sections, all these techniques face significant difficulties because of the lack of a clear supporting design model and because of the very general applicability of object-oriented mechanisms, notably inheritance.

5. Empirical Guidelines

5.1 Issues

The object-oriented approach aims at dramatically increasing programmer productivity by relying on the massive reuse of numerous pre-packaged software components. The experience gained with Smalltalk demonstrates that this goal can indeed be achieved, provided that the design of objects fulfils some important requirements:

• The application domain must be adequately represented in terms of objects.

• Standardized class interfaces must be adopted to allow polymorphism.

• Classes must encapsulate behaviour general enough to serve in many applications.

• The functionality must be properly decomposed into the various levels of a class hierarchy.

Object definitions exhibiting these desirable properties are easier to combine and refine for building new applications or other software components.

Since software components cannot be assumed to be perfectly reusable the first time they are built, developers of object-oriented libraries must be supplied with tools to assess the quality of an inheritance hierarchy, to spot design imperfections, and to improve class definitions. Some authors propose general, informal principles for carrying out reorganization tasks. These principles are based on the lessons drawn from the building of large collections of reusable classes, like the Smalltalk hierarchy, the Eiffel library, the ET++ class collection [49], etc.

5.2 General redesign rules

According to Johnson and Foote [25], a reorganization methodology should deal with three main characteristics of a class hierarchy:

• The standardization of class interfaces. Objects answering the same set of messages are easier to combine. The structure of the inheritance hierarchy is simpler and more understandable, since one may assume that definitions with identical interfaces satisfy similar properties. Common class interfaces allow polymorphism: objects with equivalent interfaces may be substituted for each other in a type-safe way. Table 4 illustrates the result of applying this principle to the basic data structures of the Eiffel library.

• The building of an abstract, coherent class hierarchy. Abstract classes exhibit a higher degree of reusability, because they define general properties and are not tied to a partic-


by fully automatic means. For every new version, one must program special functions for mapping between old and new class structures. These functions filter the messages sent to objects, so that proper actions can be taken, like translating between method names, returning a default value when accessing a non-existent variable, or simply aborting an unsuccessful operation. We describe in more detail the functionality of such systems in the part devoted to update propagation.
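A minimal Python sketch of such a mapping function is given below. It is not taken from any of the systems cited; the names `VersionFilter` and `PointV1`, and the renaming and default tables, are hypothetical and serve only to show the three actions mentioned above (translating a method name, returning a default value, aborting).

```python
class VersionFilter:
    """Presents an object built from an old class version through a newer interface."""

    def __init__(self, old_object, renamed_methods, defaults):
        self._old = old_object
        self._renamed = renamed_methods   # new method name -> old method name
        self._defaults = defaults         # variable absent from the old version -> default

    def __getattr__(self, name):
        # Python calls this only for names not found on the filter itself.
        old_name = self._renamed.get(name, name)
        if hasattr(self._old, old_name):
            return getattr(self._old, old_name)
        if name in self._defaults:
            return self._defaults[name]
        # No mapping exists: abort the operation.
        raise AttributeError(f"{name!r} is not available in the old version")


class PointV1:
    """Stand-in for an instance created from an old class version."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm1(self):
        return abs(self.x) + abs(self.y)
```

For instance, `VersionFilter(PointV1(3, -4), {"manhattan": "norm1"}, {"z": 0})` answers `manhattan()` by calling the old `norm1`, and reports 0 for the variable `z` that the old version never had.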

4.5 Evaluation

Versioning is a particularly appealing approach for managing class development and evolution. Recording the history of class modifications during the design process brings several benefits: it enables the programmer to try different paths when modelling complex application domains, and it helps avoid confusion when groups of people are engaged in the production of a library of common, interdependent classes. Versioning also appears useful when keeping track of various implementations of the same component for different software environments and hardware platforms. Besides, the hierarchical decomposition of the programming environment into workspaces, the attribution of precise responsibilities to their administrators, and the possibilities afforded by this kind of organization (for example, the separation of the long-term improvement of reusable components from the short-term development of new applications) are considered to be particularly valuable for increasing the quality and efficiency of object-oriented programming [47].

The main drawback of versioning techniques resides in the considerable overhead they impose on the development environment. Programmers have to navigate through two interconnected structures, the traditional inheritance hierarchy and the version derivation graph. They have to take into account a greater set of dependencies when designing a class. The system must store all information needed for representing versions and their reciprocal links, and implement notification. Moreover, version management methodologies still lack some support for design tasks: at what point does a version cease to be a variant of an existing class and become a completely different object definition?

In spite of their overhead, class and object versioning techniques have proved invaluable in important application domains like CAD/CAM, VLSI design and Office Information Systems. They have therefore been integrated in several object-oriented environments, including Orwell [47], AVANCE [6], ORION [4] and IRIS [18].

CLASS REORGANIZATION

In an object-oriented environment, programmers are supposed to build applications chiefly in a bottom-up fashion, by reusing existing classes. Classes often require adaptations so that they fully suit the needs of software developers. Such modifications indicate that the current hierarchy is not satisfactory: if software components cannot be reused as they are, then one is well-advised to look for missing abstractions, to try making some classes more general, to increase modularity.


system defines another class, Checkpointable, that provides reduced, lower-level versioning functionality.

4.4 Versioning and class evolution

The derivation of class versions is partly automatic and partly the result of user decisions. It is evidently impossible to delegate full responsibility to the system for determining when a transient version should be frozen and a new transient one created, or if a component is sufficiently polished to be released. Such actions must be based on design knowledge which is best mastered by the software developers themselves. The automatic generation of new versions triggered, for example, by update operations on object definitions is a scheme that has found limited application in practice.

Another difficulty arises because of the superimposition of versioning on the inheritance graph. For example, when creating a new variant for a class, should one derive new versions for the entire tree of subclasses attached to it as well? A careful analysis of the differences between two successive versions of the same class gives some directions for handling this problem [7].

• If the interface of a class is changed, then new versions should be created for all classes depending on it, whether by inheritance (i.e. its subclasses) or by their instance variables (i.e. classes using variables whose type refers to the now modified definition).

• If only non-public parts are changed, like the methods visible only to subclasses (such methods are called “protected methods” in C++), the type of its variables, or its inheritance structure, then versioning can be limited to its existing subclasses.

• If only the realizations of methods are changed, no new versions for other classes are required; this kind of change is purely internal and does not affect other definitions.
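The three rules above can be summarized in a short Python sketch. The function and enumeration names are hypothetical, the dependency tables are plain dictionaries assumed for the example, and subclasses are taken transitively, which is an interpretation of "the entire tree of subclasses" rather than something the cited work states in this form.

```python
from enum import Enum


class Change(Enum):
    INTERFACE = 1        # the public interface changed
    NON_PUBLIC = 2       # protected methods, variable types, inheritance structure
    IMPLEMENTATION = 3   # only method realizations changed


def classes_to_reversion(changed, change, subclasses, users):
    """Return the classes that need a new version after `changed` is modified.

    subclasses: class name -> list of its direct subclasses
    users:      class name -> list of classes with variables typed by it
    """
    if change is Change.IMPLEMENTATION:
        return set()                       # purely internal change
    # Collect the whole tree of subclasses below the changed class.
    result, stack = set(), list(subclasses.get(changed, []))
    while stack:
        c = stack.pop()
        if c not in result:
            result.add(c)
            stack.extend(subclasses.get(c, []))
    if change is Change.INTERFACE:
        # Interface changes also reach classes that use it through variables.
        result.update(users.get(changed, []))
    return result
```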

For reasons analogous to those set out above, some approaches prefer to avoid introducing a possibly large number of new versions automatically and rely instead on a manual procedure for reestablishing the consistency of the inheritance hierarchy. The users whose programs reference the class that has been updated are simply notified of the change and warned that the references may be invalid. Two strategies are commonly adopted to do this: either a message is directly sent to the user, or the classes referencing the modified object definition are tagged as invalid. In the latter case, class version timestamps are frequently used to determine the validity of references [11]. Thus, a class should never have a “last modification” date that exceeds the “approved modification” date of the versions it is referred to by. When this situation occurs, the references to the class are considered inconsistent, since recent adaptations have been carried out on the component, but have not yet been acknowledged on its dependent classes. It is up to the programmer to determine the effects of the class changes on other definitions and to reset the approved revision timestamp to indicate that the references have become valid again.
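The timestamp test can be sketched as follows in Python. The record layout and the function names are assumptions made for this example (integers stand for dates); the comparison itself follows the rule stated above: a reference is inconsistent when the referenced class was modified after the referring class last approved it.

```python
from dataclasses import dataclass, field


@dataclass
class ClassVersion:
    name: str
    last_modified: int = 0                         # date of the latest change
    approved: dict = field(default_factory=dict)   # referenced class -> approved date


def invalid_references(referrer, registry):
    """Names of referenced classes changed after the referrer approved them."""
    return [name for name, approved_at in referrer.approved.items()
            if registry[name].last_modified > approved_at]


def approve(referrer, name, registry):
    """Reset the approved timestamp once the programmer has checked the change."""
    referrer.approved[name] = registry[name].last_modified
```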

Maintaining compatibility between entities belonging to different versions is a major issue, and an object-oriented system should provide support for dealing with this aspect of version management. Application developers may want to view objects instantiated from previous class versions as if they originated from the currently stable version, or they may want to forbid objects from old versions to refer to instances of future variants. These effects are seldom achieved


When the version number is absent from a reference, a default class is assumed. Typical choices for resolving the dynamic binding of version references include:

• The very first version of the class referred to.

• Its most recent version. The idea behind this decision is that this version can be considered the most up-to-date definition of a class. This is a solution commonly used in object-oriented databases to bind version references in interactive queries.

• Its most recent version at the time the component making the reference was created. This is the preferred option for dealing with dynamic references in class specifications.

• A default class definition specified by the administrator in charge of the context. This definition, called a generic version, can be coerced to be any element in a version derivation history.

The default version is first searched for in the context where the reference is initially discovered to be unresolved; the hierarchy of contexts is then inspected upward until the appropriate definition is found. Thus, to bind an incomplete reference to a class made in a project context (i.e. a reference consisting only of the class name, without additional information), the system first examines the class hierarchy in the current context; if this context does not contain the class definition referred to, the search proceeds in the public repository. No private context is inspected, for stable versions are not allowed to refer to transient versions, which can be in the process of being revised. Similarly, dynamic references to classes in the public context cannot be resolved by looking for unreleased components in a project context. Naturally, dynamic binding can be resolved at the level of a private context for all classes pertaining to it.
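The upward search can be sketched in Python as follows. The `Context` class and its `defaults` table are hypothetical; the sketch only assumes that each context knows its enclosing context and records one default version per class.

```python
class Context:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # enclosing context; None for the public repository
        self.defaults = {}     # class name -> default version number

    def resolve(self, class_name):
        """Bind a reference that lacks a version number, searching upward only."""
        context = self
        while context is not None:
            if class_name in context.defaults:
                return context.name, context.defaults[class_name]
            context = context.parent
        raise LookupError(f"no version of {class_name} is visible from {self.name}")
```

Because the search never descends, a reference made in a project context cannot be bound to a version held in a private context, and a reference made in the public context cannot be bound to an unreleased project component.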

If only the most recent version gives rise to new versions, there is in principle no need for a complex structure to keep track of the history of classes: their name and version number suffice to determine their relationship to each other. The situation where versioning is not sequential, i.e., where new versions derive from any previous version, requires that the system record a hierarchy of versions somewhat similar to the traditional class hierarchy. When a version is copied or installed in a context, the programmer decides where to connect it in the derivation hierarchy. AVANCE provides an operation to merge several versions of the same class; with this scheme, the derivation history takes the form of a directed graph without circuits [7].

The information on derivation dependencies is generally associated with the generic version of a class version set. Version management systems like IRIS or AVANCE implement a series of primitives for traversing and manipulating the derivation graph [5][7]. Programmers can thus retrieve the predecessors and the successors of a particular version, obtain the first or the most recent version of a class on a particular derivation path, query their status (transient, released, date of creation, owner), determine which version was valid at a certain point in the past and bind a reference to it, freeze or derive new versions, etc.
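A derivation graph with merging can be sketched in Python as below. This does not reproduce the actual primitives of IRIS or AVANCE; `DerivationGraph` and its methods are hypothetical names. Since a version may only derive from versions already recorded, the graph stays free of circuits by construction.

```python
class DerivationGraph:
    def __init__(self):
        self.parents = {}   # version id -> versions it directly derives from

    def add(self, version, derived_from=()):
        """Record a new version; several parents model a merge."""
        if version in self.parents:
            raise ValueError(f"version {version!r} already exists")
        for p in derived_from:
            if p not in self.parents:
                raise KeyError(f"unknown version {p!r}")
        self.parents[version] = list(derived_from)

    def predecessors(self, version):
        """All versions from which `version` derives, directly or indirectly."""
        seen, stack = set(), list(self.parents[version])
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(self.parents[v])
        return seen

    def successors(self, version):
        """All versions that derive, directly or indirectly, from `version`."""
        return {v for v in self.parents if version in self.predecessors(v)}
```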

The management of versions and related data obviously entails significant storage and processing overhead. This is why in most systems one is required to explicitly indicate that classes are versionable by making them descendants of a special class from which they inherit their properties of versions. This class is often called Version, as in AVANCE and IRIS; the former


sient versions. Each programmer individually updates these classes and perhaps creates other definitions (via usual subclassing techniques) in his context as additional transient versions. In order to try different designs for the same class, or to save the result of his programming activity, he may derive new transient versions from those he is currently working on, while simultaneously promoting the latter to working versions. When he achieves a satisfactory design for a software component, he installs it as a working version in the project context. Of course, these working versions can subsequently be copied by his colleagues and give rise to new transient versions in their respective environments. Once software components have reached a good stage of maturity in terms of reliability and design stability, they are released by the project administrator and made publicly available in the central repository.

Since all operations of version derivation and freezing are done concurrently, careful algorithms are required to ensure that the system remains consistent. Fortunately, all updates are applied to local, transient objects, and not directly to global, shared definitions. As a consequence, concurrency control does not have to be as elaborate as traditional database transaction mechanisms. Some systems, like IRIS, implement simple schemes enabling programmers to prevent other users from deriving new versions from an existing object by setting a lock on it [18].

4.3 Version identification

An essential problem to deal with concerns the identity of classes. It is no longer enough to refer to a software component by its name, since it might correspond to multiple variants of the same class. An additional version number, and possibly a context name, must be provided to identify unambiguously the component referred to.

                               Transient        Working         Released

Location
  public context               -                -               ✓
  project contexts             -                ✓               -
  private contexts             ✓                ✓               -

Admissible operations
  update                       ✓                -               -
  delete                       ✓                ✓               -

Origin
  from a transient version     by derivation    by promotion    -
  from a working version       by derivation    -               by promotion
  from a released version      by derivation    -               -

Table 3 Principal characteristics of version types. Some systems consider only two kinds of versions (transient and released) and two levels of contexts (private and public) for managing their visibility.


• What are the circumstances that justify the creation of new versions, and how is this operation carried out?

• What can be done to handle the relations between different and perhaps incompatible versions?

It should be noted that versioning can be applied at the level of instances as well. In fact, many approaches, derived from techniques used notably in the field of CAD/CAM, originally considered versioning for objects and not for class definitions. Since we are mainly interested in class design and evolution, we will not deal with this possibility. Furthermore, managing instance versions and class versions are quite similar tasks, so most of the techniques we describe in this article find their use in both domains.

4.2 The organization of version management

An environment for version management is divided into several distinct working spaces, each one providing a specific set of privileges and capabilities for manipulating different kinds of versions [11]. Three such contexts are generally recognized in the literature:

• A private working space supports the design and development activities of one programmer. The programmer acts as the administrator of his private space. The information stored in his private environment (in particular the software components he is currently designing or modifying) is not accessible to other users.

• All classes and data produced during a project are stored in a corresponding context, which is placed under the responsibility of a project administrator. They are made available to all people cooperating in the project, but remain hidden from other users, since they cannot yet be considered as tested and validated.

• A public context contains all released classes from all projects, as well as data on their status. This information is visible to all users of the system.

It is natural to associate one kind of version with each working space:

• Released versions appear in the public context. They are considered immutable and can therefore neither be updated nor deleted, although they may be copied and give rise to new transient versions.

• Working versions exist in project and possibly private contexts. They are considered stable and cannot be modified, but they can be deleted by their owner, that is, the project administrator or the user of a private context. Working versions are promoted to released versions when they are installed in the public repository; they may give rise to new transient versions.

• A transient version is derived from a released, a working or another transient version. It belongs to the user who created it, and it is stored in his private environment. Transient versions can be updated, deleted, and promoted to working versions.
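The life cycle described by these three kinds of versions can be sketched as a small state machine in Python. The `Version` class is hypothetical and ignores contexts and ownership; it only encodes which operations are admissible in each state.

```python
class Version:
    def __init__(self, definition, status="transient", origin=None):
        self.definition = definition
        self.status = status      # "transient", "working" or "released"
        self.origin = origin      # version this one was derived from, if any

    def update(self, definition):
        # Only transient versions may be modified.
        if self.status != "transient":
            raise PermissionError(f"a {self.status} version cannot be updated")
        self.definition = definition

    def can_delete(self):
        # Released versions are immutable and cannot be deleted.
        return self.status in ("transient", "working")

    def promote(self):
        # transient -> working -> released; released versions stay as they are.
        next_status = {"transient": "working", "working": "released"}
        if self.status not in next_status:
            raise PermissionError("a released version cannot be promoted")
        self.status = next_status[self.status]

    def derive(self):
        # Derivation from any kind of version yields a new transient version.
        return Version(self.definition, "transient", origin=self)
```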

A typical scenario begins when a project is set up to build a new application. The programmers engaged in the development copy from the public repository class definitions they want to reuse or modify for the project: these definitions are added to their private environments as tran-


4. Class Versioning

4.1 Issues

Ensuring that class modifications are consistent is not enough; they must also be carried out in a disciplined fashion. This is of utmost importance in environments where a number of programmers collectively reuse and adapt classes developed by their peers and made available in a shared repository of software components. The early experiences with the Smalltalk system demonstrated that the lack of a proper methodology for controlling the extensions and alterations brought to the standard class library quickly resulted in a disastrous situation: although every user eventually had, at his workstation, an environment perfectly tuned to his needs, the variations between individual class hierarchies were sufficient to hinder the further exchange of software, or at least to make its porting non-trivial [28].

In the case of single-user environments, the exploratory way of programming advocated by the proponents of the object-oriented approach requires some support so that a software developer may correct his mistakes by reverting to a previous stable class configuration. When experimenting with several variants of the same class (to test the efficiency of different algorithms, for example), care has to be taken to avoid mixing up class definitions and dependencies.

The acute need for structuring class evolution quickly prompted Smalltalk programmers to develop ingenious schemes for recording, grouping and disseminating information on system changes [44]. The notion of project was introduced into the Smalltalk system to separate the workspaces relative to several development tasks. An independent log file associated with each such workspace enables programmers to record and manage the history of revisions brought to the classes of a project. However, all information is still shared among projects. Thus, every class added to the hierarchy in a project is immediately available to all other projects; updates to a class or a global variable are visible in the other workspaces. More elaborate mechanisms rely on a central database to store data about class changes, bug reports, bug corrections and other “goodies” [44]. A browser facilitates the retrieval and insertion of items in the database, which is accessible by all users on a local network connecting Smalltalk workstations together. Because the database acts like a common information pool, complicated procedures are required to incorporate all bug corrections into the hierarchy, to detect conflicts among the proposed updates, to clean up class definitions, to get rid of obsolete objects, and finally to generate a new running version of the Smalltalk system valid for all users. These techniques do not seem to scale well for large, distributed programming environments. Current approaches favour a more structured organization of software development and a tighter control of evolution based on class versioning.

Versioning basically consists in checkpointing successive, and in principle consistent, states of a class structure. The creation and manipulation of versions raise complex issues that we discuss in the following sections.

• How is version management organized with respect to software development and sharing?

• How does one distinguish between different versions of the same class?


sequence of operations is applied to a variable of an object, the information stored in the attribute is lost.
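The difference between the two levels can be illustrated with a short Python sketch in which an object is modelled as a dictionary of attribute values. The three functions are hypothetical stand-ins for update primitives and are not the operations of O2.

```python
def rename_attribute(instance, old, new):
    # A genuine rename carries the stored value over to the new name.
    instance[new] = instance.pop(old)


def delete_attribute(instance, name):
    instance.pop(name)


def add_attribute(instance, name, default=None):
    # A newly introduced attribute starts from its default value.
    instance[name] = default


renamed = {"radius": 2.0}
rename_attribute(renamed, "radius", "r")

recreated = {"radius": 2.0}
delete_attribute(recreated, "radius")
add_attribute(recreated, "r")
```

Both objects end up with an attribute named `r`, so the two sequences agree at the schema level, but only `renamed` still holds the value 2.0.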

Ensuring correctness of class changes is much more difficult than it appears at first sight: additional adjustments have to be performed in order to guarantee the consistency of all class representations. Thus, because a method implementation may depend on other methods and variables, one cannot consider the deletion of one attribute in isolation. This operation may have far-reaching consequences if an attribute is excluded from a class interface. Similarly, introducing a new method in a class may raise problems because the code of the method may refer to attributes that are not yet present in the class definition and because of implicit scoping changes. If the method supersedes an inherited routine, subclasses referring to the previous method may become invalid. The transformation of method definitions to take into account the consequences of class updates is not easily amenable to automation, and is not implemented in any of the systems mentioned here. The O2 approach recognizes that class definitions have to be augmented with information on inter-attribute dependencies, scoping constraints (when referring to attributes that can be inherited from multiple ancestors), and interface usage (which methods from an object’s interface are used by a method in another schema) in order to be able to guarantee the behavioural consistency of class modifications. This requires a thorough, and perhaps impractical, analysis of all methods’ code.

3.5 Evaluation

Decomposing all class modifications into update primitives and determining their consequences brings several advantages. During class design, this approach helps developers detect the implications of their actions on the class collection and maintain the consistency of class specifications. During application development, it guides the propagation of schema changes to individual instances. For example, renaming an instance variable, changing its type or specifying a new default value usually has no impact on an application using the modified class. Introducing or discarding attributes (variables or methods), on the other hand, generally leads to changes in programs and requires the reorganization of the persistent store, although the conversion procedure can be deferred in some situations.

Depending on its modelling capabilities and on the integrity constraints, an object-oriented programming environment may provide different forms of class surgery. It is easy to envision a system where class definitions are first retrieved with a class browser, and then modified with a language-sensitive editor where each editing operation corresponds to a schema manipulation primitive like those of ORION or GemStone. Such an environment would nevertheless fall short of providing a fully adequate support for the design and evolution processes. Class surgery forms a solid and rigorous framework for defining “well-formed” class modifications; in this respect, it improves considerably over uncontrolled manipulations of class hierarchies that are more or less the rule with current object-oriented programming environments. But it limits its scope to local, primitive kinds of class evolution: it gives no guidance as to when the modifications should be performed, and does not deal with the global management of multiple, successive class changes carried out during software development.


• As with attributes, each object model may define supplementary class properties and their corresponding manipulation primitives. Indexable classes in GemStone store information that is accessed by an integer offset, rather than by name as normal variables are, and for which space is allocated dynamically. Classes may have their indexable part removed, except when this part is inherited; the modification is not propagated to subclasses. Classes may be made indexable, provided this operation does not violate a typing constraint in an already indexable subclass.

• Adding a superclass to a schema is illegal if the inheritance graph invariant cannot be preserved. In particular, no circuits must be introduced in the hierarchy. The consequences of this operation are analogous to those of introducing attributes in an object description.

• The deletion of a class S from the list of ancestors of a class C must not leave the inheritance graph disconnected. O2 provides a parameterized modification primitive that enables the programmer to choose where to link a class which has become completely disconnected from the inheritance graph (by default, it is connected to OBJECT). One may also specify whether the attributes acquired through the suppressed inheritance link are preserved and copied to the definition of C.

In most other systems, if S is the unique superclass of C, inheritance links are reassigned to point from the immediate superclasses of S to C. In the other cases, C just loses one of its ancestors; no redirection of inheritance dependencies is performed. Of course, the properties of S no longer pertain to the representation of C, nor to those of its subclasses. The primitives for suppressing attributes from a class are applied to convert the definition of all classes and instances affected by this change.

• Reordering inheritance dependencies results in effects similar to those of changing the precedence of inherited attributes.
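The two structural checks on inheritance links discussed above can be sketched in Python. The hierarchy is modelled as a dictionary from each class to the list of its direct superclasses; the function names are hypothetical, and the relinking rule is the one attributed above to "most other systems", not the parameterized O2 primitive.

```python
def ancestors(cls, superclasses):
    """All direct and indirect superclasses of cls."""
    seen, stack = set(), list(superclasses.get(cls, []))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(superclasses.get(c, []))
    return seen


def add_superclass(cls, new_super, superclasses):
    """Add the link from cls to new_super unless it would create a circuit."""
    if new_super == cls or cls in ancestors(new_super, superclasses):
        raise ValueError(f"{new_super} already inherits from {cls}")
    superclasses.setdefault(cls, []).append(new_super)


def remove_superclass(cls, old_super, superclasses):
    """Remove S from the superclasses of C, relinking C if S was the only one."""
    supers = superclasses[cls]
    supers.remove(old_super)
    if not supers:
        # S was the unique superclass: C now inherits from S's immediate superclasses.
        supers.extend(superclasses.get(old_super, []))
```

The sketch assumes that the removed superclass is not the root of the hierarchy; otherwise the class would be left without any ancestor.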

3.4 Completeness and correctness

Two important issues have to be addressed to ensure that these primitives capture interesting capabilities. The first one concerns completeness: does the set of proposed operations actually cover all possibilities for schema modifications? The second question deals with correctness: do these operations really generate class structures that satisfy all integrity constraints?

These problems have been studied in the context of the ORION methodology, where it has been demonstrated that a subset of its class transformation primitives exhibits the desired qualities of completeness and, partially, of correctness. In contrast, the GemStone approach does not strive for completeness: only meaningful operations, which can be implemented without undue restrictions or loss of performance, are provided. An interesting result is provided by the O2 approach, where it is shown that although a set of basic update operations may be complete at the schema level (i.e. all changes to a class hierarchy can be derived from a composition of these essential operations), this same set may not be complete at the instance level, when changes are carried out on objects and not on classes. For example, renaming an attribute in a class definition is equivalent to deleting the attribute and then reintroducing it with its new name; if the same

14 Managing Class Evolution in Object Oriented Systems

GemStone, the domain of a variable can be generalized, but not beyond its original type if the attribute is inherited.

GemStone also allows a variable to be specialized, except if the new domain causes a compatibility violation with a redefinition in a subclass. Operations that are neither specializations nor generalizations are not directly supported; moreover, type changes are not propagated to subclasses. Instances violating new typing constraints have their variables reinitialized to nil.

• Other attribute properties, like the default value of a variable, the realization of a method or, in Smalltalk, the grouping of messages into categories, can be modified. These operations usually do not require any elaborate consistency checks. Changing the origin of an attribute is an operation supported only in ORION. It serves to override default inheritance precedence rules and is logically handled as a suppression followed by the insertion of an attribute. Subclasses are adapted to conform to the new inheritance pattern.

• Shared values correspond to the class variables of Smalltalk and to the shared slots of CLOS. In ORION, instance variables can be converted to shared variables and their global value can be modified; the operation is propagated to the subclasses affected by these changes. A variable can also revert to being a normal instance variable, in which case its value is reinitialized to nil.

Some environments extend the standard object model with other kinds of attributes. ORION defines composite links as special variables used to implement aggregation hierarchies. Suppressing a composite link breaks an aggregation relation by disconnecting an entity from its dependent object.

• Adding a class to an existing hierarchy is a fundamental operation for object-oriented programming, and as such it appears in all systems examined here. Connecting a new class to the leaves of a hierarchy is trivial: possible conflicts caused by multiple inheritance are solved with standard precedence rules. GemStone allows inserting a class in the middle of an inheritance graph, provided the new class does not initially define any property: this basic template may be subsequently augmented by applying the attribute manipulation primitives described in the preceding pages. With O2, a new class may be connected to only one superclass and one subclass initially. The definition must specify which inherited attributes are superseded (the redeclaration must comply with subtyping compatibility rules) as well as the newly introduced attributes.

• Removing a class causes inheritance links to be reassigned from its ancestors to its subclasses. All instance variables that have the class as their type are assigned the suppressed class’s superclass as their new domain. GemStone assumes that a class to be discarded no longer defines any property and that no associated instances exist in the database. O2 forbids class deletion when this would result in dangling references in other definitions, when instances belonging to the class still exist, or when the deletion leaves the inheritance graph disconnected.

• Renaming a class is allowed only if the new identifier is unique among all class names in the inheritance hierarchy.
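The class removal primitive described above can be sketched in Python as follows. The representation is hypothetical: `parents` maps a class name to its direct superclasses, and `var_types` maps a (class, variable) pair to the name of the variable's type. The sketch assumes, as GemStone does, that the removed class no longer defines any property, and it arbitrarily picks the first superclass as the new domain when the removed class has several.

```python
def remove_class(parents, var_types, victim):
    """Remove `victim` from the hierarchy (both mappings are modified in place).

    Subclasses of `victim` are relinked to its direct superclasses, and
    every variable typed by `victim` is retyped with its first superclass
    (an assumption of this sketch for the multiple inheritance case).
    """
    supers = parents[victim]
    if not supers:
        raise ValueError("cannot remove the root of the hierarchy")
    del parents[victim]
    for direct in parents.values():
        if victim in direct:
            i = direct.index(victim)
            # Splice in the victim's superclasses at its former position,
            # skipping those the subclass already inherits from directly.
            direct[i:i + 1] = [s for s in supers if s not in direct]
    for key, type_name in list(var_types.items()):
        if type_name == victim:
            var_types[key] = supers[0]


parents = {
    "OBJECT": [],
    "PERSON": ["OBJECT"],
    "STUDENT": ["PERSON"],
    "FOREIGN-STUDENT": ["STUDENT"],
    "COURSE": ["OBJECT"],
}
var_types = {("COURSE", "assistant"): "STUDENT"}
remove_class(parents, var_types, "STUDENT")
```

After the call, FOREIGN-STUDENT inherits directly from PERSON and the variable assistant of COURSE has PERSON as its domain. The checks that O2 performs before deletion (existing instances, dangling references) are not modelled here.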


tributes if the operation results in naming conflicts or in type mismatches with other attributes.

• Attribute renaming is forbidden if the operation gives rise to ambiguities in the class itself or in its subclasses, or, in GemStone, if the attribute is inherited from an ancestor. If this is not the case, the change is propagated to all schemas affected by the renaming.

• The type of a variable (or of a method argument) can rarely be arbitrarily modified because of the subtyping relations imposed by the compatibility invariant. In ORION and

Scope of changes GemStone ORION O2

Instance variables
add a variable ✓ ✓ ✓
remove a variable ✓ ✓ ✓
rename a variable ✓ ✓ ✓
redefine the type of a variable ✓ ✓ ✓
change the origin ✓
change the default value ✓

Methods
add a method ✓ ✓
remove a method ✓ ✓
rename a method ✓ ✓
redefine the signature ✓
change the code ✓ ✓
change the origin ✓

Other attributes
add a shared value ✓
change a shared value ✓
remove a shared value ✓
remove a composite link ✓

Classes
add a class ✓ ✓ ✓
remove a class ✓ ✓ ✓
rename a class ✓ ✓

Other class properties
make a class indexable ✓
make a class non-indexable ✓

Inheritance links
add a superclass to a class ✓ ✓
remove a superclass ✓ ✓
change superclass precedence ✓

Table 2 A comparison of schema evolution taxonomies.


that the new type must be a subclass of the original one. O2 also requires a strict compatibility between the method signatures of a class and those of its descendants.

• Type variable invariant. The type of each instance variable must correspond to a class in the hierarchy.

• Reference consistency invariants. GemStone guarantees that there are no dangling references to objects in the database. When a user holds a reference to an object, then both the object and the reference to it are retained. Instances can only be deleted when they are no longer referred to. OTGen requires that two references to the same object before modification also point to the same entity after modification.

The designers of ORION list additional rules to select the most meaningful way to maintain these invariants in the presence of ambiguous class descriptions or when a class schema is modified. These rules serve in particular to solve name clashes caused by multiple inheritance, to define how properties are propagated from a class to its subclasses, and to manipulate the inheritance graph.
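As an illustration of how such an invariant might be verified mechanically, the following Python sketch checks the inheritance graph invariant (a connected, directed graph without circuits, rooted at OBJECT) and the stricter tree form required when multiple inheritance is not available. The function names and the dictionary representation of the hierarchy are assumptions made for this example; none of the systems discussed is claimed to work this way.

```python
def satisfies_graph_invariant(parents, root="OBJECT"):
    """Check that `parents` (class -> list of direct superclasses) has no
    circuit and that every class reaches `root` through its superclasses."""
    if parents.get(root):
        return False                      # the root must not inherit from anything
    IN_PROGRESS, DONE = 1, 2
    state = {}

    def reaches_root(cls):
        if cls == root:
            return True
        if state.get(cls) == IN_PROGRESS:
            return False                  # circuit: cls inherits from itself
        if state.get(cls) == DONE:
            return True
        supers = parents.get(cls)
        if not supers:
            return False                  # unknown or disconnected class
        state[cls] = IN_PROGRESS
        if not all(reaches_root(s) for s in supers):
            return False
        state[cls] = DONE
        return True

    return all(reaches_root(cls) for cls in parents)


def satisfies_tree_invariant(parents, root="OBJECT"):
    """Single inheritance variant: each class has at most one superclass."""
    return (satisfies_graph_invariant(parents, root)
            and all(len(supers) <= 1 for supers in parents.values()))


diamond = {"OBJECT": [], "A": ["OBJECT"], "B": ["OBJECT"], "C": ["A", "B"]}
circuit = {"OBJECT": [], "A": ["B"], "B": ["A"]}
orphan = {"OBJECT": [], "A": []}
```

Here `diamond` satisfies the graph invariant but not the tree invariant, while `circuit` and `orphan` violate both.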

3.3 Primitives for class evolution

Updates to a schema are inferred from an analysis of the static structure of classes [40]. These changes are subsequently assigned to a relevant category in a pre-determined taxonomy that covers all possibilities for schema evolution. Every definition affected by these modifications must then be adjusted. If the invariant properties of the inheritance hierarchy cannot be preserved, the transformation of the class structure is rejected.

• Adding an attribute, whether it is a variable or a method, is an operation that must be propagated to all descendants of the class where it is initially applied, in order to preserve the full inheritance invariant. When a naming or a type compatibility conflict occurs, or when the signature of the new method does not match the signature of other methods with the same name related to it via inheritance, one either disallows the operation (as in O2 and GemStone), or resorts to conflict resolution rules. Thus, in ORION, if an attribute with the same name exists in the original class schema, it is replaced with the new one. On the other hand, an attribute from a subclass bearing the same name as the new, inherited one takes precedence over it. In all systems, instances of all modified schemas are assigned an initial value for their additional variables, which is either specified by the user or the special nil value.

• Deleting an attribute is allowed only if the variable or method is not inherited from an ancestor. Because of the full inheritance and representation invariants, the attribute must also be dropped from all subclasses of the definition where it is originally suppressed¹. If a subclass, or the class itself, inherits a variable or a method with an identical name through another inheritance path, this new attribute replaces the deleted one. Of course, all instances lose their values for deleted attributes. O2 forbids the suppression of at-

1. It is interesting to note that this propagation is not performed automatically in GemStone.


certain structure on class definitions and on the inheritance graph; depending on the object model, they may allow more or less restricted types of class hierarchies.

• Representation invariant. This constraint is explicit only in the GemStone system. It states that the properties of an object (attributes, storage format, etc.) must reflect those defined by its class.

• Inheritance graph invariant. The structure deriving from inheritance dependencies is restricted to form a connected, directed graph without circuits (so that classes may not recursively inherit from themselves), and rooted at a special predefined class called OBJECT. The GemStone system does not implement multiple inheritance and so requires the graph to be a tree.

• Distinct name invariant. All classes, methods and variables must be distinguished by a unique name. In addition, ORION requires inheritance links between classes to be unambiguously labelled; two links may therefore bear identical names, except if they are directed to the same class.

• Full inheritance invariant. A class inherits all attributes from its ancestors, except those that it explicitly redefines. Naming conflicts occurring because of multiple inheritance are resolved by applying some precedence scheme and selecting one attribute as the one being inherited.

• Distinct origin invariant. No repeated inheritance is admissible in ORION and O2: an attribute inherited several times via different paths appears only once in a class representation.

• Type compatibility invariant. The type of a variable redefined in a subclass must be consistent with its domain as specified in the superclass. In all systems this constraint means

GemStone ORION O2 OTGen

Representation ✓
Inheritance graph ✓ ✓ ✓ ✓
Distinct name ✓ ✓ ✓
Full inheritance ✓ ✓ ✓ ✓
Distinct origin ✓ ✓
Type compatibility ✓ ✓ ✓ ✓
Type variable ✓
Reference consistency ✓ ✓

Table 1 Comparison of four object-oriented database systems with respect to their class invariants. Some constraints (like the representation and type variable invariants) are in fact implicit in most models. The invariants associated with O2 are implied by more general integrity constraints.


class changes (on other definitions and on existing instances as well) so that possible integrity violations are avoided. These methods can be broken down into a number of steps:

1. The first step consists of determining a set of integrity constraints that a class collection must satisfy. For example, all instance variables should bear distinct names, no loops are allowed in the hierarchy, and so on.

2. A taxonomy of all possible updates to the system is then established. These changes concern the structure of classes, like “add a method”, or “rename a variable”; they may also refer to the hierarchy as a whole, as with “delete a class” or “add a superclass to a class”.

3. For each of these update categories, a precise characterization of its effects on the class hierarchy is given, and the conditions for its application are analysed. In general, additional reconfiguration procedures have to be applied in order to preserve class invariants. It is, for example, illegal to delete an attribute from a class C if this attribute is really inherited from an ancestor of C. If the attribute can be deleted, it must also be recursively dropped from all subclasses of C, or possibly replaced by another attribute with the same identifier inherited through another subclassing path.

4. Finally, the effects of schema changes are reflected on the persistent store. If necessary, instances belonging to modified classes are converted to conform to their new description. For example, additional storage space must be allocated for objects whose definition has been augmented with additional attributes.
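The attribute deletion example of step 3 can be sketched in Python as follows. The classes `ClassDef` and `SchemaError` and the function `delete_attribute` are hypothetical names introduced for this illustration. The sketch makes one simplifying assumption: the attributes visible in a class are recomputed from its ancestors on demand, so that dropping an attribute from the subclasses, or replacing it by a homonym inherited through another path, follows from the recomputation instead of being performed by an explicit traversal.

```python
class SchemaError(Exception):
    """Raised when an update would violate a class invariant."""


class ClassDef:
    def __init__(self, name, parents=(), attrs=()):
        self.name = name
        self.parents = list(parents)
        self.local = set(attrs)       # attributes defined in this class itself

    def inherited(self):
        """Names of all attributes acquired from ancestors."""
        names = set()
        for parent in self.parents:
            names |= parent.local | parent.inherited()
        return names

    def attributes(self):
        """Effective attribute set: local definitions plus inherited ones."""
        return self.local | self.inherited()


def delete_attribute(cls, name):
    """Delete `name` from `cls`, rejecting the update when it is illegal."""
    if name not in cls.local:
        if name in cls.inherited():
            raise SchemaError(f"{name} is inherited by {cls.name}")
        raise SchemaError(f"{cls.name} has no attribute {name}")
    cls.local.discard(name)


obj = ClassDef("OBJECT")
a = ClassDef("A", [obj], {"x"})
b = ClassDef("B", [obj], {"x", "y"})
c = ClassDef("C", [a, b])

delete_attribute(a, "x")
still_visible = "x" in c.attributes()   # x is still inherited by C through B
delete_attribute(b, "x")
now_gone = "x" not in c.attributes()    # no inheritance path provides x any more
```

Attempting `delete_attribute(c, "y")` raises `SchemaError`, since y is inherited by C from B. Step 4, the conversion of stored instances, is outside the scope of this sketch.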

The next sections examine in detail these various aspects of schema evolution management. We base our discussion mainly on the research performed around the following object-oriented database systems:

• GemStone, a commercial system developed at Servio Logic, that extends the basic Smalltalk object model with persistency and database functionality. The OPAL language is used for defining classes and manipulating objects.

• ORION, a research prototype developed at MCC. ORION is intended to serve as a general-purpose persistent object-oriented environment integrated with Common LISP.

• O2, a prototype object-oriented database system developed by the GIP Altaïr for supporting application development in a wide range of domains and programming languages (Basic, C, LISP).

• OTGen, a research system developed at Carnegie Mellon University for investigating issues dealing with object-oriented database management.

Since instance conversion techniques are an important issue common to other approaches, most notably versioning, we defer their description to the part devoted to change propagation on page 182.

3.2 Class invariants

Every class collection contains a number of integrity constraints that must be maintained across schema changes. These constraints, generally called class invariants in the literature, impose a


when trying to retrieve all students with a Masters degree; the system should avoid processing instances from FOREIGN-STUDENT, because their Degree attribute does not conform to the information normally associated with students and could even have a different physical representation.

2.4 Evaluation

Tailoring techniques are useful for performing small adjustments to a class collection. The overriding of inherited attributes enables the programmer to escape from a rigid inheritance structure that is not always well-suited to application modelling. It facilitates the handling of exceptions locally, and does not require the factoring of common properties into numerous intermediate classes. Tailoring mechanisms correspond to constructs of object-oriented languages; consequently, they can be considered as a standard programming technique and they can be implemented efficiently within compilers.

On the other hand, over-reliance on tailoring and excuses may quickly lead to an incomprehensible specialization structure, overloaded with special cases, and, as far as persistent object-oriented systems are concerned, difficult to manage efficiently with current database technology. Introducing exceptions in a hierarchy destroys its specialization structure and obscures the dependencies between classes, since a property cannot be assumed to hold in every object derived from a particular definition. Renaming and interface redeclaration may completely break down the standard type relations between classes and subclasses. When signature compatibility is not respected, polymorphism becomes impossible: an instance of a class may no longer be used where an instance of a superclass is allowed. Nothing prevents programmers from radically altering the semantics of a method by changing its implementation, say by replacing the code for computing the area of a polygon with one that returns its perimeter. Changing attribute representations also cancels many of the benefits of code sharing provided by inheritance.

If tailoring is allowed, one must be wary of developing a collection of disorganized classes. Exceptions should not only be accommodated, but also integrated into the type hierarchy when they become too numerous to be considered as special cases. Unfortunately, the techniques we have described in this section do not really help in detecting design flaws in object descriptions.

3. Class Surgery

3.1 Issues

Whenever changes are brought to the modelling of an application domain, corresponding modifications must be applied to the classes representing real-world concepts. This kind of operation disturbs a class hierarchy much more profoundly than the tailoring techniques we have described in the previous pages: instead of overriding some inherited properties when new subclasses are defined, it is the structure of existing classes themselves which is to be revised. Because of the multiple connections between class descriptions, care has to be taken so that the consistency of the hierarchy is guaranteed.

This problem also arises in the area of object-oriented databases, where it has been extensively investigated [4][29][43][50]. There, the available methods determine the consequences of


Possible solutions to overcome these problems include the disciplined use of strict inheritance (which leads either to the creation of intermediate classes for purely technical reasons, or to a decrease in the degree of sharing among classes, if they have to specify individually their version of the conflicting attributes), the dissociation of classes and types (which defeats polymorphism), or resorting to default reasoning techniques (whose properties are not always well-determined). The excuse mechanism is suggested as an alternative, flexible way for managing non-strict inheritance.

With the excusing approach, contradictions between the definition of a class and its ancestor must be explicitly acknowledged. Thus, while

    class STUDENT
    variables
        Name : STRING;
        Address : STRING;
        Degree : (Bachelor, Masters, Doctor-of-Philosophy);
    end ;

denotes the standard class of students,

    class FOREIGN-STUDENT inherits STUDENT;
    variables
        Degree : (Licence, Diplôme, Doctorat)
            excuses Degree on STUDENT;
    end ;

represents a subclass where the Degree attribute of STUDENT is redeclared. Since the new type of Degree is not a refinement of the previous one, an explicit excuse has to be provided to inform the system that this situation represents an exception and not a programming error. This excuse is inherited by all subclasses of FOREIGN-STUDENT; in the case where the Degree attribute is once again superseded, it may be necessary to excuse its new type with respect to that of FOREIGN-STUDENT (and to that of STUDENT as well) if they are incompatible.

In his paper, the author sketches a system for integrating types with excuses. The development of such a scheme would facilitate the type checking of expressions, the detection of unsafe operations on objects (that refer to values or attributes that have been redeclared in incompatible ways), and the correct handling of database queries (without overlooking exceptional entities). Formally, consider

    class X
    variables
        a : T;
    end ;

    class Y inherits X;
    variables
        a : S excuses a on X;
    end ;

where Y is a subclass of X, and T and S denote the types of attribute a in X and in Y respectively. Then, for every instance x ∈ X, the following condition holds: x.a ∈ T ∨ (x ∈ Y ∧ x.a ∈ S). This basic property of excuses is used for reasoning about expressions involving entities from exceptional classes, as happens


    feature
        parent : like Current ;
        Create (v : T) is
                -- Create tree with single node of node value v
            do
                fixed_Create (2, v)
            ensure
                node_value = v;
                right.Void and left.Void
            end ; -- Create
        --
        -- Methods has_left, has_right, has_both, has_none, change_left_child,
        -- and change_right_child are defined here.
        --
    end -- class BINARY_TREE [T]

Other techniques integrated with a programming language have been proposed for configuring objects. In HyperCard, individual cards can extend or modify the definition inherited from their common background; this amounts to a per-instance tailoring of objects. With the object-oriented variants of LISP, the programmer chooses how to combine inherited methods in a new class, and thus controls the way the behaviour of subclasses is determined. In particular, it is possible to extend (with :before or :after methods), shadow (with primary methods), or refer selectively to (with :around methods) the operations of a superclass [27][39].

These techniques assume that all adaptations to existing classes are carried out through subclassing, by overriding some inherited properties. Sometimes, however, adaptations cannot be limited to local class adjustments: global changes to the hierarchy are required. Objective-C provides a mechanism where a user-defined class can “pose” as any other class in the hierarchy. When the “posing” class is installed in the system, it shadows the original definition. Objects depending on the “posed” class, whether by inheritance or by instantiation, do not have to be changed: the method dispatching scheme guarantees that a message sent to an object of the posed class actually results in invoking a procedure in a posing object. The posing class may override any method of the posed class, and define additional operations; it has access to all original, now shadowed, properties. The posing mechanism serves to change a class definition when its source code is unavailable.

2.3 Excuses

An approach analogous to the attribute redefinition techniques is described in [8]. The author proposes a formal mechanism for “excusing” abnormal cases that arise when modelling an application domain and that do not fit within the existing hierarchy. For example, a system keeping information on students may have to cope with the case of people who did part of their studies in foreign countries with different grading schemes and academic titles. Another problem occurs when attributes with contradicting properties are inherited through different paths. The traditional example, originally studied as a test for default reasoning in artificial intelligence, is based on the situation where a person is at the same time a Quaker (and therefore a pacifist) and a Republican (and therefore opposes pacifism). IS-A relations often exhibit such inconsistencies that can rarely be avoided when a hierarchy is initially constructed.


• Redefinition enables the programmer to actually alter the implementation of attributes. The body of a method may be replaced with a different implementation, which is then executed in the place of the code inherited from the superclass when the corresponding operation is invoked on an instance of the subclass. Eiffel also allows the type of inherited variables, parameters and function results to be redeclared, provided the new type is compatible with the old one. Finally, the pre- and post-conditions of a method may be redefined, as long as the new pre-condition (respectively the new post-condition) is weaker (respectively stronger) than the original one.

• Interfaces are not statically defined in Eiffel. A subclass may choose to change its list of visible attributes entirely, even when these are inherited. An attribute declared as private in an ancestor may be made accessible; conversely, a previously visible attribute may be excluded from the subclass interface.

The following excerpt from the Eiffel library illustrates the use of these various tailoring mechanisms. Notice the changes in class interfaces, the redefinition of the variable parent, and the renaming and overriding of the operations for creating tree objects.

    --
    -- Trees where each node has a fixed number of children (The number of children is
    -- arbitrary but cannot be changed once the node has been created).
    --
    class FIXED_TREE [T]
    export
        start, finish, is_leaf, arity, child, value, change_value, node_value,
        change_node_value, first_child, last_child, position, parent, first, last,
        right_sibling, left_sibling, duplicate, is_root, islast, isfirst, go,
        delete_child, change_child, attach_to_parent, change_right, change_left,
        go_to_child, wipe_out
    inherit
        ...
    feature
        parent : FIXED_TREE [T];
        Create (n : INTEGER; v : T) is
                -- Create node with node_value v and n void children.
            ...
        end ; -- Create
        ...
    end -- class FIXED_TREE

    --
    -- Binary trees.
    --
    class BINARY_TREE [T]
    export
        start, finish, is_leaf, arity, child, value, change_value, node_value,
        change_node_value, left, right, has_left, has_right, has_both, has_none,
        change_left_child, change_right_child
    inherit
        FIXED_TREE [T]
            rename
                Create as fixed_Create, first_child as left, last_child as right
            redefine
                parent


• Versioning records the major steps in class design and revision, as a way to deal with simultaneous updates to a hierarchy and to capture the intrinsic variations that exist in the modelling of an application domain, without loss of information.

2. Class Tailoring

2.1 Issues

Object-oriented programming draws a large part of its power and flexibility from the concept of inheritance. A class groups a number of attributes that are used either to store information (variables), or to implement the operations that an object supports (methods). New class structures are easily derived from previous ones through subclassing: one just needs to specify the parents from which the new definition acquires its essential properties, and then to augment this kernel of functionality with some additional, more specialized behaviour. In principle, a subclass has access to all attributes inherited from its ancestors, and so can build on the basic services they provide [42][48]. Inheritance has revealed itself to be an extremely powerful mechanism for incremental programming; its use has typically resulted in the development of medium-size to large hierarchies of software components comprising, among others, fundamental data structures (trees, stacks, queues, etc.), user interface tools (windows, menus), and graphical elements (lines, polygons) [12][22][49].

Quite often, programming does not follow the ideal scenario where superclasses, extended with additional attributes, naturally give rise to new object descriptions. Inherited variables and methods do not necessarily satisfy all the constraints to be enforced in specialized subclasses. Typically, one prefers an optimized implementation of a method to the general, and perhaps inefficient, algorithm defined in a superclass. Similarly, a variable with a restricted range may be more appropriate than one admitting any value. Moreover, subclassing serves several purposes simultaneously, like code sharing, type validation, or modelling generalization/specialization relationships; these various goals are difficult to reconcile in a unique structure [23]. Tailoring mechanisms alleviate these problems by allowing the programmer to replace unwanted characteristics from standard classes with properties better suited to new applications.

2.2 Language mechanisms

Object-oriented languages have always provided simple constructs for tailoring classes. These mechanisms have proved useful to overcome the difficulties caused by strict inheritance, which forces the programmer either to reuse all attributes from a superclass or to split a hierarchy into numerous partial class definitions, a solution feasible only if multiple inheritance is available. The Eiffel language constitutes a good example of the possibilities afforded by such mechanisms [37]; we present here an overview of its tailoring capabilities, which for the most part exist, perhaps in a different form, in many other programming languages.

• Renaming is the simplest way to effectively modify a class definition. Renamed variables and methods can no longer be referred to by their previous identifier, but they keep all their remaining properties, like their type or their argument list.


mers on these structures to discover imperfections in the hierarchy and to suggest alternative designs.

A second problem, related to class evolution, is that instances must be updated after their representation is modified. We consider in detail three techniques to tackle this crucial issue:

• Change avoidance consists in preventing any impact from class modifications on existing instances, for example by restricting the kind of changes that are brought to object definitions.

• Conversion physically transforms objects affected by a class change so that they conform to the new definition.

• Filtering hides the differences between objects belonging to several variants of the same class. This result is achieved by encapsulating instances with an additional software layer that extends their normal properties.
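The difference between conversion and filtering can be illustrated with a minimal Python sketch. All names (`Instance`, `convert`, `Filter`) and the representation of a class change as a set of removed attributes plus a table of default values for added ones are assumptions made for this example; they do not reproduce the mechanism of any particular system.

```python
class Instance:
    """A stored object together with the version of the class it was built from."""

    def __init__(self, version, fields):
        self.version = version
        self.fields = dict(fields)


def convert(obj, new_version, added_defaults, removed):
    """Conversion: physically rewrite the stored object to match the new class."""
    for name in removed:
        obj.fields.pop(name, None)
    for name, default in added_defaults.items():
        obj.fields.setdefault(name, default)
    obj.version = new_version


class Filter:
    """Filtering: wrap an old instance and present it as if it had the new
    definition, leaving its stored representation untouched."""

    def __init__(self, obj, added_defaults, removed):
        self._obj = obj
        self._defaults = dict(added_defaults)
        self._removed = set(removed)

    def get(self, name):
        if name in self._removed:
            raise AttributeError(name)        # hidden by the new definition
        if name in self._obj.fields:
            return self._obj.fields[name]
        if name in self._defaults:
            return self._defaults[name]       # value supplied by the filter layer
        raise AttributeError(name)


# Class change between version 1 and 2: "fax" is removed, "email" is added.
old = Instance(1, {"name": "Ada", "fax": "123"})
view = Filter(old, {"email": None}, {"fax"})
email_through_filter = view.get("email")      # None, the stored object is unchanged
stored_before = dict(old.fields)

convert(old, 2, {"email": None}, {"fax"})     # now the object itself is rewritten
```

Change avoidance needs no code of this kind: it amounts to rejecting the class change whenever such an adaptation of instances would be required.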

This article is divided into three parts. The first part explains the principles behind class tailoring, class surgery and class versioning, referring when appropriate to the research prototypes or industrial products that implement these techniques. For each approach, we illustrate its functionality with simple examples, and we give an appreciation of its advantages and of its shortcomings. The second part is devoted to class reorganization. It discusses informal guidelines for restructuring inheritance hierarchies and explains how these guidelines can be automated. Some recent results of the research performed by the author for developing class reorganization algorithms are highlighted; this paper should therefore not be considered strictly as a survey. Our automatic reorganization algorithms are described through several examples, and our contribution is evaluated in the context of the general class design problem. The third part considers the problem of propagating class changes to instances, and analyses how change avoidance, conversion and filtering relate to class modification methods. The conclusion compares the potential of the various methodologies and argues for an integrated approach to class evolution.

MODIFYING CLASS DEFINITIONS

The strategies adopted for modifying class hierarchies generally do not allow the programmer to perform free, uncontrolled changes to a collection of classes. The available methods make quite different assumptions on the scope and on the aims of the update operations that can be carried out on object descriptions. They are used at various points during application development, according to their perspective of class evolution.

• With tailoring, existing class definitions are not directly modified; adaptations are applied to inherited properties when deriving new subclasses. Keeping a stable hierarchy is a fundamental goal with this approach.

• Surgery considers all the implications of direct modifications to a class, and studies what additional adjustments are required to leave the whole hierarchy in a valid state. No matter how the classes evolve, a number of essential integrity constraints have to be satisfied.


• Software components are difficult to classify in pre-defined taxonomies. Specializing a category whenever a particular entity does not quite fit into it leads to the famous “one instance class” problem [34].

• Experience shows that stable, reusable classes are not designed from scratch, but are “discovered” through an iterative process of testing and improvement [25]. In particular, because of the variety of mechanisms provided by object-oriented languages, the best choice for representing a real-world entity in terms of classes is not always readily apparent [23].

• In principle, a hierarchy constitutes a standard kernel of functionality that can only be extended with additional subclasses. This may hamper the design and development of innovative software components if the modelling of objects in the hierarchy does not follow the advances in its corresponding application domain.

• Reusing software raises complex integration issues when teams of programmers share classes that do not originate from a common, compatible hierarchy. Classes may require significant adaptations (like reassigning inheritance dependencies or renaming properties) to be exchanged between different environments.

In fact, an important assumption must be satisfied for applying such powerful techniques as inheritance, genericity or delayed binding to application development. Real-world concepts have to be properly encapsulated as classes, so that they can be specialized or combined in a large number of programs. Inadequate inheritance structure, missing abstractions in the hierarchy, overly specialized components or deficient object modelling may seriously impair the reusability of a class collection. The collection must therefore evolve to eliminate such defects and improve its robustness and reusability.

1.2 The solutions

Among the approaches that have been proposed to control evolution in object-oriented systems,we identify the following general categories:

• Tailoring consists in slightly adapting class definitions when they do not lend themselves to easy subclassing. Most object-oriented languages provide built-in constructs for making limited adjustments to class hierarchies.

• Surgery. Every possible change to an object definition can be decomposed into specific, primitive update operations. Maintaining the consistency of a class hierarchy requires that the consequences of applying these primitives be precisely determined.

• Versioning enables teams of programmers to record the history of class modifications during the design process, to control the creation and dissemination of software components, and to try different paths in a coordinated way when modelling complex application domains.

• Reorganization of a class library is needed after significant, non-trivial changes are brought to it, like the introduction or the suppression of classes. Reorganization procedures use information on class structures and on the operations performed by program-

There are a number of reasons that explain why such difficulties arise with the object-oriented approach:

• User needs are rarely stable: additional constraints and functionality have to be constantly integrated into existing applications, resulting in considerable program restructuring.

CONTENTS

1. Introduction
1.1 The problem — 1.2 The solutions

MODIFYING CLASS DEFINITIONS

2. Class tailoring
2.1 Issues — 2.2 Language mechanisms — 2.3 Excuses — 2.4 Evaluation

3. Class surgery
3.1 Issues — 3.2 Class invariants — 3.3 Primitives for class evolution — 3.4 Completeness and correctness — 3.5 Evaluation

4. Class versioning
4.1 Issues — 4.2 The organization of version management — 4.3 Version identification — 4.4 Versioning and class evolution — 4.5 Evaluation

CLASS REORGANIZATION

5. Empirical guidelines
5.1 Issues — 5.2 General redesign rules — 5.3 Evaluation

6. Restructuring inter-attribute dependencies
6.1 Issues — 6.2 The Law of Demeter — 6.3 Application and examples — 6.4 Mechanical reorganization — 6.5 Evaluation

7. Advanced class reorganization techniques
7.1 Issues — 7.2 Model and terminology — 7.3 Principle of the algorithm — 7.4 Application — 7.5 Evaluation — 7.6 Assessing the potential of class reorganization

CHANGE PROPAGATION

8. Change avoidance

9. Conversion
9.1 Issues — 9.2 Instance transformation — 9.3 Immediate and delayed conversion — 9.4 Evaluation

10. Filtering
10.1 Issues — 10.2 Version compatibility — 10.3 Filtering mechanisms — 10.4 Making class changes transparent — 10.5 Evaluation

11. Conclusion

References

Managing Class Evolution in Object-Oriented Systems¹

Eduardo Casais

Abstract

Software components developed with an object-oriented language undergo considerable reprogramming before they become reusable in a wide range of applications or domains. Tools and methodologies are therefore needed to cope with the complexity of designing, updating and reorganizing vast collections of classes. This paper describes several techniques for controlling change in object-oriented systems, illustrates their functionality with selected examples and discusses their advantages and their limitations. As a complement to traditional approaches like version management, we propose new algorithms for automatically restructuring a hierarchy when classes are added to it. These algorithms not only help in handling modifications to libraries of software components, but they also provide useful guidance for detecting and correcting improper class modelling.

1. Introduction

1.1 The problem

Object-oriented languages are currently considered a promising approach for coping with the problems plaguing software development. Mechanisms like classification and the uniformity of the object model support the large-scale reuse and recombination of pre-defined software components—thus boosting programmer productivity. Subclassing allows additional definitions to be easily accommodated in a class hierarchy, through marginal extensions to inherited properties. With genericity and delayed binding, some characteristics of a class can be left unspecified, to be determined at a later time. Together with interactive tools like browsers and debuggers, a high-level graphical interface, and a set of standard reusable classes, object-oriented environments endow programmers with a rich toolkit for exploratory prototyping and present considerable advantages over traditional methodologies for developing applications incrementally [12][16][21].

However, software developers working with an object-oriented system are frequently led to modify extensively or even to reprogram existing classes so that they fully suit their needs. This is typically achieved by redefining variables, reimplementing methods, rearranging inheritance links, etc. Such modifications indicate that the collection of software components is not entirely satisfactory, since it cannot be reused in its current form. As an example, the library provided with Eiffel had to be partly redesigned when the second major version of the language was released, mainly to standardize class interfaces [38]. This problem occurred in spite of accumulated and documented experience in building comparable libraries with other programming languages—an unequivocal sign that class design is an intrinsically complex task.

1. Dennis Tsichritzis and authors, 1990. All rights reserved.