

University of Aarhus

DAIMI, Department of Computer Science

May, 15th, 1998.

Implementation of Rdbmap a framework for object-relational mapping layers

Michael Thomsen

[email protected]


Introduction

As described in (Thomsen, 1998), it is in many situations desirable to interface to relational databases from object-oriented languages. The report furthermore suggested that this integration could be achieved by designing a layer that has the responsibility of mapping data between the two very different data models present in object models and in relational databases. Through such a mapping layer it becomes possible to invoke high-level functions such as save, delete or fetch on individual objects at runtime. The mapping layer then "translates" these high-level operations into database-specific operations, and furthermore performs the mapping from objects to database records when saving and from records to objects when fetching.
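Rdbmap itself is written in BETA; as a language-neutral illustration, the following Python sketch shows the kind of translation a mapping layer performs. The customer table and the save function are hypothetical, not part of Rdbmap:

```python
import sqlite3

# Minimal sketch: a high-level save() is translated into an UPDATE
# (if the object is already in the database) or an INSERT (if not).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

def save(obj_id, name):
    # Try to update the existing record first ...
    cur = conn.execute("UPDATE customer SET name=? WHERE id=?", (name, obj_id))
    # ... and fall back to inserting if no record was found.
    if cur.rowcount == 0:
        conn.execute("INSERT INTO customer (id, name) VALUES (?, ?)", (obj_id, name))

save(1, "Alice")        # not yet in the database -> INSERT
save(1, "Alice Smith")  # already present -> UPDATE
print(conn.execute("SELECT name FROM customer WHERE id=1").fetchone()[0])
```

A real mapping layer would of course derive the SQL from a mapping description rather than hard-coding it, which is exactly what the rest of this report develops.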

While these mapping layers can be implemented as part of each and every application that needs this functionality, this is very time consuming, so a more fruitful approach is to include and use an abstract layer instead. These "abstract layers" usually fall into one of two categories: layers that are generated from a user-provided description of what to map and how, or layers that consist of a large amount of abstract code that is further specialised with a small amount of specific code performing the actual mapping of each object.

This report describes the design and implementation of a mapping layer of the latter kind, for the programming language BETA. The solution, called Rdbmap, is implemented as a framework. This framework contains two abstract classes called rdbClass and rdbRelationship, where rdbClass provides the fundamental functionality needed to map instances of one specific class to and from the relational database, whereas rdbRelationship provides functionality to map relations between objects. By creating specialisations of these abstract classes one can quickly and easily create a complete mapping layer.

This does not mean that the mapping layer could not be generated automatically based on a specification of the mapping, though. Preliminary investigations have shown that, given a mapping description, the specialisations of rdbClass and rdbRelationship can easily be generated automatically.

The general layout of the report is as follows: we start by describing the general design of Rdbmap and the important design criteria and decisions. Then we describe the implementation of Rdbmap in detail, and finally we describe the implementation of a sample application that uses Rdbmap to make its objects persistent in a relational database.

The design of Rdbmap

The design of Rdbmap was initially based on the design considerations in (Thomsen, 1998). In that report the following high-level operations were identified:

Save: The save operation takes as input a single object. If the object is already in the database, the data in the database corresponding to the object will be changed according to the values in the object. If the object is not in the database it is simply added to the database.

Delete: The delete operation takes as input a single object that has previously been fetched from the database, and deletes the data in the database corresponding to the object.

Fetch: The fetch operation takes as input a class and returns a list of all those instances of that class that are currently found in the database. To restrict the number of returned objects search criteria can be added.


Update: The update operation takes as input a single object that has previously been fetched from the database. It then compares the values in the object with the values in the database, and if the values in the database have been changed they will be fetched and the values in the object changed.

During the design of Rdbmap it was found desirable to split the save operation into two operations: a create operation that takes a newly created object and creates this object in the database, and a save operation that saves the current state of objects that have previously been fetched. Furthermore, the update operation was renamed to reFetch.
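The resulting set of operations can be summarised as an abstract interface. The following Python sketch (hypothetical class name; the report's actual interface is the BETA pattern rdbClass shown later) lists the five operations as they stand after the split and renaming:

```python
from abc import ABC, abstractmethod

# Sketch of the five high-level operations identified in the design.
class AbstractRdbClass(ABC):
    @abstractmethod
    def create(self, obj): ...        # insert a newly created object
    @abstractmethod
    def save(self, obj): ...          # write back a previously fetched object
    @abstractmethod
    def delete(self, obj): ...        # delete the corresponding record
    @abstractmethod
    def fetchAll(self, where=None): ...  # all stored instances, optionally filtered
    @abstractmethod
    def reFetch(self, obj): ...       # refresh obj from the database (was: update)
```

Concrete mapping classes would then subclass this interface, one per persistent class.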

Regarding how to provide support for these operations, it was, as mentioned in the introduction, decided to implement the support in the form of a framework. This is not the whole truth, though. In the design patterns book (Gamma et al., 1995) a distinction is drawn between toolkits and frameworks. A toolkit is here a library containing a set of related classes that provide general-purpose functionality, such as the Beta container library.

A framework, on the other hand, does not only provide functionality; more importantly, it also provides a general architectural design for the application that uses it. Furthermore, where a toolkit provides a finished solution, a framework must often be added to, in the sense that the main bodies of code will be there, but this code may call certain procedures that the user must provide.

The Rdbmap solution shares most of the properties of frameworks as described by Gamma et al.: Rdbmap does provide a general architecture for storing objects in relational databases, Rdbmap does contain the main bodies of code that perform this persistence, and the user of Rdbmap must provide implementations of a number of functions. But Rdbmap does not fully decide what the overall architecture of the application that uses it should look like. This is regarded as a positive feature, though. Since the framework is only concerned with persistence, it should in the author's opinion only enforce a specific architecture in that area. In fact, it was an important design criterion that the framework should be flexible and not too restricting with regard to the architecture.

Design criteria

Important in the design of Rdbmap were a number of design criteria that were chosen before the actual design was started. The most important were:

Orthogonality. The framework operations should be orthogonal to the type of the objects operated upon, i.e. the framework should be able to handle all types of objects, and it should not place any requirements on the object types, e.g. that they should be subtypes of rdbObject.

High transparency. The framework should be designed with a high level of transparency, in the sense that only the most important database operations on the objects should be explicit, whereas other less important operations should be made transparent to the programmer.

Ease of use. The framework should offer abstractions as strong as possible, in the sense that as little code as possible should have to be provided by the user of the framework.

Flexibility. The framework should only enforce a specific architecture regarding persistence aspects. The user should be able to choose how to organise other parts of the architecture, e.g. user interface – object model communication.


Design decisions

The operational interface

The first design decision to be made was how the above-mentioned high-level database operations should be invoked. Two possibilities existed: either the operations could be placed in an abstract interface that the user could inherit from to make his classes persistent, or the operations could be placed in a separate interface, which the user could call with the objects to be handled as parameters.

Following the criterion of orthogonality the latter possibility was chosen, as this does not place any requirements on the types of the classes to be handled. This was thought to be especially important in a language like Beta, since it does not offer multiple inheritance.

More concretely, it was decided to create a class called rdbClass, having create, save, delete, fetch and reFetch methods. To make e.g. a customer class persistent, the user could then create a specialisation of rdbClass, e.g. customerRdbClass, that would act as the persistence interface for the customer class. This was a simple solution, as the code that the user was to provide could then be added as bindings of virtual methods declared in rdbClass. A simplified illustration of the abstract interface and a simple use of it are shown below in Figure 1.

Figure 1. Simplified interface class (left) and simple use (right):

    rdbClass:
      (# classType:< object;
         create: (# theObject: ^classType enter theObject[] do … #);
         save: (# … do … #);
         delete: (# … do … #);
         refetch: (# … do … #);
         fetchAll: (# currentObject: ^classType do … inner #)
      #)

    (# customer: (# … #);
       aCustomer: ^customer;
       customerRdbClass: rdbClass(# classType:: customer #);
       custRdb: @customerRdbClass;
    do &customer[]->aCustomer[];
       …
       aCustomer[]->custRdb.save;
       …
    #)

Handling rdb primary keys

In the design report, the mapping section discussed how to handle primary keys. The conclusion was that for each class to be mapped, either one of the attributes already present in the class should be used, or a new attribute should be added to hold the value. Furthermore, the report discussed how to handle situations where the primary key is changed in the object after it has been fetched. Shortly recapped, the solution presented was that when the object values were read from the database, a copy of the primary key value was stored in the cache. When later saving the object, the original record in the database could be located using the (unchanged) copy of the primary key value found in the cache.

During the design of Rdbmap it became apparent that this solution was not sensible. The problem is that it is only correct when no more than one client application uses the database. In the case where multiple clients operate on the same records, it becomes impossible to locate records again when saving if another client has changed the record's primary key in the meantime. For this reason it was decided that changing the primary key values should not be allowed at any point. This is not a serious limitation though; according to my understanding of relational database design, it is considered good style not to change primary keys after records have been created anyway.
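The multi-client problem can be demonstrated concretely. In this Python sketch (hypothetical customer schema; SQLite stands in for the shared database), client A caches the original key, client B changes it, and A's subsequent save locates no record:

```python
import sqlite3

# Shared database with one record whose key client A has cached.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customer VALUES (7, 'Alice')")

cached_key = 7                                      # client A fetched and cached the key
db.execute("UPDATE customer SET id=99 WHERE id=7")  # client B changes the primary key

# Client A now tries to save using its cached copy of the key.
cur = db.execute("UPDATE customer SET name='Alicia' WHERE id=?", (cached_key,))
print(cur.rowcount)   # 0 -- the record can no longer be located
```

Forbidding primary key changes removes this failure mode entirely.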

Generating primary keys

Another complication regarding primary keys not anticipated in the design report was how new primary keys for new objects/records should be generated. As the report suggested storing the primary key value in one of the object attributes, it also (perhaps implicitly) suggested that the primary key value should be generated by the application creating the object. This is often problematic though, again especially with multiple clients, the reason being that a primary key has to be unique among all clients. This means that the key has to be generated at some central point shared by all clients. This could of course be handled by some sort of key server, but a much simpler solution is to generate the key at the point that is already shared: the database.

This can be done in a number of ways. The simplest solution is using a database that supports auto-incrementing primary keys. In these databases the primary key column is simply defined, when creating the table, to be of an auto-incrementing type, e.g. the type Counter in Access, Identity in SQL Server or Serial in Informix. Having done this, a unique primary key value is automatically generated as new records are inserted. In databases that do not support auto-incrementing fields, such as DB2, these types can be simulated using triggers, if the database supports triggers.
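The same mechanism exists in modern drivers. As a sketch (SQLite is used here purely for illustration; it is not one of the databases named above), an INTEGER PRIMARY KEY column auto-assigns keys, and the driver exposes the generated key after each insert:

```python
import sqlite3

# The key column is auto-incrementing; the application never supplies a key.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

cur = db.execute("INSERT INTO customer (name) VALUES (?)", ("Alice",))
first = cur.lastrowid    # key generated by the database for this insert
cur = db.execute("INSERT INTO customer (name) VALUES (?)", ("Bob",))
second = cur.lastrowid
print(first, second)     # 1 2
```

Note that this driver also reports the generated key back to the client, which is exactly the facility ODBC lacked at the time, as discussed next.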

Leaving the database with the responsibility of generating the primary key does not solve the whole problem though. As the newly created record needs to be stored in the rdbClass cache, the client program must know the value of the generated primary key. How this problem was solved is discussed further in the implementation section.

Handling database relationships

The design report also briefly discussed the handling of relationships between objects, i.e. how much should be fetched when an object related to other objects is fetched. The conclusion in the report was that some sort of lazy-fetch should be used, so that when fetching an object, initially only the simple values and the primary keys of all related objects are fetched. As the relations are followed, code is triggered that fetches the related objects using the primary key values. The report contained no discussion of how to implement this lazy-fetch, however.

Two approaches were considered during the design phase: implementing lazy-fetch using pointer trapping, and implementing lazy-fetch by only simulating pointer trapping. Implementing it using pointer trapping refers to the possibility of changing the actual machine code executed as the reference is followed, to instead trigger code that would fetch the related object, set the reference to point to this newly created object and then follow the reference. This was considered too complicated, though, as the Beta implementation currently provides no interfaces or hooks for doing this. Instead lazy-fetch was simulated by wrapping the reference in a pattern having set and get methods, as will be described later.

Implementation of Rdbmap

In this section we start by describing BDBC, a library for communicating with relational databases using SQL, as Rdbmap is based on this library. We then describe the final interface of the two main classes in Rdbmap: rdbClass and rdbRelationship. Finally we describe


how lazy-fetch and other relationship functionality was implemented by defining a new set of relationship classes.

Overview of BDBC

As mentioned, the Rdbmap framework communicates with relational databases through the BDBC (Beta DataBase Connectivity) library. In this chapter we give a short introduction to this library. For a further description of BDBC, and specifically of how BDBC is implemented, see (Hansen and Wells, 1998).

The BDBC philosophy

BDBC is meant as a library that provides a fundamental interface for communicating with any relational database. 'Fundamental' is here meant in the sense that the interface offers "low-level" abstractions such as database statements, queries, etc., but not any higher-level abstractions such as saving objects or the like.

The database independence in BDBC is achieved by not communicating directly with the native database API. Instead communication is done through the ODBC (Open DataBase Connectivity) API, a generic database interface originally developed by Microsoft, but now under standardisation by X/Open, ANSI and ISO. This interface provides, combined with database-specific ODBC drivers, a uniform way of communicating with all databases that provide an ODBC driver.

For ordinary application programmers, ODBC is not suitable to use as a database API directly from Beta, though. First of all, ODBC is a C interface, so all calls would have to be external calls, which are perhaps not the most pleasant to work with. Furthermore, the ODBC interface is very complicated, as it has a very large number of functions with complicated types and errors. For these reasons and others it is much more pleasant for the application programmer to work with the BDBC interface instead.

The BDBC interface

The BDBC interface is built around the concept of a database connection. A database connection models a single ODBC connection, which again is a named connection to a relational database. In BDBC the interface to a connection is (the full interface can be seen in appendix A):

    Connection:
      (* A connection to an ODBC compliant database *)
      (# <<SLOT ConnectionLib:Attributes>>;
         Info: @ (*) ...;
         commit: (*) ...;
         rollBack: (*) ...;
         SQLStatement: (*) ...;
         DirectSQLStatement: (*) SQLStatement ...;
         PreparedSQLStatement: (*) DirectSQLStatement ...;
         ResultSet: (*) ...;
         open:< (* The name of the connection to be opened must be supplied.
                 * Supplying userName and/or password is voluntary. *) ...;
         close:< ...;
         connectionException:< BDBCException ...;
         connectionWarning:< BDBCWarning ...;
         private: @<<SLOT ConnectionPrivate:Descriptor>>
      #);


To use a connection, the application programmer must first create an ODBC connection [1]. Then, by creating an instance of the connection pattern and calling the open method with the name of the connection, and if applicable the username and password, as parameters, communication with the database becomes possible.

Data manipulation and definition is done using the statement methods. One can choose between using either a DirectSQLStatement or a PreparedSQLStatement. A PreparedSQLStatement differs from a DirectSQLStatement in that the SQL statement is parsed and prepared by the database when the statement is initialised. The prepared statement is thus a little slower to initialise but much faster to execute, which makes it more efficient in those cases where the same statement has to be executed several times [2].

The statements have the following interfaces:

    SQLStatement: (*)
      (# ContentsType:< ...;
         Contents: @ContentsType;
         execute:<
           (# res: ^ResultSet
              <<SLOT SQLStatementExecute:DoPart>>
           exit Res[]
           #);
         setColumn: ...;
         setInteger:< setColumn ...;
         setReal:< setColumn ...;
         setText:< setColumn ...;
         setBoolean:< setColumn ...;
         close: ...;
         checkColumn:< booleanValue ...;
         execException:< BDBCException ...;
         execWarning:< BDBCWarning ...;
         private: @<<SLOT SQLStatementPrivate:Descriptor>>
      #);

    DirectSQLStatement:
      (* Use this statement type if a statement will be executed at most a few times *)
      SQLStatement ...;

    PreparedSQLStatement:
      (* Use this statement type if a statement will be executed multiple
       * times with different bindings.
       * ONLY IMPLEMENTED AS DIRECT STATEMENT *)
      DirectSQLStatement (# #);

To use a statement one must first set the SQL contents. This is done by calling the contents method with a text holding the SQL command. The text given as argument can contain a number of placeholders denoted by ?, e.g. 'SELECT * FROM table1 WHERE id=?'. These placeholders can later be set, when their values are known, using the setInteger, setReal, setText or setBoolean methods, after which the statement can be executed using the execute method. After execution the contents can be changed, the placeholders can be re-set, or the statement can be closed.
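The same ?-placeholder convention survives in modern database APIs, so the idea can be sketched outside BETA. In Python's DB-API (SQLite used here as a stand-in database), the statement text is fixed once and the placeholder values are supplied at execution time:

```python
import sqlite3

# Hypothetical table matching the BDBC example 'SELECT * FROM table1 WHERE id=?'.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
db.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "a"), (2, "b")])

sql = "SELECT name FROM table1 WHERE id=?"   # contents with one ? placeholder
row = db.execute(sql, (2,)).fetchone()       # placeholder value bound at execute time
print(row[0])   # b
```

Binding values this way, rather than pasting them into the SQL text, is also what makes re-executing the same statement with different values cheap.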

If the statement that was executed has a result, e.g. if it was a select statement, this result can be read by transferring the exit value of the execute method to a reference to a resultset. A resultset implements a simplified cursor interface in the following way:

[1] How an ODBC connection is created differs from one platform to another. On the Windows platform it is done in the ODBC settings program, which can be found in the Control Panel.

[2] Note, though, that presently PreparedSQLStatement has not been implemented and is treated as a DirectSQLStatement.


    ResultSet:
      (* Result of a SQLStatement.
       * If Info.NoOfCols = 0 then the ResultSet is scanable.
       * A ResultSet can be scanned at most once. *)
      (# Result:
           (# getColumn: (* Get data in column 'i' *) ...;
              getInteger: getColumn ...;
              getReal: getColumn ...;
              getText: getColumn ...;
              getBoolean: getColumn ...;
              getColumnByName:
                (* Get the data in the first column in the result designated by 'name' *) ...;
              getIntegerByName: getColumnByName ...;
              getRealByName: getColumnByName ...;
              getTextByName: getColumnByName ...;
              getBooleanByName: getColumnByName ...;
              resultException:< BDBCException ...;
              resultWarning:< BDBCWarning ...;
              private: @<<SLOT ResultPrivate:Descriptor>>
           #);
         Info: @ ...;
         scan:
           (# current: @Result
              <<SLOT resultSetScan:DoPart>>
           #);
         resultSetException:< BDBCException ...;
         resultSetWarning:< BDBCWarning ...;
         private: @<<SLOT ResultSetPrivate:Descriptor>>
      #);

As can be seen, the interface to a resultset is quite simple. The scan method iterates over the rows that the statement resulted in, and for each of these rows, represented by the result pattern, one can access each of the row's values through the get methods.
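The scan-and-get pattern has a direct analogue in other database APIs. As a sketch (Python with SQLite as a stand-in), iterating a cursor corresponds to scan, and name-based row access corresponds to the get...ByName methods:

```python
import sqlite3

# With sqlite3.Row as row factory, rows support access by column name,
# analogous to ResultSet's getTextByName / getIntegerByName.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row
db.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
db.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "a"), (2, "b")])

names = []
for row in db.execute("SELECT id, name FROM customer ORDER BY id"):  # "scan"
    names.append(row["name"])                                        # get-by-name
print(names)   # ['a', 'b']
```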

The rdbClass interface

As introduced earlier, rdbClass provides the abstract interface containing the create, save, delete, fetchAll, reFetch and lazyFetch operations. The class rdbRelationship has a similar function: it provides the operations used to create relations between objects.

The interface of rdbClass is (the full interface can be seen in appendix B):

    rdbClass: (*)
      (# <<SLOT rdbClassLib:Attributes>>;

         (* ===== Debug and private stuff ===== *)
         debug:< booleanValue;
         debugPrint: ...;
         private: @<<SLOT rdbClassPrivate:Descriptor>>;

         (* ===== Virtual declarations ===== *)
         (* ----- Misc. types ----- *)
         classType:< (* The class to provide an rdb interface for *) object;

         (* ----- Methods for performing the object to/from rdb mapping ----- *)
         obj2rdb:< (# theObject: ^classType; set...(##) enter theObject[] do INNER #);
         rdb2obj:< (# theObject: ^classType; get...(##) do INNER exit theObject[] #);
         initializeObj:< (# theObject: ^classType; theKey: @integer
                         enter (theObject[], theKey) do INNER #);

         (* ===== Database primary key handling ===== *)
         getKeyBeforeInsert:< getKeyBefore;
         getKeyAfterInsert:< getKeyAfter;
         getKeyBefore: (# theObject: ^classType; theKey: @integer
                       enter theObject[] do INNER exit theKey #);
         getKeyAfter: (# theKey: @integer do INNER exit theKey #);
         getMaxKey: getKeyAfter ...;
         getKeyFromCache: ...;

         (* ===== Initialisation and finalisation ===== *)
         init:< ...;
         close:< ...;

         (* ===== Methods for the main rdb operations ===== *)
         create:< (# theObject: ^classType enter theObject[] do ... #);
         save:< (# theObject: ^classType enter theObject[] do ... #);
         delete:< (# theObject: ^classType enter theObject[] do ... #);
         fetchAll: (# current: ^classType do ... INNER ... #);
         reFetch:< (# theObject: ^classType enter theObject[] do ... exit theObject[] #);
         lazyFetch:< (# theObject: ^classType; theKey: @integer
                     enter theKey do ... exit theObject[] #)
      #)

As can be seen, the class has quite a few virtual declarations. These represent the code that the user must provide to make the framework fully functional. Their responsibilities can be summarised as follows:

classType:< Must be final bound to the class that this rdbClass offers an rdb interface for.

obj2rdb:< Must be final bound to perform the mapping of the values of theObject to the database, using the set methods.

rdb2obj:< Must be final bound to perform the mapping of the values in the database to theObject, using the get methods.

initializeObj:< Can be final bound if any initialisation of an object is needed the first time it is "seen" by the framework. This is often used to initialise the code in relations between objects.

getKeyBeforeInsert:< Can be final bound to "calculate" the primary key of theObject before it is inserted into the database.

getKeyAfterInsert:< Can be final bound to "calculate" the primary key of theObject after it has been inserted into the database. At least one of the getKeyBefore/AfterInsert methods must be bound.

init:< Must be final bound to set the variables tableName, keyColumnName, readColumnNames and writeColumnNames to hold the name of the table that instances of classType map to, the name of the primary key column, the names of the columns that should be read from when fetching, and the names of the columns that should be written to when saving.

close:< Can be final bound to provide further finalisation.

Furthermore, rdbClass provides the operations create, save, delete, fetchAll and reFetch, which all behave as described earlier. One additional method is present: the lazyFetch method. This method is used to implement lazy-fetch, as will be described later.

Implementation of the rdbClass interface

Having described the rdbClass interface, we will now describe how the six functions create, save, delete, fetchAll, reFetch and lazyFetch have been implemented. The full code for rdbClass.bet and rdbClassBody.bet can be found in appendices B and C.

Initialising the rdbClass

Central to the implementation are five SQL statements that are used in the various do-parts of the six functions. These five SQL statements are allocated in the private part of the rdbClass, and their contents are calculated when the rdbClass is initialised with the table and column names. The following table summarises the five statements:

    Statement name   Purpose                 Used by                Contents
    insertQuery      Insert a new record     create                 insert into %tablename (%writecolumn1, …, [%key]) values (?, …)
    updateQuery      Update a record         save                   update %tablename set (%writecolumn1=?, …) where %key=?
    deleteQuery      Delete a record         delete                 delete from %tablename where %key=?
    selectOneQuery   Select a single record  reFetch and lazyFetch  select (%readcolumn1, …) from %tablename where %key=?
    selectStarQuery  Select all records      fetchAll               select (%readcolumn1, …, %key) from %tablename [where …]

The five queries are all initialised using the information about the table name, the read and write column names and the primary key column name supplied in the binding of init. The first four queries are completely known at the point of initialisation, so they are instances of preparedSQLStatement, whereas the selectStarQuery is not fully known, as the user can supply an SQL where clause when calling fetchAll; it is therefore "just" a directSQLStatement.

To illustrate the approach, assume that a specialisation of rdbClass for a customer class has to be made, and that the customer class has only two attributes: a text called name and an integer called age. The rdbClass would then have an init specialisation that could look like:


    customerRdb: rdbClass
      (# classType:: customer;
         …
         init::
           (# do 'customer'->tableName;
              'id'->keyColumnName;
              'name;age'->readColumnNames;
              'name;age'->writeColumnNames
           #)
      #)

This would result in e.g. the insertQuery being initialised to:

    insert into customer (name, age) values (?, ?)
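The derivation of the five statements from the init information can be sketched outside BETA as a plain string-building function. This Python sketch (hypothetical function name; it mirrors, not reproduces, the Rdbmap code) shows how the table name, key column and column lists determine each query:

```python
# Build the five SQL statement texts from the names supplied in init.
def build_queries(table, key, read_cols, write_cols):
    return {
        "insertQuery": "insert into %s (%s) values (%s)"
                       % (table, ", ".join(write_cols), ", ".join("?" * len(write_cols))),
        "updateQuery": "update %s set %s where %s=?"
                       % (table, ", ".join(c + "=?" for c in write_cols), key),
        "deleteQuery": "delete from %s where %s=?" % (table, key),
        "selectOneQuery": "select %s from %s where %s=?"
                          % (", ".join(read_cols), table, key),
        "selectStarQuery": "select %s, %s from %s"
                           % (", ".join(read_cols), key, table),
    }

q = build_queries("customer", "id", ["name", "age"], ["name", "age"])
print(q["insertQuery"])   # insert into customer (name, age) values (?, ?)
```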

Mapping the objects

As can be seen in the above table, all queries contain placeholders (the question marks) that must be set before the query can be executed. This is where the obj2rdb and rdb2obj virtuals come into play: in the execute part of the queries, the query will call one of these to have its placeholders set. Take as an example the obj2rdb virtual:

    obj2rdb:<
      (# theObject: ^classType;
         theQuery: ^preparedSQLstatement;
         nextPlaceHolder: @integer;
         setOp: ...;
         setInteger: setOp ...;
         setReal: setOp ...;
         setBoolean: setOp ...;
         setText: setOp ...
      enter (theQuery[], theObject[], nextPlaceHolder)
      do INNER obj2rdb
      exit nextPlaceHolder
      #);

Returning to the example above, the insertQuery would have two placeholders that should be set, namely the two that hold the values for the name and age columns. To set these, the following specialisation of the obj2rdb virtual in the customerRdb specialisation would be appropriate:

    customerRdb: rdbClass
      (# …
         obj2rdb::
           (# do theObject.name[]->setText;
              theObject.age->setInteger
           #)
      #)

The code executed in each of the set operations is quite simple: It simply increments nextPlaceHolder by one and then forwards the call to the corresponding set method defined on theQuery, which is a BDBC statement as described above.

Mapping in the opposite direction works similarly, only using get operations instead of set operations.
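The counter-and-forward behaviour of the set operations can be sketched in Python (hypothetical class and method names; the real mechanism is the BETA setOp pattern shown above):

```python
# Sketch of obj2rdb: each set operation advances the placeholder counter
# and forwards the value to the underlying statement's bind list.
class Obj2RdbContext:
    def __init__(self, bound_values):
        self.bound_values = bound_values   # stands in for theQuery's bindings
        self.next_placeholder = 0

    def set_value(self, value):            # stands in for setText/setInteger/...
        self.next_placeholder += 1
        self.bound_values.append(value)

params = []
ctx = Obj2RdbContext(params)
ctx.set_value("Alice")   # name  -> placeholder 1
ctx.set_value(42)        # age   -> placeholder 2
print(params, ctx.next_placeholder)   # ['Alice', 42] 2
```

Because the counter lives in the context and not in the user's binding, the user's obj2rdb code stays a simple, order-dependent list of set calls, as in the customerRdb example above.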


Handling the primary key generation

As described in the design section it was realised that the framework should allow the primary key to be generated in the database. But it was also decided that it should be possible to generate it in the application.

Both of these approaches are handled in the implementation through the two virtuals getKeyBeforeInsert and getKeyAfterInsert. If the first is bound, the value of theKey will simply be written in the primary key column when inserting a new record.

The other virtual is a bit more complicated. It is used in the situation where the database generates the key. As described earlier, the framework must, after the record has been inserted, be told what the value of the generated key is, as the key is used in the object cache (see below). This is problematic, as there is no way of exposing the values of a record that has just been inserted through ODBC, and therefore through BDBC. Consequently a work-around has been made: by final binding getKeyAfterInsert to the pattern getMaxKey, the framework user instructs the framework that it can find the generated key by performing an SQL select max(id) statement on the table. This is a sensible solution as long as the insert call and the select max call are both executed within one transaction, to ensure that a primary key created by another client's insert call isn't the one found by select max.
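The getMaxKey work-around can be sketched as follows (Python with SQLite standing in for the database; the schema is hypothetical). The insert and the select max run on the same connection, within the single implicit transaction the driver opens at the insert:

```python
import sqlite3

# Database-generated keys, with select max(id) used to learn the new key.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# Insert without supplying a key; the database generates it.
db.execute("INSERT INTO customer (name) VALUES (?)", ("Alice",))
# Within the same transaction, the highest key is the one just generated.
key = db.execute("SELECT max(id) FROM customer").fetchone()[0]
db.commit()
print(key)   # 1
```

Outside a transaction this would race with other clients' inserts, which is exactly the caveat noted above.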

Handling the uniqueing problem

In the design report the uniqueing problem was mentioned: a single record should always correspond to at most a single object in memory, even if the record is read into memory several times.

This is handled by maintaining a table of so-called cache objects. These cache objects contain a reference to the object that has been fetched and a copy of the primary key of the corresponding record.

This table is searched when objects are about to be instantiated in the fetchAll and lazyFetch operations: if an entry with the same key is found, a new object is not instantiated; instead, the object pointed to by the cache object is returned.

The cache table is also used when creating a record for a new object. In this case the table is searched to ensure that no corresponding item is found; if one is found, the object has previously been fetched, and should therefore not be created.

Finally, the cache table is used in the save, delete and reFetch operations to ensure that the object the operation was invoked upon has in fact been fetched earlier. This is done by checking whether a corresponding item can be found in the cache table.
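The cache table is essentially an identity map keyed by the primary key. A minimal Python sketch (hypothetical Customer class and fetch function) shows the uniqueing property it guarantees:

```python
# Uniqueing cache: at most one in-memory object per database record.
class Customer:
    def __init__(self, key):
        self.key = key

cache = {}   # primary key -> already fetched object

def fetch(key):
    if key in cache:          # record already fetched: return the same object
        return cache[key]
    obj = Customer(key)       # otherwise instantiate and remember it
    cache[key] = obj
    return obj

a = fetch(7)
b = fetch(7)
print(a is b)   # True -- both fetches yield the identical object
```

The create-time check described above corresponds to refusing to create when the key is already present in the cache, and the save/delete/reFetch check corresponds to requiring that it is.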

Handling relationships

How relationships are handled by rdbClass will be treated in the next section on rdbRelationship

The rdbRelationship interface and rdbRelations

rdbRelationship

While rdbClass handles the mapping of simple attributes to and from the database, something more is needed to handle references or relationships between objects. rdbRelationship focuses on those relationships that are represented in the database as relationship tables. Such tables are usually used to model a relation between records in two tables, and simply have as columns the primary key columns of the two related tables.

rdbRelationship is meant to provide a high-level interface to these relationship tables. It is usually not used by the framework user directly, but rather indirectly through the rdbRelations classes discussed below. The rdbRelationship class has the following interface (the full interface can be seen in appendix B):

rdbRelationship:
  (# (* ===== Debug and private stuff ===== *)
     debug:< booleanValue;
     debugPrint: (# ... #);
     private: @<<SLOT rdbRelationshipPrivate:Descriptor>>;

     (* ===== Initialisation and finalisation ===== *)
     init:< (# do ... #);
     close:< (# do ... #);

     (* ===== Functional interface ===== *)
     addRelation:
       (# leftKey,rightKey: @integer
       enter (leftKey,rightKey)
       do ...
       #);
     removeRelation:
       (# leftKey,rightKey: @integer
       enter (leftKey,rightKey)
       do ...
       #);
     removeAllLeftRelations:
       (# rightKey: @integer enter rightKey ... #);
     removeAllRightRelations:
       (# leftKey: @integer enter leftKey do ... #);
     scanLeft:
       (# rightKey,currentKey: @integer;
       enter rightKey
       do ...
       #);
     scanRight:
       (# leftKey,currentKey: @integer;
       enter leftKey
       do ...
       #)
  #)

The only two virtuals, init and close, perform functions similar to those in rdbClass, and the init method in rdbRelationship must likewise be bound to set the variables "tableName", "leftKeyColumnName" and "rightKeyColumnName". These variables are used, as in rdbClass, to initialise the SQL statements located in the private part of rdbRelationship.

The operations in rdbRelationship and their functionality are as follows:

addRelation Adds a new relation between two records

removeRelation Removes a relation between two records

removeAllLeftRelations Removes all relations starting from the right record

removeAllRightRelations Removes all relations starting from the left record

scanLeft Scans all records related to a right record

scanRight Scans all records related to a left record
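The six operations above can be sketched in Python over an in-memory set of key pairs standing in for the relationship table (all names are hypothetical; this is not the BETA implementation, which issues SQL against the real table):

```python
# Sketch of rdbRelationship's functional interface over a set of
# (leftKey, rightKey) pairs representing rows in a relationship table.

class Relationship:
    def __init__(self):
        self._pairs = set()

    def add_relation(self, left, right):
        self._pairs.add((left, right))

    def remove_relation(self, left, right):
        self._pairs.discard((left, right))

    def remove_all_left_relations(self, right):
        # Drop every pair pointing at the given right record.
        self._pairs = {(l, r) for (l, r) in self._pairs if r != right}

    def remove_all_right_relations(self, left):
        # Drop every pair starting from the given left record.
        self._pairs = {(l, r) for (l, r) in self._pairs if l != left}

    def scan_left(self, right):
        # All left keys related to the given right key.
        return sorted(l for (l, r) in self._pairs if r == right)

    def scan_right(self, left):
        # All right keys related to the given left key.
        return sorted(r for (l, r) in self._pairs if l == left)

rel = Relationship()
rel.add_relation(1, 10)
rel.add_relation(1, 11)
rel.add_relation(2, 10)
assert rel.scan_right(1) == [10, 11]
assert rel.scan_left(10) == [1, 2]
rel.remove_all_right_relations(1)
assert rel.scan_right(1) == []
```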

rdbRelations

Having described rdbRelationship, which provides an interface to relation tables in the database, we need to consider how we map the relations between objects to the database using this interface. But let us first take a look at how these relations between objects are normally used in BETA. In the Mjolner Beta System, object models are normally created using the CASE tool Freja, so we will take the code generated by Freja as our starting point. In this tool, one-to-many and many-to-many relations are implemented using the "AssociationOne" and "AssociationMany" classes, which have the following interfaces:

AssociationOne:
  (# element:< Object;
     elm: ^element;
     associate: (# enter elm[] #);
     deassociate: (# do none->elm[] #);
     get: (# exit elm[] #)
  #);


AssociationMany: List
  (# element::< Object;
     elm: ^element;
     associate: (# enter append do INNER associate #);
     deassociate:
       (# elm: ^element;
          pos: ^theCellType;
          notFound:< Notification ...;
          emptyAssociation:< Notification ...
       enter elm[]
       do elm[]->at->pos[];
          (if pos[] <> none then pos[]->delete ... else notFound if);
          INNER deassociate
       #)
  #);

As can be seen, these association classes do not hide their implementation very well. As a result, exchanging them for another set of association classes would require that the new implementation offered at least the same operations; in the case of AssociationMany this would include the complete interface of the list container! It was therefore decided that it would be more reasonable if associations were represented by a set of association classes with a more restricted interface. A suggestion for such an interface could be (see also appendix D):

relationOne:
  (# <<SLOT relationOneLib:Attributes>>;
     (* Datatypes *)
     element:< Object;
     relationElement:< (# one: ^element #);
     (* Methods *)
     set:< ...;
     get:< ...;
     clear:< ...
  #);

relationMany:
  (# <<SLOT relationManyLib:Attributes>>;
     (* Datatypes *)
     element:< Object;
     relationElement:< (# many: ^element #);
     (* Methods *)
     add:< ...;
     remove:< ...;
     scan:< ...;
     clear:< ...;
     noOfElements:< ...;
     empty: ...
  #);

These classes have pure abstract interfaces, so that different implementations can be provided. As an example, an implementation using the list container has been made (see appendix E). More importantly in our context, the interfaces can also be implemented not only to handle the relations in memory, but also to map changes in the relationships directly into the database, and to handle lazy-fetch.

Lazy-fetch is easily implemented by further binding the relationElement type to also have a key attribute. The get/scan methods can then be implemented to check whether the reference is none, and if so use the lazyFetch method on the relevant rdbClass to fetch the object, and then set the reference in the relationElement to point to it. Furthermore, by implementing the add, remove and clear operations so that they call the corresponding operations in the relevant rdbRelationship class, changes in the relationship are also mapped automatically to the database.
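The lazy-fetch scheme can be sketched as follows (Python, not BETA; LazyRef and fetch_by_key are hypothetical stand-ins for a relationElement with a key attribute and the lazyFetch method on the relevant rdbClass):

```python
# Sketch: a relation element that stores only the related record's key
# and resolves the object reference on first access.

class LazyRef:
    def __init__(self, key, fetch_by_key):
        self.key = key            # primary key of the related record
        self._fetch = fetch_by_key
        self.obj = None           # reference not yet resolved

    def get(self):
        if self.obj is None:      # first access: fetch from the database
            self.obj = self._fetch(self.key)
        return self.obj           # later accesses reuse the reference

calls = []
def fetch_by_key(key):
    # Stand-in for rdbClass's lazyFetch; records how often it is called.
    calls.append(key)
    return {"id": key, "name": "row %d" % key}

ref = LazyRef(5, fetch_by_key)
a = ref.get()
b = ref.get()
assert a is b          # fetched once, then cached in the element
assert calls == [5]    # the database was hit only on first access
```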


In summary, implementing the methods on relationOne and relationMany to automatically map all changes to the database, transparently to the programmer, is quite easy when using the rdbClass and rdbRelationship classes (see appendix F for details). All that is needed is the above-described code, and that the relation class is initialised with references to the relevant rdbClass and rdbRelationship instances. Even this initialisation has been easily catered for: by further binding the initializeObj virtual in rdbClass to perform it, the initialisation is done automatically.

Testing the implementation

The implementation of rdbClass and rdbRelationship has been tested using a structural white-box test, in which at least every branch of the code is exercised. The test program can be seen in appendix G, and the results of this test in appendix H.

Furthermore the implementation of rdbClass, rdbRelationship and rdbRelations has been tested using a black-box style test, by implementing a sample application that used the framework. This sample application is described in the next section.

Problems and limitations

A number of limitations and problems with the current solution have been identified. These are summarised here.

Limitations

Platform dependence

Rdbmap is currently platform dependent, as it only runs on the Windows 95 and NT platforms. This is not due to Rdbmap itself, but to the fact that BDBC is currently only implemented on these platforms. Plans have been made for implementing BDBC on other platforms such as Unix and Macintosh.

Problems

fetchAll

It was decided in the original design document that it should be possible to narrow the set of objects returned by the fetchAll operation by some user-supplied criteria, and several ways of expressing these criteria were suggested. In the current solution this has been implemented by letting the user give a text argument to the fetchAll operation; this argument is then used directly as the SQL WHERE clause in the select call that fetches the objects. This is a problematic approach, as the user then not only needs to know SQL but also needs to know the details of the database schema.
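The described criterion handling amounts to simple string concatenation, which can be sketched as follows (table and column names are hypothetical):

```python
# Sketch of fetchAll's criterion handling: the user's text argument is
# pasted verbatim into the WHERE clause of the generated SELECT.

def build_fetch_all_sql(table, criterion=None):
    sql = "SELECT * FROM %s" % table
    if criterion:
        sql += " WHERE " + criterion  # user must know SQL and the schema
    return sql

assert build_fetch_all_sql("Recipe") == "SELECT * FROM Recipe"
assert (build_fetch_all_sql("Recipe", "Name = 'Soup'")
        == "SELECT * FROM Recipe WHERE Name = 'Soup'")
```

Besides demanding SQL and schema knowledge from the user, pasting user-supplied text directly into a statement would today also be considered unsafe, since it permits SQL injection.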

The cache table

Currently the cache table is implemented using a hash table, to ensure that objects can be found quickly. The hash function is not currently bound to anything though, as the objects have no unique identifier (e.g. an OID), which means there is no meaningful value to hash on.


A sample application

We will now give an example of a small application that uses Rdbmap to make the objects from its object model persistent. The example has two purposes: To give a clearer picture of how Rdbmap should be used, and to convince the reader that the presented solution is sensible.

In general, to use the Rdbmap framework the developer would start by developing the object model, user interface and some functionality as usual3. It could even be quite sensible to use another persistence mechanism, e.g. the persistent store, in the beginning of development, where there is a high likelihood of frequent changes to the object model. When the object model is relatively stable, the framework can be brought into use in the following way:

The developer should start by exchanging the concrete association pattern used, e.g. list-relations, for the one that is part of the framework, rdb-relations. This should only be a minor change to the object model, and it should not lead to other changes in the application, as list-relations and rdb-relations have the same abstract interface.

Having changed the object model, the developer should, for each class in the object model, create a specialization of rdbClass, with the relevant virtuals bound as discussed earlier.

The developer should then create a specialization of the class rdbInterface, in which a static instance of each rdbClass specialization is made, and in which these are initialized and finalized in the init and close methods.

Finally, at the relevant points in the program, calls to the create, save, etc. operations should be made inside transactions.
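The middle two steps can be sketched schematically in Python (not BETA; all names here are hypothetical): one mapper specialization per model class, collected and initialised by an interface object.

```python
# Sketch: an abstract mapper specialised per model class, plus an
# interface object holding and initialising one instance of each.

class AbstractMapper:
    table_name = None  # each specialization must bind this

    def init(self):
        assert self.table_name, "specialization must set table_name"
        # Stand-in for preparing the SQL statements in the private part.
        self.sql = "SELECT * FROM %s" % self.table_name

class RecipeMapper(AbstractMapper):
    table_name = "Recipe"

class FoodMapper(AbstractMapper):
    table_name = "Food"

class RdbInterface:
    def __init__(self):
        self.recipes = RecipeMapper()
        self.foods = FoodMapper()

    def init(self):
        # Initialise every mapper instance, as rdbInterface's init does.
        self.recipes.init()
        self.foods.init()

itf = RdbInterface()
itf.init()
assert itf.recipes.sql == "SELECT * FROM Recipe"
```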

We will now describe each of these steps for our sample application: a small program whose purpose is to hold a database of recipes. The program offers the user the possibility of adding, deleting and changing recipes. Furthermore, the program holds a database of food types with nutritional information, so that the complete nutritional information for a recipe can be calculated automatically.

Object model

The object model contains the following classes:

Food. The Food class models a type of food. It has the attributes name, measure (describing the unit in which the food is measured), and nutritional information in the attributes fat, protein and carbohydrate.

Ingredient. The Ingredient class models an ingredient that is part of a recipe. It has the attribute amount and aggregates a Food object.

Recipe. Models a recipe. It has the attributes name and directions, and aggregates a number of ingredients.
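For illustration, the three classes above can be sketched as Python dataclasses (the report's actual model is in BETA; the attribute types are assumptions based on the textual descriptions):

```python
# Sketch of the sample application's object model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Food:
    name: str
    measure: str           # the unit the food is measured in
    fat: int
    protein: int
    carbohydrate: int

@dataclass
class Ingredient:
    amount: int
    the_food: Food         # aggregation: an ingredient refers to a food

@dataclass
class Recipe:
    name: str
    directions: str
    ingredients: List[Ingredient] = field(default_factory=list)

soup = Recipe("Soup", "Boil everything.",
              [Ingredient(2, Food("Carrot", "pcs", 0, 1, 7))])
assert soup.ingredients[0].the_food.name == "Carrot"
```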

The UML class diagram for the object model can be seen in Figure 2, and the corresponding code in appendix I.

3 It is assumed that the CASE tool creating the object model has been changed to generate associations using the association patterns presented in the previous section. If not, the user would have to change these manually.


Figure 2. UML class diagram

User interface

The user interface of the application can be seen in appendix J. For this user interface a programming interface has been made. This interface has three main components: a number of virtuals that are called when e.g. a button is pressed in the interface, a number of set methods that set the user interface contents from given parameters, and a number of get methods that read the current contents of the user interface. Furthermore, a few more methods are provided for those GUI controls that represent many objects, such as list views. The full UI programming interface can be seen in appendix K.

Rdb interface

Having created the object model and user interface, the rdbClass specializations could be created, as seen in appendix L, and the database tables could be created using the SQL calls shown in appendix M. These rdbClass specializations contain the bindings of the virtuals representing the code that the user of the framework must supply.

The classes are present in two versions: one for the Access database and one for the DB2 database. These two versions had to be made because unique primary key handling differs between the two databases: in Access it is handled automatically using an auto-increment field, whereas in DB2 it is simulated using triggers. There is only a minor difference between the specializations though: the classes for DB2 also bind getKeyBeforeInsert to insert a null value, which the auto-increment triggers in DB2 can locate.

Finally the rdbInterface specialization creating the instances of the rdbClasses can be seen in appendix N.

Application code

Finally, the application code performing the functionality could be written. As can be seen in appendix O, this code is quite simple, as it can use the high-level operations offered by the rdb interface.


(Text spilled from the Figure 2 class diagram: Recipe with the attributes Name, Directions and the Ingredients relation; Ingredient with Amount and theFood; Food with Name, Fat, Protein, Carbohydrate and Measure.)


Conclusion

All the important aspects of the Rdbmap framework have now been introduced. It is my experience that with such a framework it becomes, if not easy, then at least not very difficult to handle the persistence of one's objects in a relational database. Furthermore, it is my experience that the amount of code the application programmer needs to supply to the framework is small and easy to write.


References

(Hansen and Wells, 1997). Klaus M. Hansen and Lisa Marie Wells. On Beta and Relational Databases. DAIMI student report, 1997.

(Gamma et al., 1995). Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design Patterns. Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

(Thomsen, 1998). Michael Thomsen. Persistent Storage of OO-models in relational Databases. COT report no. COT/4-02-V1.5.


List of appendixes

Appendix A. BDBC interface

Appendix B. rdbClass and rdbRelationship interface

Appendix C. rdbClass and rdbRelationship implementation

Appendix D. relations interface

Appendix E. list-relations interface and implementation

Appendix F. rdb-relations interface and implementation

Appendix G. Test program

Appendix H. Test program results

Appendix I. Example program object model

Appendix J. Example program user interface

Appendix K. Example program UI programming interface

Appendix L. Example program rdbClass specializations

Appendix M. Example program SQL table create calls

Appendix N. Example program rdbInterface specialization

Appendix O. Example program main application code
