Analysis of persistence frameworks in J2EE Analys av beständighetsramverk i J2EE Master’s Thesis Tim Lindgren Master’s Thesis in Computer Science (20 credits) At the School of Computer Science and Engineering, Royal Institute of Technology, 4 June 2007. Commissioned by Jadestone Group AB Supervisor at Jadestone: Tjdolf Sommestad Supervisor at Nada: Kjell Lindquist Examiner: Lasse Kjelldahl


Analysis of Persistence Frameworks in J2EE

TIM LINDGREN

Master’s Thesis in Computer Science (20 credits) at the School of Computer Science and Engineering Royal Institute of Technology year 2007 Supervisor at CSC was Kjell Lindqvist Examiner was Lars Kjelldahl TRITA-CSC-E 2007:069 ISRN-KTH/CSC/E--07/069--SE ISSN-1653-5715 Royal Institute of Technology School of Computer Science and Communication KTH CSC SE-100 44 Stockholm, Sweden URL: www.csc.kth.se

Abstract

Analysis of persistence frameworks in J2EE

The amount of information stored in computer systems is increasing at a rapid rate and the software handling the persistent data of such systems is growing correspondingly more complex and hard to maintain. Persistence frameworks manage the persistence by performing automated conversion between the database systems and the business code using the persistent data. These frameworks have reached wide acceptance. This thesis examines two popular persistence frameworks for the Java Platform, Enterprise JavaBeans and Hibernate. As a means to compare and evaluate the frameworks, two versions of a forum software module are developed, one for each of the persistence frameworks.

Sammanfattning

Analys av beständighetsramverk i J2EE

Mängden information sparad i datorsystem ökar med en snabb takt och mjukvaran som hanterar beständig data i sådana system blir i motsvarande grad mer komplex och svår att underhålla. Beständighetsramverk utför automatisk omvandling mellan databassystemen och affärskoden som använder sig av det beständiga datat. Dessa ramverk har blivit vida accepterade. Denna rapport undersöker två populära beständighetsramverk för Java-plattformen, Enterprise JavaBeans och Hibernate. För att kunna jämföra och utvärdera dessa ramverk så har två versioner av en forum-mjukvara utvecklats, en för vardera beständighetsramverk.

Contents

1 Introduction
1.1 Objectives
1.2 Scope
1.3 Overview of the thesis
2 Background
2.1 Persistence
2.1.1 Database management systems
2.2 Persistence terms
2.2.1 Automated persistence
2.2.2 Transparent persistence
2.2.3 Transitive persistence
2.3 Persistence in Java
2.3.1 Historic evolution of Java persistence
2.3.2 Overview of the persistence schemes
2.3.3 The future of Java – Enterprise JavaBeans 3.0
2.4 The object-oriented paradigm
2.4.1 Object-oriented programming
2.4.2 Concept of domain model
2.4.3 POJO
2.5 The Object/Relational impedance mismatch
2.5.1 Cultural aspects
2.5.2 Technical aspects
2.5.3 Object-relational mapping
2.6 Persistence solutions
2.6.1 Enterprise Java Beans
2.6.2 Hibernate
2.6.3 Enterprise JavaBeans and Hibernate together
3 The model
3.1 Definition of a forum
3.2 Requirements for the forum software module
3.2.1 Technical requirements
3.2.2 Forum specific functionality
3.3 Evaluation criteria
3.3.1 Simplicity
3.3.2 Minimal intrusion
3.3.3 Transparency
3.3.4 Consistent API
3.3.5 Transaction support
3.3.6 Support for managed as well as unmanaged environments
3.3.7 Support for necessary extras
3.3.8 Licensing fees
3.3.9 Easiness to deploy software
3.3.10 How easy it is to perform unit tests
3.3.11 Build time of a project
3.3.12 Code size
3.3.13 Performance, scalability and stability
4 Implementation
4.1 External software components
4.1.1 JBoss
4.1.2 Ant
4.1.2 JUnit
4.1.3 Log4J
4.1.4 XDoclet
4.1.5 Hibernate Annotations
4.1.6 MySQL
4.1.7 The Grinder
4.2 Design patterns
4.2.1 Design patterns used for the EJB forum
4.2.2 Design patterns used for the Hibernate forum
4.3 The Enterprise JavaBeans solution
4.3.1 Architecture
4.3.2 The Enterprise JavaBeans persistence framework
4.3.3 Building and deploying
4.3.4 The forum source code
4.4 The Hibernate solution
4.4.1 Architecture
4.4.2 The Hibernate persistence framework
4.4.3 Building and deploying
4.4.4 The forum source code
5 Evaluation
5.1 Simplicity
5.2 Minimal intrusion
5.3 Transparency
5.4 Consistent API
5.5 Transaction support
5.6 Support for managed as well as unmanaged environments
5.7 Support for necessary extras
5.8 Licensing fees
5.9 Easiness to deploy software
5.10 How easy it is to perform unit tests
5.11 Build time of a project
5.12 Code size
5.13 Performance, scalability and stability
6 Conclusion
References


1 Introduction

J2EE, the enterprise edition of the Java platform, is a popular framework for creating multi-tier enterprise applications. It adds powerful functionality to the standard edition, including a framework for handling persistence in the data layer through so-called entity beans. However, entity beans have been criticized because their use has many disadvantages. This thesis examines persistence using enterprise beans and using Hibernate, a popular and widespread open source Object/Relational Mapping tool. The thesis is commissioned by the game software development company Jadestone Group AB.

1.1 Objectives

The goal of this thesis is to compare two common persistence frameworks in Java. The coding, building, deploying and running of two forum software module implementations, one for each persistence framework, will form the foundation of the comparison.

1.2 Scope

The persistence frameworks offer core functionality, and there is third-party software that extends this functionality and offers new solutions. In this thesis, only a basic setup in terms of software components is chosen for each persistence framework. Other combinations of components fall outside the scope of the thesis.

1.3 Overview of the thesis

Chapter 2 provides background information about persistence and the evolution of systems and software managing persistence. It gives a brief overview of database systems and the historical development of tools managing persistence in Java. The chapter also describes object orientation and explains the object-relational impedance mismatch. It then briefly explains the concept of object-relational mapping and provides information about two persistence frameworks using object-relational mapping, Enterprise JavaBeans and Hibernate.

In chapter 3 the forum software module is explained and the technical and forum-specific requirements are given. Also, the evaluation criteria used for the comparison of the persistence frameworks are described.

Chapter 4 contains information about the implementation of the forum software modules using both the Enterprise JavaBeans and Hibernate persistence frameworks. It also describes the external software components and the design patterns used.

In chapter 5 the persistence frameworks are compared with respect to the set of criteria explained in chapter 3. Chapter 6 provides a summary of the work.


2 Background

The technical development in the field of computers has been immense in recent years and the software running on these systems has increased in size and complexity at a corresponding rate. Databases have grown to handle large amounts of information and the business software systems using that information in various ways are complex entities. A few years ago, the coupling between the database and the business layer was explicit; in other words, the programmer wrote code that operated directly on the database, which often led to fragile systems. In order to improve the software handling the storage of information in a backend, solutions were developed that provide an alternative to the explicit coupling between database and business layer. One such solution consisted of persistence frameworks, software that performs automatic conversion between the database system and the business code. The first persistence framework to become widely accepted on the Java scene was Enterprise JavaBeans, the persistence part of the Java Enterprise Edition platform. However, the use of Enterprise JavaBeans has been criticized and other persistence frameworks have been developed. Among many others, Hibernate has risen to become a popular candidate for persistence solutions.

This chapter provides background information about persistence along with related topics that are relevant for understanding the problem domain. The aim is to give some historical facts as well as information about the modern technologies in use, so that the reader gets a broader perspective of the field that this thesis covers. Many of the concepts mentioned in this chapter may be familiar to the reader, in which case they may be skipped during the initial reading and later be used for reference if the need arises.

2.1 Persistence

Persistence is the process of writing transient information, transient here meaning something residing in computer memory, to an external data source, and it is one of the fundamental concepts in application development today. It allows information to be saved to non-volatile storage and later be retrieved and used. Without this capability, the information stored in memory would be lost when the program responsible for it is closed. In other words, persistence allows data to outlive the process that created it. There exist many different mechanisms for handling persistent data, such as databases and file systems. In this thesis, only database systems are of interest, and in the following chapter a brief introduction to database management systems is given.

2.1.1 Database management systems

The world we know today would not be possible without the widespread use of computers and, with them, the storage of an immense amount of information. This ranges from information about products in a store to the booking system of an airline handling thousands of passengers on a daily basis. In order to satisfy the requirements of these businesses, organizations often rely on advanced database management systems (DBMS). Originally highly specialized pieces of software, these systems have become generalized as the performance of personal computers has increased, and a few market-leading systems now dominate.


A database management system offers several vital advantages compared to using the plain file system for storage.

Storage, organization and retrieval of structured data

Naturally, a system must be able to store data and let the user retrieve what has previously been stored. Since the amount of data in the database may be vast, these primitive operations need to be very fast as well, and hence the system organizes the data in a way that optimizes accessibility. Further information about storage, organization and retrieval can be found under the chapters for relational and object database management systems respectively.

Data-level security

Data-level security prevents unauthorized persons from viewing the content of the database or updating the stored information. Authentication involving passwords grants users access to the entire database or subsets of it.

Concurrency control

In the systems of today, many users may access a data source at the same time. Concurrency control deals with this issue, allowing a resource to be shared safely while maintaining the integrity of the database.

Referential integrity

Referential integrity is an important concept when dealing with relational databases and “is an integrity constraint specifying that the value (or existence) of an attribute in one relation depends on the value (or existence) of an attribute in the same or another relation.” [1]

Transaction management

A transaction is a collection of actions on a database, for example storing, retrieving and deleting data, which form a unit of work. A transaction follows an “all-or-nothing” approach: either all actions in the unit of work succeed and the changes are committed to the database, or the transaction is rolled back (cancelled). Transactions may be short-lived, completing within a thousandth of a second, or long-lived, taking hours, days, weeks, or even months to complete.

A transaction must exhibit four characteristics for a system to be considered safe; it is then called an ACID transaction. ACID is an acronym for the properties of a safe transaction: atomic, consistent, isolated and durable. The terms mean the following:

• Atomic An atomic transaction must execute in its entirety or not at all. In other words, all the tasks within the unit of work must complete without error. If any of the tasks fails, the transaction is aborted and any changes made to the data are undone. If every task succeeds in the execution, then the transaction is committed, meaning that the changes to the data are made permanent.

• Consistent Consistency refers to the integrity of the underlying data store and is a shared responsibility of the transactional system and the application developer. The transactional system must ensure that a transaction is atomic, isolated and durable; the application developer is responsible for ensuring that the business logic in the unit of work does not result in inconsistent data.

• Isolated A transaction must be allowed to execute without interference from other transactions, which means that the data a transaction accesses during execution cannot be affected by any other part of the system until the transaction has completed.

• Durable Durable means that the changes to data during the execution of a transaction must be written to a physical storage before the transaction is considered to be complete.
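The atomic property can be illustrated with a small sketch. It is not a DBMS and says nothing about isolation or durability; it only shows the all-or-nothing behaviour of a unit of work, under the assumption that the changes are applied to a private copy that is published only on success. The names AccountStore, inTransaction and transfer are hypothetical and chosen for illustration. In JDBC, the corresponding roles are played by Connection.setAutoCommit(false), Connection.commit() and Connection.rollback().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory store, used only to illustrate commit and rollback.
class AccountStore {
    private Map<String, Integer> balances = new HashMap<>();

    void put(String account, int amount) {
        balances.put(account, amount);
    }

    int balance(String account) {
        return balances.get(account);
    }

    // Runs the unit of work against a private copy of the data.
    // The copy replaces the committed state only if every action succeeds.
    boolean inTransaction(Consumer<Map<String, Integer>> unitOfWork) {
        Map<String, Integer> workingCopy = new HashMap<>(balances);
        try {
            unitOfWork.accept(workingCopy);
            balances = workingCopy; // commit: all changes become visible at once
            return true;
        } catch (RuntimeException e) {
            return false;           // rollback: the working copy is discarded
        }
    }

    // A unit of work consisting of two actions that must succeed together.
    static void transfer(Map<String, Integer> data, String from, String to, int amount) {
        data.put(from, data.get(from) - amount);
        if (data.get(from) < 0) {
            throw new IllegalStateException("insufficient funds");
        }
        data.put(to, data.get(to) + amount);
    }
}
```

If the second transfer in a sequence fails halfway, the withdrawal it already made to the working copy never reaches the committed balances, which is the behaviour the Atomic bullet describes.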

There exist several conceptually different types of database management systems. In this thesis, only the relational DBMS will be considered, since that was one of the requirements for this study. There are, however, several other interesting types, and from a technical standpoint DBMSs can differ widely. The internal organization of information affects how the information is stored and later extracted.

Relational database management systems

Around 1970 Edgar F. Codd wrote a number of papers that outlined a new approach to database construction. These culminated in the seminal paper “A Relational Model of Data for Large Shared Data Banks” [2], which described a new system for storing information and working with large databases. In that paper, and others written at a later date, Codd defined the term relational, and one well-known definition of what constitutes a relational database management system is Codd’s 12 rules. The relational model for database management is a data model based on set theory and predicate logic. Codd also laid the foundation of a set-oriented language for manipulating, inserting and retrieving data, a foundation that later spawned the Structured Query Language (SQL), a language based on relational algebra and tuple relational calculus.

Object database management systems

Virtually every information system nowadays is created using an object-oriented programming environment. When relational database software is used, a conflict of technologies arises: while the information is being handled by the program it is held in an object-oriented form, but in a relational DBMS it is stored in a relational form. Software developers must therefore face two worlds. This semantic gap, also known as the impedance mismatch, is explained in more detail in chapter 2.5.

Object-oriented DBMSs try to solve this mismatch by storing persistent objects directly in the database, an approach that offers advantages but also has some inherent disadvantages. According to the article “Politics of Persistence” [3], OODBMSs are weaker at ad hoc queries, reporting and schema evolution. Object database management systems are mentioned here to give a more complete picture of backend storage in the persistence area, but they will not be used during the evaluation phase in later chapters.


2.2 Persistence terms

As described earlier in this chapter, persistence is a central concept in the field of computer science and, as is often the case, it has generated different techniques. This chapter examines the ones that are used throughout this thesis. Two more terms, specific to the Java Enterprise environment, will be introduced in the later chapter about Enterprise JavaBeans, namely container-managed persistence and bean-managed persistence.

2.2.1 Automated persistence

Automated persistence refers to the automatic generation of persistence code, i.e. code that reads and writes program information to and from a file. This takes care of the often tedious details of low-level access to persistent storage.

2.2.2 Transparent persistence

Persistence is transparent when the artifacts to be persisted, such as objects in an object-oriented context, are not aware of the underlying persistence mechanism. The artifacts should not be defined differently depending on how they are later stored. In other words, there is a separation of concerns between the persistent artifacts and the persistence logic: the persistent artifacts have no code-level dependencies on the persistence API. This gives a degree of portability, since the persistent artifacts are decoupled from any particular persistence solution, which can therefore be changed at a later date.
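A minimal sketch of this separation of concerns is given here. All names (ForumMessage, MessageStore, InMemoryMessageStore) are hypothetical and are not taken from the forum implementations of this thesis. The domain class contains no reference to any persistence API, and the storage mechanism sits behind an interface, so it could be replaced without touching the domain class.

```java
import java.util.ArrayList;
import java.util.List;

// Plain domain class: it neither imports nor refers to any persistence API.
class ForumMessage {
    private final String author;
    private final String text;

    ForumMessage(String author, String text) {
        this.author = author;
        this.text = text;
    }

    String getAuthor() { return author; }
    String getText() { return text; }
}

// Hypothetical persistence boundary. The domain class does not know it exists.
interface MessageStore {
    void save(ForumMessage message);
    List<ForumMessage> findByAuthor(String author);
}

// One possible implementation, kept in memory for the sake of the example.
// A database-backed implementation could replace it without changes to ForumMessage.
class InMemoryMessageStore implements MessageStore {
    private final List<ForumMessage> messages = new ArrayList<>();

    @Override
    public void save(ForumMessage message) {
        messages.add(message);
    }

    @Override
    public List<ForumMessage> findByAuthor(String author) {
        List<ForumMessage> result = new ArrayList<>();
        for (ForumMessage message : messages) {
            if (message.getAuthor().equals(author)) {
                result.add(message);
            }
        }
        return result;
    }
}
```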

2.2.3 Transitive persistence

Transitive persistence is a technique that allows the propagation of persistence to affected artifacts.

Persistence by reachability

The persistence mechanism is said to use persistence by reachability when it employs a recursive algorithm: all artifacts reachable from a persistent artifact themselves become persistent, either when the original artifact is made persistent or just before synchronization with the data store. Persistence by reachability requires one or more top-level artifacts from which all persistent artifacts can be reached. Figure 1 illustrates persistence by reachability.

Figure 1. Persistence by reachability
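The recursive algorithm behind persistence by reachability can be sketched as a traversal of the object graph. This is an illustration under simplifying assumptions, not the algorithm of any particular framework: Node and ReachabilityPersister are hypothetical names, and "persistent" here only means that the object has been registered in a set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical artifact with references to other artifacts.
class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class ReachabilityPersister {
    // Identity-based set, so that cycles in the object graph terminate.
    private final Set<Node> persistent =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // Makes the root persistent and, recursively, everything reachable from it.
    void makePersistent(Node root) {
        if (!persistent.add(root)) {
            return; // already persistent, nothing more to do
        }
        for (Node referenced : root.references) {
            makePersistent(referenced);
        }
    }

    boolean isPersistent(Node node) {
        return persistent.contains(node);
    }
}
```

An artifact that no persistent artifact refers to stays transient, which is why at least one top-level artifact is needed as a starting point.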


Persistence by cascading

The basic concept of persistence by cascading is the same as for persistence by reachability; artifact associations are used in order to determine the transitive state. However, persistence by cascading allows the specification of a so-called cascade style for each association mapping. These styles are then used by the persistence mechanism and applied when the need arises. Figure 2 illustrates persistence by cascading.

Figure 2. Persistence by cascading
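The difference from persistence by reachability can be sketched by attaching a cascade setting to each association. The sketch assumes a single, hypothetical cascade style (cascade on save, modelled as a boolean); real frameworks offer several styles, and the names Item, Link and CascadingPersister are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical artifact whose associations each carry their own cascade setting.
class Item {
    final String name;
    final List<Link> links = new ArrayList<>();

    Item(String name) {
        this.name = name;
    }

    void link(Item target, boolean cascadeSave) {
        links.add(new Link(target, cascadeSave));
    }
}

// One association, together with its (simplified) cascade style.
class Link {
    final Item target;
    final boolean cascadeSave;

    Link(Item target, boolean cascadeSave) {
        this.target = target;
        this.cascadeSave = cascadeSave;
    }
}

class CascadingPersister {
    private final Set<Item> persistent =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // Saves the item and follows only those associations marked for cascading.
    void save(Item item) {
        if (!persistent.add(item)) {
            return; // already persistent
        }
        for (Link link : item.links) {
            if (link.cascadeSave) {
                save(link.target);
            }
        }
    }

    boolean isPersistent(Item item) {
        return persistent.contains(item);
    }
}
```

With this arrangement, saving a thread could cascade to its posts while leaving an associated author untouched, a choice that plain reachability does not offer.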

2.3 Persistence in Java

The Java platform has, in its relatively few years of existence, become widely accepted in the academic world as well as in industry. The object-oriented programming language Java was originally intended for use in the consumer electronics market, but its good mixture of ideas from other programming languages and its large collection of reusable software components, applicable to a broad area of computer software, led to reliable and robust applications on a vast variety of platforms. Java embraced the “write once, run anywhere” philosophy, and this platform independence, together with notable mechanisms such as automated garbage collection and the concurrency system, made it the success it soon came to be.

However feature-rich Java was intended to be or has since become, it neither did nor does address the domain of persistence. Like many of the languages that inspired its design, the platform uses the traditional approach to input and output: external data is handled explicitly if needed. Most applications require the handling of ever-growing volumes of data, and the demand for a good persistence solution has seen a similar increase.

2.3.1 Historic evolution of Java persistence

There was no initial support for persistence built into Java; the only reference to it was the transient keyword in the Java Language Specification, which was reserved for future use. With evolving versions of the Java Platform, new persistence mechanisms entered the scene. They are briefly mentioned in chronological order in this chapter, and more detailed information can be found in the next chapter.

The Java Development Kit (JDK) 1.0, the first release of the Java platform, offered a persistence mechanism in the form of data streams that could be connected to the file system. The next major release of the Java platform, JDK 1.1, introduced two new mechanisms for persistence. Java Object Serialization (JOS) supported automated object serialization to and from a stream. The previously mentioned transient modifier found its way into the revised, second edition of the Java Language Specification. The other mechanism introduced was a standard way for Java to communicate with a relational database, the Java DataBase Connectivity API (JDBC). JDBC is easy to use for straightforward access to the database, but object-oriented functionality is hindered by the fact that the bridging between objects and relational data requires manual coding. New versions of the JDBC API provide features that ease the coding. In JDK 1.4, inherent problems with the serialization process led to the release of the so-called Long-term Persistence for JavaBeans, the ability to read and write beans as a textual representation of their property values.

Persistence mechanisms found their way into the Java Enterprise Edition (J2EE), the successful Java platform in the enterprise domain, as well. Enterprise JavaBeans (EJB) is a part of the enterprise edition and is an attempt to create a component model for application development with support for transparent persistence and transactions.

Parallel to these developments, a few other persistence mechanisms are worth mentioning. The problems with using straight JDBC can be reduced by automating the conversion between objects and an underlying database; tools that support this are called object-relational mapping tools. The Java Community Process introduced Java Data Objects (JDO), an object-relational mapping (ORM) tool that supports a wide range of persistent stores. JDO is a standard driven by the above-mentioned community and lets developers take full advantage of the object paradigm with minimal intrusion on normal Java code. Another such tool is Hibernate, which has gained a very large user base in its relatively short lifetime and will be described in more detail in chapter 2.6.2.
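The serialization mechanism and the transient modifier mentioned above can be sketched with the standard java.io classes. The class ForumUser and the helper SerializationDemo.roundTrip are hypothetical and exist only for this illustration. The byte array stands in for an external data source such as a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class marked for serialization through the Serializable interface.
class ForumUser implements Serializable {
    private static final long serialVersionUID = 1L;

    String name;
    // transient fields are skipped by serialization and come back with default values.
    transient String sessionToken;

    ForumUser(String name, String sessionToken) {
        this.name = name;
        this.sessionToken = sessionToken;
    }
}

class SerializationDemo {
    // Writes the object to a byte stream and reads a copy back from it.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }
}
```

A round trip of a ForumUser yields a distinct object with the same name, while the transient sessionToken is null in the copy.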

2.3.2 Overview of the persistence schemes

The previous chapter listed a few persistence solutions that have evolved as part of the Java language and in parallel to it. In this section, these schemes are described more thoroughly. [4]

Java Object Serialization

Java Object Serialization (JOS) is part of the standard Java platform and allows objects to be marked for serialization. This is achieved by implementing an interface with no methods. Serialization extracts the persistent data from the object to be serialized and writes it to an output stream, which can be attached to an external data source. The stored object can then be retrieved through an input stream. JOS saves the internal representation of an object, which means that the result is a correct copy of the original, but it introduces a dependency on the particular implementation of the class of the object. The dependency also extends to superclasses. In other words, a persisted object can only be retrieved when the classes upon which it depends are exactly the same as those in the environment