a model-driven transformation method · the key to mda lies in three factors: ... (pim) and...

A Model-Driven Transformation Method

Jana Koehler Rainer HauserIBM Zurich Research Laboratory

CH-8803 RueschlikonSwitzerland

email:�koe � rfh � @zurich.ibm.com

Shubir Kapoor Fred Y. Wu Santhosh KumaranIBM T.J. Watson Research Center

PO BOX 218 RTE 134Yorktown Heights, NY 10598, USA

email:�

shubirk � fywu � sbk � @us.ibm.com

Abstract

Model-driven architectures (MDA) separate the businessor application logic from the underlying platform technol-ogy and represent this logic with precise semantic models.These models are supposed to span the entire life cycle of asoftware system and ease the software production and main-tenance tasks. Consequently, tools will be needed that sup-port these tasks.

In this paper, we present a method that implementsmodel-driven transformations between particular platform-independent (business view) and platform-specific (IT ar-chitectural) models. On the business level, we focus on busi-ness view models expressed in ADF or UML2, whereas onthe IT architecture side we focus on service-oriented archi-tectures with Web service interfaces and processes specifiedin business process protocol languages such as BPEL4WS.

1 Introduction

Model-driven architectures (MDA) have been proposedby the OMG to reinforce the use of an enterprise architec-ture strategy and to enhance the efficiency of software de-velopment. The key to MDA lies in three factors:

� the adoption of precise and complete semantic modelsto specify the structure of a system and its behavior,

� the usage of data representation interchange standards,

� the separation of business or application logic from theunderlying platform technology.

Each of these key factors poses a challenge of its own,but the model representation seems to be the most criticalamong them. Models can be specified from different views,from the business analyst’s or the IT architect’s view, andthey can be represented at different levels of abstraction.

MDA distinguishes between platform-independent (PIM)and platform-specific (PSM) models, but we can expect thatmany different PIM and PSM models will be used to de-scribe the different aspects of a system. Representing amodel of a system (be it a PIM or a PSM) is only one sideof the coin. In order to be useful, a model must be furtheranalyzable by algorithmic methods. For example, the fun-damental properties of the behavior of a system should beverifiable in a model, and different models of the same sys-tem should be linked in a way that relevant changes in onemodel can be propagated to the other models to keep themodel set consistent.

In this paper, we develop a method to transform partic-ular platform-independent into particular platform-specificmodels. On the PIM side, we use business process mod-els that reflect a business analyst’s or consultant’s view ofa business process. Typical representatives of these modelsare the ARIS tool set by IDS Scheer[15], the ADF approachby Holosofx/IBM [4] or UML2 activity models [17]. Onthe PSM side, we distinguish between IT architectural mod-els and IT deployment models. The IT architectural modelsthat we generate are service-oriented and focus on a spe-cific programming (not hardware) platform using Web ser-vices [2] and workflows [8]. Therefore, we consider themto be PSMs. The runtime-platform specific information iscaptured in the IT deployment model, which further refinesthe architectural model, and, for example, provides the portand binding information for Web services or the physicaldeployment of a workflow on a particular workflow engine.We do not consider IT deployment models in this paper.

It is generally acknowledged that an IT architectureshould be aligned with the business processes and it shouldbe the business process model that determines the IT ar-chitecture and not the other way round. In this sense, onecan regard the PIM as the specification and the PSM as asolution satisfying the specification. Since model-driven ar-chitectures aim at spanning “the life cycle of a system frommodeling and design to component construction, assembly,integration, deployment, management and evolution” [9]

we can expect that

1. several different models will exist that describe a sys-tem from different points of view using different levelsof abstraction and

2. the models will evolve following the life cycle of a sys-tem.

Therefore, it seems necessary that algorithmic tech-niques be developed and tools be provided that support theevolution of models and the consistency of different model-ing views of the same system. In this paper, we present ini-tial results in this direction and discuss some of the methodsthat we developed for a prototype tool—called the transfor-mation engine (TE)—that addresses these tasks.

The paper is organized as follows: In Section 2, we dis-cuss the role of model-driven transformations and presentthe architecture of our transformation engine. Section 3 re-views different business process modeling approaches andextracts structured information flows as their underlyingcommon modeling principle. Section 4 summarizes the ITarchitectural models that we focus on and presents methodsto synthesize selected IT architectural solution components.In Section 5, we discuss the correctness of model-driventransformations. In Section 6, we review related work andconclude with a summary and an outlook on the challengeof dynamic model reconciliation in Section 7.

2 Role of Model-Driven Transformations

We have currently developed top-down and bottom-upmodel-driven transformations between particular PIM andPSM. The top-down transformation starts from the businessview model of a system or process as the PIM and derivesa platform-specific IT architecture model, the PSM. Thisensures that the business model determines the relevant el-ements of the IT architecture. It transforms the blueprintof a system as it is envisioned during a business analysisand re-engineering process into an IT architecture that willimplement that blueprint.

The bottom-up transformation takes a platform-specificIT architecture model as input and abstracts it into a busi-ness view model. It allows the propagation of structuralchanges in the IT architectural model back to the businessview model and thereby supports communication from ITpeople to business people.

Both transformations prolong the life cycle of the busi-ness view model. It is no longer only actively used in thedesign and analysis phase, but plays an active role in theentire system life cycle by constraining the IT architecture.The transformation engine has been designed to transformdifferent business view models automatically into IT ar-chitectural models and vice versa. At both sides, a vari-ety of approaches with particular strengths and weaknesses

compete with each other. Therefore, we introduced model-independent layers into the TE architecture. This allows usto handle a variety of PIMs and PSMs, see Figure 1.

Business View Model

Process Graph

Flow Graphs

Solution Components

Model-Specific Reader Model-Specific Writer

Subflow Extraction Subflow Merging

Platform-Specific Synthesis Platform-Specific Discovery

Figure 1. Architecture of the TransformationEngine

On the business view side, the TE can handle model-ing approaches that structure a business process into atomicprocess steps and subprocesses. Atomic process steps arenot further refined, whereas subprocesses can be nestedwithin each other and lead to a hierarchical process model.The process elements (subprocesses, process steps) arelinked by typed information entities that represent the flowof control and physical data objects or event messages be-tween the process elements. ADF [4], ARIS [15], UML2activity models [17], and Business Operational Specifica-tions (BOpS), a yet unpublished modeling approach by IBMResearch, are typical representatives of these modeling ap-proaches. They all support hierarchical process models thatdescribe processes as structured information flows acrossdistinguishable process elements.1

The transformation engine contains specific readers andwriters for each modeling approach. When performinga top-down transformation, the reader reads in a specificbusiness view model and maps it to a so-called processgraph, which we have developed as a generalization ofthe information-flow models characterized above. The pro-cess graph is then further analyzed to extract the informa-tion type-specific subflows and then decomposed into flowgraphs representing these subflows. The actual transforma-tion therefore takes place between the process graph on oneside and and the flow graphs on the other side. From theflow graphs, an IT architecture-specific synthesis methodgenerates solution components that are required in the de-

1The models we consider in this paper only comprise a single process-oriented view. Work on transformation methods for multi-view models (atypical example is the OMG Enterprise Collaboration Architecture) is inprogress.

SuccessfulTrip

TripDocument

FailedTrip

TripDocument

TripCancellation

Telephone Call

Cancel

CNC

CreateItinerary

Customer

FlightReservations

FlightRequests

HotelReservations

HotelRequests

Phone Order

Telephone Call

DecideStatus ConfirmItinerary

Failure

CancelEvent

NeededFlights

FlightRequests

ReplanItineraryFlightService

P

Cancel

CNC

NeededHotels

HotelRequests

HotelService

P

TripReservation

TripDocument

Cancel

Customer

CompletedTrip

TripDocument

Figure 2. ADF representation of the trip-handling process.

sired IT architectural model. In this paper, we discuss thegeneration of Adaptive Documents (ADocs) [10], which de-scribe persistent business documents whose life cycle iscontrolled by an associated state machine, and business pro-tocols between Web services (specified as BPEL4WS [3]processes). We do not discuss the generation of the Web ser-vice interfaces for the solution components, nor do we con-sider additional solution components such as screenflows,connectors, business objects, or others, and the “wiring” ofthese components with each other.

When performing bottom-up transformations, the TE isprovided with an IT architectural model comprising a setof solution components from which it synthesizes the flowgraphs that are merged into a single process graph. Thisgraph can then be mapped to a specific business view modelusing a model-specific writer. In this paper, we will limitour discussion to the top-down transformations, which wedeveloped.

3 Business Models based on Structured In-formation Flows

Let us introduce our generalization of business viewmodels to process graphs with the help of an example,namely the trip-handling process of a travel agency. Theagency receives trip requests from customers over thephone. The requests are recorded in trip documents, fromwhich several flight and hotel requests are derived by theagent. The flight and hotel requests are handled by spe-cialized services that receive these requests from the travelagency. The services report their success or failure in book-ing the requests back to the agency. If all requests weresuccessfully handled, the agency puts together the trip doc-uments for the customer and confirms the travel itinerary.If one of the services fails to make a requested reservation,the agency immediately informs the other service to put all

reservation requests on hold and contacts the customer torevise the trip itinerary.

Figure 2 shows an ADF model of this process. The cus-tomer is an external entity (depicted with a white oval) ini-tiating the trip-handling process. Three main elements areused to capture the structure of the process: (1) Grey octo-gons depict the elementary process steps, which are namedas CreateItinerary, ConfirmItinerary, and ReplanItinerary.(2) White boxes containing a “P” letter depict subprocessesand refer to the FlightService and HotelService. (3) Blackdiamonds show binary decision steps in the process.

Typed information entities flow between the various pro-cess steps. In the example model, we distinguish betweenfive different types of information entities named Tele-phoneCall, TripDocument, FlightRequests, HotelRequests,and CancelEvent. Each of them is depicted with a sepa-rate so-called Phi symbol. Below each Phi symbol, we findthe type and above the symbol, we find the Phi name (theparticular instance of this type).

A star-shaped goto symbol allows to model cyclic pro-cesses. In our example, the Phi Failure of type CancelEventis sent from the ReplanItinerary step back to the FlightSer-vice and HotelService subprocesses via the CNC goto sym-bol. Which of the services has to be notified is decided inthe decision step receiving the Phi.

The UML2 activity diagram in Figure 3 shows anothermodel that we created for our example process. It modelsthe customer and the trip-handling process as two activi-ties, which are linked via flows of the order event (OE),trip request (TR), and booking failure (BF) token. The trip-handling activity is further structured into actions that wealready distinguished as process steps in the ADF model.The Flight- and HotelService activities are contained withinan interruptible region. They are linked to the other actionsvia the flow of flight request (FR) and hotel request (HR)tokens and may be interrupted by a booking failure token.

A merge node combining TR, FR and HR tokens and gen-erating a new TR token and a decision node, which decidesabout the failure or success of a trip reservation, are alsocontained in the model.

Figure 3. UML2 representation of the trip-handling process.

The two modeling approaches have several features incommon. First, they structure a process into different pro-cess elements:

� subprocesses, which can optionally be further refinedand which are either internal or external given the cur-rent view of the process,

� atomic process steps that actively produce, or receiveand modify one or more information entities, and

� control steps that inspect and merge information enti-ties and represent the decision logic of a process de-pending on the information content.

Second, both models capture the flow of information en-tities, which can be object documents of a particular type,events, or any other messages. Arbitrary types of informa-tion entities can be defined by a modeler and then used toannotate links between the process elements. Table 4 sum-marizes the specific elements from each modeling approachand how they fit into the categorization as subprocesses,process steps, control steps, and information entities.

Neither of the two graphical representations is reallyunambiguous in the sense that a human (or a program)could derive insights into the dynamic behavior of the pro-cess merely by looking at the graphical (structural) mod-els. Without the informal process description given above,different readers would probably come up with quite dif-ferent interpretations of the models regarding details of thedynamic behavior of the process. Consequently, both ap-proaches provide many ways of annotating process modelswith additional information. For example, UML2 allows a

ADF UML2

subprocess process activityexternal taskexternal entity

process step task actioncontrol step decision decision, merge

choice fork, joininformation entity Phi tokenflow connector link

stop initial markergoto final marker

Figure 4. Categorization of modeling ele-ments in ADF and UML2.

modeler to describe the pre- and postconditions of actionsto further illustrate possible process behaviors, but leavesthe choice of language to formulate these conditions to thehuman modeler.

These (and other) approaches to business process model-ing have been developed with the goal to support a group ofdesigners and analysts, who need to achieve a common un-derstanding of the process behavior. Many details of theprocess behavior are thus either not visible in the modelrepresentation or contained in natural language descriptionsthat are inaccessible to an automatic analysis tool. Evenif dynamic behavior is sketched in the model, for exam-ple the interruptible region that we identified in the UML2model, our example models are not detailed enough to en-able a tool or a user to derive an unambiguous interpretationof the process behavior. The only unambiguous informationthat the models provide is the specification of the static flowof typed information entities between structural process el-ements. In this paper, we investigate what a transformationmethod can accomplish when limited to consider only suchstatic information flows, because these can be automaticallyextracted from a model without facing a high risk of misin-terpreting the business view model. We developed an inter-mediate representation called a process graph that allows usto

� map the static information flows represented differ-ently in different modeling approaches to the same un-ambigous representation

� assign a formal, state-based operational semantics tothe process graph representation that is exploited bythe transformation methods.

3.1 Process Graphs

A process graph �� is a formal model torepresent structured information flows.

Definition 1 Let � � �� be a directed graph withvertices � , and edges , and let � be a not further specifiedtype system.

� � � � �� is the set of vertices with � �being the process steps, �� the control steps, �� theabstract external processes, and �� the abstract inter-nal processes.

� � � � � �� , with � � �� ,are the typed and directed edges of the graph with �being the object edges (carrying objects of type �� )and �� the event edges (carrying events of Type �� ).

The set of type names denotes types of distinguished in-formation entities that are input and output of the verticesin the graph. We only consider two disjoint sets of typesto denote events and objects as the major two groups of in-formation entities. Both type sets can be further refined byarbitrary model-specific subtypes.

Definition 2 A vertex �� has a signature � �� of type names �� from a given type system � . We distinguishin � the input signature �� ! "�$# �&%&%&% ��$'�( and the outputsignature �� ) "�+* �&%&%&% ��-,.( .

Definition 3 An object constant is a name to denote an ob-ject that is described with a (possibly empty) set of proper-ties specified in some language L. An assignment of valuesto these properties is the state of the object.An event constant is a name to denote a not further de-scribed, stateless object. Each object or event constant hasa specific object or event type.

Process steps as well as abstract internal and externalsubprocesses can change the state of one or more objects,whereas control steps cannot. Internal and external pro-cesses are only distinguished from process steps to repre-sent the process boundaries that are often identified in busi-ness view models. Otherwise, their semantics is the same,i.e., they can change the state of objects.

Definition 4 A typed directed edge / � �0�1� ��"2 ��+* � fromvertex �3� to vertex �"2 can be added to the graph if �4*5�� 0�3� ��67�8� �0�"2 � , i.e., the type must exist in the output sig-nature of the vertex where the edge starts and in the inputsignature of the vertex where it ends. Note that 9;:�5< mustnot necessarily hold.

Edges establish a typed link from one vertex to anotherto show the transfer of information entities between pro-cess elements. The type of the information entity does notchange during the transfer. The edges define paths acrossthe vertices of the graph and induce a partial ordering of thevertex elements.

Definition 5 A vertex �1� is ordered before a vertex �=2 (writ-ten �3��>!�"2 , 9?:�)< ) if there is a path from �1� to �"2 , but nopath from �"2 to �3� .

There are two cases where vertices can remain unorderedwith each other: either there is no path between them at allor they lie on a cycle in the graph. Neither of these cases isa problem for our approach.

3.2 Building and Analyzing Process Graphs

A process graph only depicts the structure of the pro-cess and the principal information flow between process el-ements, i.e., it is a very abstract process model that doesnot allow the study of the dynamic behavior of the process.Figure 5 shows the process graph that we generate from theUML2 and ADF example models. The symbols TR, OE,HR, FR, BF are the token names from the UML2 modelthat we reused to name the information entities in the pro-cess graph.

process step

internal process

external process

control step

event flow

object flow

Customer Eval1Create

Itinerary

Replan

Itinerary

Confirm

Itinerary

Hotel

Service

Flight

ServiceEval2

OE

TR

HR

FR

BF

TR

TRTR

FR

HR

BF

BF

BF

Figure 5. Generalization of business viewmodels in the process graph.

For each modeling approach, a specific reader analyzesthe model description and extracts model elements that canbe mapped to the process graph elements following the ba-sic mapping principles shown in Figure 4. We use a repos-itory symbol to depict control steps in order to emphasizethat control steps inspect the state of objects, but do notchange them (in contrast to the other process elements).The interruptible region in the UML2 activity model andthe lightning arrow from the ReplanItinerary process stepinto this region are mapped to a new subgraph containing anevent link from this process step to a new control step, fromwhich event links lead to all process steps or subprocessesin the region. If a model establishes a link between twoprocess steps without assigning to it any information entity,which is usually known as a control link, then we generate

a new event constant and map the control link to an eventlink in the process graph. This guarantees that the orderingbetween two activities that are linked with the control linkis preserved in the process graph via the event link.

Note that a process graph is flat, i.e., it has no hierarchi-cal structure. If a business view model contains a hierarchi-cal process model where a subprocess at a higher level isrefined with a separate process model at a lower level, thenthe subprocess vertex in the process graph can be replacedby the process graph obtained from the lower-level processmodel if desired. It is up to the modeler to decide whethersubprocesses should be kept as abstract process steps or re-fined into their subprocesses. The replacement only suc-ceeds if the signature of the subprocess vertex is identicalto the signature of the subprocess graph, which is the decla-ration of incoming and outgoing information entities for thesubprocess. The model-specific readers follow very closelythe informal semantics as it is described in the modelingguidelines of the various approaches, but obviously we can-not establish a formal equivalence guarantee of the processgraph with the original model.

After the process graph has been constructed, it becomesinput to the transformation methods, which extract varioussubgraphs from it:

� The subgraphs specifying the flow of one object typebetween process steps and internal subprocesses showsthe process-internal workflow of this particular objecttype.

� The subgraphs containing an external subprocess ver-tex and all process-step vertices that have direct edgesto this external subprocess vertex show the interface ofa process to its external partners. The signature of theexternal subprocess vertex indicates the message typesthat are exchanged between the partners. Similarly, theinterface of a process to its internal partners can bedetermined by extracting the corresponding subgraphsfor the internal subprocess vertices.

� Flows of one object type that spawn more than oneinternal process step indicate stateful business objects.

Figure 6 shows the individual subflows for the three ob-ject types (trip request, hotel request, flight request) that wediscovered in the trip handling example.

Figure 7 shows the directed information flows betweenprocess steps (white rectangle or diamond) and external orinternal subprocesses (grey hexagon).

Even with very little information only sketching thestatic information flow, we can still derive useful IT archi-tectural components. In the following, we will demonstratehow the subflow graphs will be mapped to selected solu-tion components of an IT architecture by platform-specificsynthesis methods.

Eval1Create

Itinerary

Confirm

Itinerary

Replan

Itinerary

Eval1Create

ItineraryHotel

Service

Flight

Service

Create

Itinerary Eval1

Trip Request

Flight Request

Hotel Request

Figure 6. Three detected object types andtheir processing flows.

Create

Itinerary

Confirm

Itinerary

Replan

Itinerary

Customer

Hotel

Service

Flight

Service

Evaluate

Evaluate

FR

HR

BF

TR

HR

TR

TR

TR

FR

TR

BF

BF

OE

Figure 7. Information flow dependencies be-tween the steps of the trip-handling processand its partners.

4 IT Architectural Models and the Auto-matic Synthesis of Solution Components

Our target IT architecture is based on Web services—self-contained, modular units of application logic whichprovide business functionality to other applications via anInternet connection. In a nutshell, a Web service is specifiedby defining messages that provide an abstract definition ofthe data being transmitted and operations that a Web serviceprovides to transmit the messages. Operations are groupedinto port types, which describe abstract end points of a Webservice such as a logical address under which an operationcan be invoked. Concrete protocol bindings and physicaladdress port specifications complete a Web service specifi-cation.

Web services support the interaction of business part-ners and their processes by providing a stateless modelof “atomic” synchronous or asynchronous message ex-changes. These “atomic” message exchanges can be com-posed into longer business interactions by providing mes-sage exchange protocols that show the mutually visiblemessage exchange behavior of each of the partners in-volved. An example of such a protocol specification lan-guage is BPEL4WS [3]. In the following, we show how thebasic elements of a BPEL4WS protocol can be derived fromthe process graph.

4.1 Synthesis of BPEL4WS Protocols

The starting point for the synthesis of a BPEL4WS pro-tocol is the information contained in Figure 7. It shows thetrip-handling process with its individual process steps (allwhite-colored vertices) and its three communicating part-ners: (1) the customer who is an external partner to thetravel agent (top grey hexagon), (2) the flight service whichis an internal partner (right grey hexagon), i.e., a subprocessthat we have not yet further specified, and finally (3) the ho-tel service which is also an internal partner dealing with thehotel request processing (left grey hexagon). This informa-tion about partners is mapped to the partner declaration inBPEL4WS. To make the example more interesting, we as-sume that the interaction with all partners proceeds via aWeb service interface.

<process name ="TripHandling"><partners>

<partner name ="Customer"myRole ="TripHandlingAgent"

serviceLinkType ="CustomerServiceLink"partnerRole ="CustomerAgent"/>

<partner name ="FlightService"myRole ="TripHandlingAgent"

serviceLinkType ="FlightServiceLink"partnerRole ="FlightServiceAgent"/>

<partner name ="HotelService"myRole ="tripHandlingAgent"

serviceLinkType ="HotelServiceLink"partnerRole ="HotelServiceAgent"/>

</partners>...

</process>

The name of the process is derived from the name of thebusiness view model. The names of the partners are me-chanically derived from the external and internal subpro-cess names. The partner role names are composed of thepartner names with the suffix Agent added, whereas thevalue of the attribute myRole is again the model name plusthe Agent suffix. The names for the service link types arealso mechanically derived from the partner names.

For each information entity that we discovered in thebusiness view model and that occurs in the signature of atleast one process step in the trip-handling process, a mes-sage name and corresponding container name is defined.

<containers><container name ="OrderEvent"

messageType ="OrderEventType"/><container name ="TripRequest"

messageType ="TripRequestType"/><container name ="FlightRequests"

messageType ="FlightRequestType"/><container name ="HotelRequests"

messageType ="HotelRequestType"/><container name ="BookingFailure"

messageType ="BookingFailureType"/></containers>

The most interesting task, but also the greatest challenge,is to derive the control flow of the message exchange be-tween the process and its partners. The directed edges be-tween the vertices in the graph as shown in Figure 7 pro-vide the required information. Directed edges between twointernal process steps provide ordering information, i.e., inthe example CreateItinerary comes before Evaluate, Evalu-ate comes before ReplanItinerary and ConfirmItinerary, butthe latter two are unordered with each other. All steps pro-ceed before the second Evaluate. This information yieldsthe control structures for the BPEL4WS specification. Stepsthat are ordered with each other are put within a <se-quence> activity, while unordered steps are put within a<flow> activity. After each control step, a <switch>activity is introduced.

<sequence>createItineraryEvaluate<switch>ConfirmItinerary<otherwise>

<sequence>ReplanItineraryEvaluate<switch>HotelService<otherwise>

FlightService</otherwise>

</switch></sequence>

</otherwise></switch>

</sequence>

In a second step, the mnemonic step names have to bereplaced with the message interactions a step has with apartner. Each directed link that links a step vertex with apartner vertex is mapped to a basic BPEL4WS activity. Adirected edge from a partner vertex to a step vertex is inter-preted as a message flow from a partner to the process. Wemap it therefore to a <receive> activity. A directed edge

into the other direction, i.e., from a step vertex to a partnervertex is mapped to an <invoke> activity.

The process starts with the CreateItinerary step receiv-ing an order event message from the customer. By default,we always instantiate a new process instance. After themessage has been received, the trip request object is cre-ated, and hotel and flight request messages can be sent inany order to the two partner services. Consequently, both<invoke> activities are embedded into a <flow> activ-ity. Port type names are simple default strings and operationnames are derived from the (abbreviated) names of the in-teracting process elements. For example, Customer to Cre-ate Itinerary is abbreviated with CToCI, Create Itinerary toFlight Service is abbreviated with CIToFS, the control stepsare abbreviated with EVAL.

<sequence><receive partner="Customer"

portType ="pt1"operation ="CToCI"

createInstance ="yes"container ="OrderEvent">

</receive>

<flow><invoke partner ="HotelService"

portType ="pt2"operation ="CIToHS"

inputContainer ="HotelRequests"></invoke>

<invoke partner ="FlightService"portType ="pt3"

operation ="CIToFS"inputContainer ="FlightRequests">

</invoke></flow>

After the partner services have been invoked, the processwaits for the services to send the results of their bookingoperations, which again can arrive in any order.

<flow><receive partner ="HotelService"

portType ="pt4"operation ="HSToEVAL1"container ="HotelRequests">

</receive>

<receive partner ="FlightService"portType ="pt5"

operation ="FSToEVAL1"container ="FlightRequests">

</receive></flow>

After the answers have been received, the process hasto branch depending on whether the services were able tobook the requests or not. This introduces a first <switch>construct in the process. No condition for the switching canbe derived from the process model and therefore a default

placeholder is inserted. In the first branch (ConfirmItineraryto Customer), the process needs to inform the customerabout the successful completion of his reservation and sendsthe completed trip request documents. In the second branch(ReplanItinerary to Customer, Evaluate to HotelService orFlightService), it needs to inform the customer about thebooking failure and then decide which of the services has tobe informed about the failure of the other partner, i.e., an-other <switch> construct is nested that decides which ofthe services is informed.

<switch><case condition ="condition1"><invoke partner ="Customer"

portType ="pt6"operation ="ConIToC"

inputContainer ="TripRequest"></invoke>

</case><otherwise><sequence>

<invoke partner ="Customer"portType ="pt7"

operation ="RIToC"inputContainer ="BookingFailure">

</invoke><switch>

<case condition="condition2"><invoke partner ="HotelService"

portType ="pt8"operation ="EVAL2ToHS"

inputContainer ="BookingFailure"></invoke>

</case><otherwise><invoke partner ="FlightService"

portType ="pt9"operation ="EVAL2ToFS"

inputContainer ="BookingFailure"></invoke>

</otherwise></switch>

<sequence></otherwise>

</switch></sequence>

By default, we assume that all message exchanges areasynchronous. The BPEL4WS specification becomes rathercomplicated due to the two control steps contained withinthe process. Obviously, these could be merged into a single<switch> construct with three different cases by applyingBPEL4WS optimization techniques similar to the ones pre-sented in [14]. The transformation engine could then prop-agate such simplifications back to the business view modelwhere it could be discussed with the process designers.

An alternative transformation method (not shown in thispaper) generates an unordered set of <receive> and<invoke> activities within a single flow in the same wayas above. Then it adds <links> from an activity � to anactivity � whenever there is a path from the process step

interacting with � to the process step interacting with � inthe process graph.

4.2 Synthesis of State Machines

Another programming model that we want to supportwith our synthesis method are stateful business objects. So-called adaptive documents (ADocs) [10], are persistent dataobjects that carry their own processing instructions in theform of an associated (possibly nondeterministic) finite-state machine. When in a particular state, an ADoc caninvoke an activity and, depending on the result of the activ-ity execution, it will transit to one of the possible successorstates. The task of the synthesis process is to generate statemachine specifications for each of the discovered businessobjects.

In our example, we have discovered three different busi-ness objects, namely the trip request, the flight request, andthe hotel request. Each of them is processed by different ac-tivities reflected in the subprocesses and process steps of theinformation flows that we extracted from the process graph,see Figure 6. Now, we use this information to determine thestate machine of the objects.

We consider the flow graph of each object as a means ofdefining the words of a formal language �8 . The namesof the vertices are the symbols of the language, and theedges define the allowed concatenation of the symbols toobtain a word in the language �� . To simplify our nota-tion we abbreviate the activity names with single letters:CreateItinerary (c), Evaluate (e), HotelService (h), Flight-Service (f), ReplanItinerary (r), and ConfirmItinerary (o).The flow graph for the trip request object defines the words� / �� . For the hotel request object we obtain �� / , andfor the flight request object we have � / . We now constructthree finite state machines that accept these words.

The construction process defines a transition for eachgraph vertex and annotates it with the name of the vertexto indicate the activity that has to be invoked. For each tran-sition, a start and an end state are defined. When a processstep � is directly followed by a process step � in the graph,the end state of � is merged with the start state of � . Thisway, the state machines are built until the end state of thelast process step is marked as a final state of the machine oruntil a process step is revisited if the process graph is cyclic.In the latter case, a cyclic state machine is obtained and itcan happen that a state machine without end states is con-structed. In this case, a warning is issued as this would meanthat the business object is processed in an infinite loop thatnever terminates. Process steps, which are unordered withrespect to each other, introduce multiple successor statesand lead to nondeterministic state machines. There mustat least be one edge whose start state is not the end state ofanother edge, i.e., there must be one clearly identifiable first

activity in the process graph. This state is marked as theinitial state of the state machine, otherwise the constructionfails. Figure 8 shows the result of the construction. Thestate machines have the common symbols / and � , but eachmachine also has private symbols that the others do not read.

H0

TE1

F0

H1 H2 HE

F2F1 FE

T0 T1 T2

TE2

Trip Request

Flight Request

Hotel Request

common: {e,c}

private: {h,f,o,r}

c

c

c e

e

e

o

r

h

f

Figure 8. Three state machines for the discov-ered business objects.

5 Correctness of Model-Driven Transforma-tions

The correctness of each of the model-driven transforma-tions, which the TE implements, has to be shown separately.The basis for our correctness statements is the process graphthat can be seen as a normalized, formal representation of abusiness view model. As the mapping of a particular busi-ness modeling approach to the process graph is essentiallyheuristic, given the informally specified semantics of com-mon modeling approaches, we cannot make a precise for-mal argument about the relationship between the originalbusiness view model and the synthesized elements of the ITarchitectural model. However by assuming that our heuris-tic mapping is reasonable, we can carry over the correctnessresults in an informal way to the original input.

In the following, we sketch how the correctness of theBPEL4WS and ADoc synthesis can be established inde-pendently of each other. In a comprehensive transforma-tion tool, sets of BPEL4WSs and ADocs would be gener-ated as solution components of the IT architectural model.The architectural model will also contain other componentsand it has to determine how these components will inter-act, i.e., the component “wiring” should be made explicitin the model. A comprehensive correctness proof will theninvolve to consider each of the generated components aswell as their modeled interaction, which goes far beyondthe scope of this paper.

The BPEL4WS synthesis is correct if the synthesizedBPEL4WS specification, when executed, generates mes-

sage exchange traces of partner interactions as they havebeen specified by the process graph in the following sense:Each subprocess vertex in the process graph is consideredas representing a partner. A link from a subprocess vertexto a process-step vertex is interpreted as the receiving ofsome information from some partner, while a link from aprocess-step vertex to a subprocess vertex defines the send-ing of some information to the partner. The directed linksbetween the process steps define a (partial) ordering of thesending and receiving activities in the process. The possiblepaths through the process steps thus define a set of possibleconversation behaviors.

The structured activities in BPEL4WS (switch, se-quence, link) also define a partial ordering of the atomic<receive> and <invoke> activities. A complete exe-cution of the BPEL4WS specification, in which all branchesare exhaustively enumerated and where all possible inter-leaving orderings for the unordered activities are deter-mined, must generate exactly the set of possible conversa-tion behaviors described by the process graph. Our trans-formation method is correct if the process subgraph con-taining only process steps and no subprocesses (i.e., onlythe white nodes from Figure 7) and their linking edges isacyclic. Otherwise, we would generated cyclic BPEL4WSspecifications, which are not allowed. Note that the overallprocess graph containing subprocesses, which are mappedto BPEL4WS partners, can contain cycles.

The correctness criterion for the state machine synthesisis based on the fundamental relationship between regularlanguages and nondeterministic automata.

Take a process graph � � �� with process steps� and typed, directed links between the process steps.We consider the names of the vertices the symbols of alanguage � � � � . The directed links define an ordering be-tween the language symbols. We can formulate words overthe language � � � � , but we are only interested in the sub-language �� of words defined through the paths of thegraph � . Now take a set of ADoc state machines that wehave constructed for this process graph.

Definition 6 A nondeterministic finite state machine is a tu-ple

�� where

� �is a finite set of states,

� ��;� �is the start state,

� � �is the set of end states,

� is the input alphabet,

� �� is the transition function

Informally, a machine � accepts a word � � � if ittransits through a sequence of states �� &%&%&% ��=� when

“reading” the word. The states of the state machines de-scribe the possible states of the objects. A transition of astate machine takes a state � and an input symbol � andyields a new state �� . This means, when being in state � ,the machine � reads symbol � and transits into state � � .Alternatively, we can interpret this as follows: when object�5�� is in state � , the machine executes activity � and(upon completion of � ) it enters state �� . This way, we candefine the language � �� , which is accepted by the ma-chine � . Each of the state machines �� &%&%&% �� ' that weconstructed for the objects � � �&%&%&% � � ' has its own input al-phabet �� &%&%&%! ' determined from the activity names, whichprocess the object in the subflow graph. We can consider thetuple �"�� &%&%&% � ' � a distributed alphabet, which specifies theactivities a machine can perform. When two different inputalphabets share a symbol � , i.e., ��&�=6#+2 with 9�:� < , then� is called a synchronization activity. When a machine is ina state � that enables �%$�&�� as its only transition, then thismachine has to wait until all other machines that have � intheir input set are ready to read � .

The behavior of the combined machines can be describedwith the synchronized shuffle product [16], which is definedas follows:

Definition 7 Let �'� � �� (� �)� �� (� and �+* �� * ��* �,* �(* ��*�� be two finite state machines. The shuf-fle product � �-��/.%�+* � �� is definedas

� � � � �� *� �� (� ��* �� )�� ,*� �0�� (*� �� * �8��21 �43�5 ��6!7with

� � �� * � ��9 ��8999: 999;�"�� 9 � ��* ��* ��9 � � if 9 �<�� 6=(*�"�� 9 � ��* � if 9 �<��?>@(*�� * ��* ��9 � if 9 �<(*@>@��undefined otherwise.

The shuffle product of state machines is commutative,i.e., we can take our object state machines and shuffle themin any order to build the shuffle machine � �BA � � � . Thismachine accepts language L(M). With that, we arrive at thecriterion that our state machine synthesis needs to satisfy

�� meaning the object state machines together recognize thewords of the language defined by the process graph. In other

words, the object state machines can only invoke those ac-tivity sequences that have been defined by the business viewmodel.

Figure 9 shows the reachable states of the shuffle productmachine that combines the three object state machines. Themachine has � �� states, but only 8 of themare reachable. We can see that the machines synchronize onthe � and / activities, whereas the � and activities can beexecuted in any order. One can also observe that the shuffleproduct machine, when executed, will perfectly generate theworkflows one would normally implement for our exampleprocess graph, i.e., there is no need to additionally specifythese workflows.

HEFETE1

HEFETE2

H0F0T0 H1F1T1

H1F2T1H2F1T1

H2F2T1 HEFET2

c

f

f h

h

e

o

r

Figure 9. Synchronized shuffle product of theobject state machines.

Cyclic or nondeterministic transitions can be handledwith no problem and they inherit the usual semantics fromautomata theory. The only restriction that we currently es-tablish is the existence of a unique initial state and of at leastone end state in each state machine. The formalism also di-rectly applies to I/O automata models where we additionallydefine a set of output symbols and an output function, whicheither depends on the state only (for a Moore-type machine)or on the state and the input symbol (for a Mealy-type ma-chine), see for example [13]. The expressivity of these ex-tended machine models is the same as that of the standardmodel, but they can yield more compact representations.

We can always transform any process graph into a set ofstate machines whose shuffle product will accept the lan-guage defined by the graph. The correctness criterion thatwe based on the equivalence of two regular languages isdecidable. For deterministic automata, decision algorithmsof polynomial complexity exist, but in the case of non-deterministic automata, the decision problem is NP-hard.Whether our current constructive algorithm can guaranteethat a set of correct state machines is constructed is cur-rently an open question that we need to investigate further.

6 Related Work

Our work relates to many different fields in computerscience. The OMG initiative of model-driven architectures

has motivated several approaches that address the problemof model transformation. A good example of this work is[11], which—like us—also aims at supporting the completedevelopment cycle. In this approach, the focus lies on thedefinition of transformation schemes in UML, i.e., on an ex-plicit representation of how the objects used in one modelrepresentation are mapped to the objects used in anothermodel representation. As long as object-to-object mappingsare sufficient, explicit transformation schemes can be de-fined in a straightforward manner. However, when a morecomplex algorithmic analysis has to be performed before atransformation can take place, transformation schemes eas-ily reach their limits as the authors of [11] assert: “The de-tails of a transformation are only specified in terms of anunspecified procedural expression %&%&% ”.

In [12], a rule-based executable language to describemodel transformations is proposed that allows programmersto write arbitrarily complex transformation programs. Theexpressivity and generality of the approach constitutes alsoits main drawback—it is up to the programmer to make surethat his program performs correct transformations and thatthe underlying rule set is consistent. No tool support forconsistency checking is available, nor is it known whetherthe consistency of a given rule set is decidable.2

The work on transformations within the Kent Model-ing Framework [1, 6] comes very close to our approach.In [6], mappings between a specification model and a de-sign model are investigated. Both models are representedusing a subset of UML (in particular class diagrams andstate machines) and operations in the specification modelare mapped to sequences of states in the design model. Theauthors discuss the necessity to formalize the mapping andto prove its correctness, but cannot yet provide a solution tothese difficult problems. A very interesting contribution isthe set of benchmarks that is proposed to evaluate mappingapproaches and that we will consider in our ongoing work.In [1], a representation of mappings based on mathematicalrelations is proposed that is very suitable when the modelsare given in the form of classes, i.e., when sets of objectshave to be related to each other.

Our IT architectural models are based on process com-munication protocols (represented in BPEL4WS) and com-municating finite-state machines. This makes our work re-lated to the fields of protocol conversion [7], control theoryand the synthesis of controllers, see [13] for an example,and to the numerous approaches that model the behavior ofdiscrete systems or software programs based on the theoryof formal languages and automata. The synchronized shuf-fle product of automata, which we used to formally definea correctness criterion for our transformation method, is a

2An OMG standard is currently emerging that addresses the problem ofrepresenting queries, views, and transformations on UML2 (MOF) models,which will also become relevant to our future work.

recognized operation to describe sequentialized executionhistories of concurrent processes. Using automata theoryas the formal basis of our work has several advantages: weobtain automatic tools with well-defined properties, the dy-namic behavior of modeled processes can be precisely de-fined, and many interesting properties of our models can beverified.

7 Summary and Outlook

In this paper, we investigate model-driven transforma-tions that map business view models into IT architecturalmodels and vice versa. We analyze two approaches to busi-ness process modeling, ADF and UML2 activity models,and define a mapping of these models to so-called processgraphs, which possess a formal semantics and represent thestructure of a process. We use graph-theoretic methods toanalyze process graphs and to decompose them into struc-tured information flows. These flows are used to automati-cally synthesize business protocol specifications and adap-tive documents whose behavior is controlled by finite-statemachines.

Our methods would also carry over to more elaborateprocess models that capture the dynamic behavior of pro-cesses, but unfortunately, many business process modelingapproaches are not yet precise enough in their semanticswhen it comes to the representation of dynamic behaviors.Although the OMG defines a model as “a formal specifi-cation of the function, structure and/or behavior of a sys-tem” [9], formality in the sense of mathematical rigor iswhat is the hardest to achieve in most of the modeling ap-proaches. The UML2 specification even states that “Cur-rently, the semantics are not considered essential for thedevelopment of tools, however this will probably change inthe future.”, see [18], page 2-13. Our paper clearly showswhat can be achieved in model-driven transformations whenstate-of-the-art modeling approaches are used with care. Italso shows how essential semantics is for tools that try todo more than editing and syntax checking of models andhow much more would become achievable if the semanticsof dynamic processes— the spawning of subprocesses andtheir synchronization, the creation and destruction of ob-jects at runtime—would be formally captured. In this case,a truly semantics-preserving transformation between mod-els based on our methods is within reach.

References

[1] D. Akehurst and S. Kent. A relational approach to definingtransformations in a metamodel. In 5th International Con-ference on UML, volume 2460, pages 243–258. Springer,2002.

[2] E. Christensen, F. Curbera, G. Meredith, and S. Weer-awarana. The web services description language WSDL.http://www-4.ibm.com/ software/ solutions/ webservices/resources.html, 2001.

[3] F. Curbera et al. Business process exe-cution language for web services. www-106.ibm.com/developerworks/webservices/library/ws-bpel/, 2002.

[4] E. Deborin et al. Continuous Business Process Managementwith HOLOSOFX BPM Suite and IBM MQSeries Workflow.IBM Redbooks, 2002.

[5] Proceedings of the 6th International Enterprise DistributedObject Computing Conference. IEEE Press, 2002.

[6] S. Kent and R. Smith. The bidirectional mapping problem.Electronic Notes in Theoretical Computer Science, 82(7),2003.

[7] S. Lam. Protocol conversion. IEEE Transactions on Soft-ware Engineering, 14(3):353–362, 1988.

[8] F. Leymann and D. Roller. Production Workflow. PrenticeHall, 2000.

[9] Model-driven architecture - a technical perspective. Doc-ument from the Architecture Board MDA Drafting Team,2001.

[10] P. Nandi, S. Kumaran, T. Heath, K. Bhaskaran, and R. Dias.ADoc-oriented programming. In Proceedings of the Interna-tional Symposium on Applications and the Internet (SAINT-2003). IEEE Press, 2003.

[11] J. Oldevik, A. Solberg, B. Elvesaeter, and A. Berre. Frame-work for model transformation and code generation. InEDOC-03 [5], pages 181–189.

[12] M. Peltier. MTrans, a DSL for model transformation. InEDOC-03 [5], pages 190–199.

[13] A. Ramirez, C. Sriskandarajah, and B. Behabib. Controlof flexible-manufacturing workcells using extended mooreautomata. In IEEE International Conference on Roboticsand Automation, pages 120–125, 1999.

[14] N. Sato, S. Saito, and K. Mitsui. Optimizing compositewebservices through paralleliation of service invocations. InEDOC-03 [5], pages 305–316.

[15] A. W. Scheer, F. Abolhassan, W. Jost, and M. Kirchner.Business Process Excellence - ARIS in Practice. Springer,2002.

[16] A. Shaw. Software descriptions with flow expressions. IEEETransactions on Software Engineering, 4(3):242–254, 1978.

[17] Unified modeling language superstructure, version 2.0. 2ndrevised submission to OMG RFP ad/00-09-02, 2003.

[18] Unified modeling language infrastructure, version 2.0. 2ndrevised submission to OMG RFP ad/00-09-01S, 2003.