RETRACTED: Model-driven development of OLAP metadata for relational data warehouses


Computer Standards & Interfaces 34 (2012) 189–202

Contents lists available at SciVerse ScienceDirect

Computer Standards & Interfaces

journal homepage: www.elsevier.com/locate/csi

Model-driven development of OLAP metadata for relational data warehouses

Jesús Pardillo, Jose-Norberto Mazón ⁎

DLSI/Lucentia, University of Alicante, Spain

Article info

Article history:
Received 29 November 2010
Received in revised form 14 July 2011
Accepted 14 August 2011
Available online 27 August 2011

Keywords: MDA; OLAP; Data warehouse; Conceptual model; Metadata

Abstract

We propose a model-transformation architecture with which to obtain both the database schema of a data warehouse and the required OLAP metadata for end-user tools to intuitively query the underlying schema. The main benefit that this architecture provides is a set of mappings that will allow designers to effortlessly obtain various types of OLAP metadata for several kinds of tools and users during the simultaneous generation of the database schema. Both processes are automatic and integrated. As a proof of concept, our approach is implemented in the Eclipse Modelling Framework, thus showing its feasibility and usefulness.

© 2011 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +34 965 90 37 72; fax: +34 965 90 93 26.
E-mail addresses: [email protected] (J. Pardillo), [email protected] (J.-N. Mazón).
1 Please, consult Appendix A for a comprehensive list of acronyms in this text.
0920-5489/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.csi.2011.08.001

1. Introduction

The term data warehouse is defined as “a subject oriented, integrated, time variant and non volatile collection of data in support of a management's decision making process” [7]. A data-warehouse system is composed of several layers (i.e., data sources, a database schema as a repository for the data warehouse, extract, transform, and load (ETL) processes to populate the database, and data analysis tools that are used to query the data warehouse) in which data from one layer is derived from data from the previous layer [8].

Data-warehouse development therefore aims to generate metadata that will permit integration and interoperability among these layers. It is important to note that the database and end-user tools are closely linked to each other, since the former provides the data schema that supports users' queries of the latter.

A data warehouse may be characterised with a multidimensional model [7,9], which intuitively arranges the data used in decision making, i.e., the facts and dimensions involved in analysis. Owing to its database nature, data-warehouse design may be accomplished in three phases: conceptual modelling, and logical and physical design [20]. Interestingly, relational modelling is considered to be the most suitable approach for developing a database for data warehouses [9]. On-line analytical processing (OLAP) tools are among the solutions most frequently used to take advantage of the multidimensional viewpoint when the data warehouse is being queried. After conceptual modelling, two kinds of metadata need to be derived to implement a data warehouse on top of relational systems: a database schema, but also the OLAP metadata which identifies data as multidimensional elements in a recognisable format for end-user OLAP tools. Hence, data-warehouse design should deal with both kinds of metadata [18].

However, in research literature (Section 2) there is a surprising absence of the derivation of OLAP metadata for end-user tools in a manner that is integrated with the corresponding database schemata (see Fig. 1). Moreover, current commercial solutions (Section 2) derive particular OLAP tool metadata from logical models in an ad hoc manner (lower part of the figure). This method is likely to be prone to failure and it is a waste of effort [5], and the data-warehouse risks and costs are thus increased. Our experience in developing real-world projects has led us to the conclusion that the reason for this is twofold: first, OLAP metadata is designed solely from less-expressive models, thus signifying that designers have to take the missing expressiveness into consideration manually, and second, the design of OLAP metadata is vendor-specific. Once the vendor has been chosen, both designers and decision makers are constrained to the platform-specific tools provided.

The derivation of OLAP metadata poses some interesting research challenges, which have not yet been resolved [16]: this metadata should be derived (i) together with the database schema in an integrated manner, and should also (ii) take into account the metadata heterogeneity in terms of the different kinds of tools and vendors [19].

This article studies a method that addresses the limitations identified in current practices, which was preliminarily studied in [15]. This method deals with the integrated generation of OLAP and database metadata in a platform-independent manner, thus allowing the full analysis potential of any OLAP tool to be used on the database structures deployed. Furthermore, this generation takes place automatically, which eases the tedious task of metadata definition and reduces the emergence of human errors. Our method is therefore also aligned with the model-driven architecture (MDA) [14] in order to achieve the integration and automation for metadata design and management, thus making the derivation of different kinds of both database schemata and end-user OLAP metadata feasible.

Fig. 1. Workflows for designing data warehouses in relational systems.

Fig. 2. Overview of the technologies supporting the proposed design method.

Fig. 2 shows a concept map with an overview of the technologies involved in our proposal. The application domain is the OLAP tools and data warehouse to be queried. Given the insight that the conceptual model of a data warehouse also models the required OLAP metadata (represented as the dashed line in the figure), the proposal is the actual automatic derivation of OLAP metadata from the conceptual model. It is done by means of MDA (see Section 3), from which three standard languages are used, each one for a specific task: meta-object facility (MOF) as meta-modelling language, query/view/transformations (QVT) for describing model-to-model mappings (the first transformation stage, Section 4) and Mof2Text for model-to-code generation (the second one, Section 5). Moreover, MOF serves as scaffolding for UML and CWM, which respectively represent conceptual models, by means of a UML profile, namely MD@UML, and the OLAP metadata themselves. All these languages are implemented under the Eclipse development platform (Section 6), which provides specific modules for each language: mediniQVT for implementing QVT, EMF implementing MOF, and MOFScript as the Mof2Text implementation.
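The two transformation stages described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the classes `ConceptualFact` and `OlapCube`, the functions `model_to_model` and `model_to_text`, and the textual output format are hypothetical and do not reproduce the QVT or Mof2Text artefacts of the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConceptualFact:
    """Hypothetical stand-in for a fact of the conceptual model."""
    name: str
    measures: list[str]
    dimensions: list[str]


@dataclass
class OlapCube:
    """Hypothetical stand-in for a cube in the OLAP metadata."""
    name: str
    measures: list[str] = field(default_factory=list)
    dimensions: list[str] = field(default_factory=list)


def model_to_model(fact: ConceptualFact) -> OlapCube:
    """First stage (the role played by QVT): map one model onto another."""
    return OlapCube(fact.name, list(fact.measures), list(fact.dimensions))


def model_to_text(cube: OlapCube) -> str:
    """Second stage (the role played by Mof2Text): render a model as text."""
    lines = [f"cube {cube.name}"]
    lines += [f"  measure {m}" for m in cube.measures]
    lines += [f"  dimension {d}" for d in cube.dimensions]
    return "\n".join(lines)


if __name__ == "__main__":
    fact = ConceptualFact("AutoSales", ["quantity", "price", "total"],
                          ["Auto", "Time", "Dealership", "Salesperson", "Customer"])
    print(model_to_text(model_to_model(fact)))
```

Keeping the two stages as separate functions mirrors the separation between model-to-model mapping and code generation: the intermediate `OlapCube` could be rendered for a different target simply by swapping the second function.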

The remainder of this article is structured as follows. Section 2 discusses related work. Section 3 describes the modelling framework involved and the model-transformation architecture proposed. Then, Section 4 presents the mapping between conceptual models and OLAP metadata, whilst Section 5 presents the mapping of OLAP metadata into platform-specific code. Later, Section 6 outlines the development environment used to implement and validate the solution found. In the final section, conclusions regarding metadata management are discussed and some future work is outlined.

2. Related work

Two of the most important approaches for designing data warehouses are presented in [7,9]. Despite their seminal role, these works do not consider the development of the data warehouse from a conceptual perspective, and are therefore distanced from the scope of our work: the derivation of OLAP metadata from conceptual models and its integration with the traditional database design.

Therefore, in this article, we have preferred to focus on other works that consider the development of data warehouses from a conceptual level. In our research community, we believe that one of the most popular of these methods is described by Golfarelli et al. [4]. For them, the development of the data warehouse consists of three phases: (i) extracting data from distributed operational sources, (ii) organising and integrating data consistently into the data warehouse, and (iii) accessing the integrated data in an efficient and flexible manner. Golfarelli et al. focus on the second phase by proposing a general method for data-warehouse design based on the dimensional-fact model (a particular notation for conceptual modelling). Once a dimensional model has been defined, it can be mapped into a relational model by defining the corresponding database structures (tables, columns, and so on). The aim of the third phase is to define data cubes as OLAP metadata. However, although the authors point out that this phase is extremely important for the correct use of OLAP tools, they do not provide any concrete solutions.


A further interesting work is that of Hüsemann et al. [6]. They propose a well-structured approach to formalise the development of the conceptual model for database schemata based on a set of multidimensional normal forms. Hüsemann et al. describe a logical design phase with which to convert the conceptual schema into a logical schema tailored to a target technology (relational or multidimensional), but they do not explicitly consider the metadata used by OLAP tools. Abelló et al.'s yet another multidimensional metamodel (YAM2) approach [1] presents a conceptual multidimensional model for the design of database schemata. This is an object-oriented model which uses the unified modelling language (UML) [14] as a notation to represent multidimensional data structures. However, Abelló et al. focus solely on the design of the database, and the question of how to derive metadata for OLAP systems is not addressed. Finally, Prat et al. [17] propose a design method which uses UML-based conceptual models as a starting point. They advocate the use of their model in end-user tools to provide decision makers with a standard, high-level view of the data warehouse. Unfortunately, no mechanisms are provided to accomplish this task.

The situation with regard to commercial platforms is similar. For example, one of the most common data warehousing solutions in the ORACLE platform² consists of using a warehouse builder to design data warehouses. After their design, OLAP analyses are carried out in the ORACLE discoverer. In spite of the fact that ORACLE provides designers with mechanisms to integrate both tools, they are vendor-specific and poorly documented, and also require a certain amount of tedious post-processing (e.g., renaming data entities or defining aggregation hierarchies) in order to configure suitable end-user metadata for the underlying database.

To the best of our knowledge, the only work that addresses the generation of OLAP metadata is proposed by Hahn et al. [5], which explains how to derive database schemata together with configurations for OLAP tools from a conceptual multidimensional model. Their approach is based on a conceptual model version that does not cover important modelling aspects such as non-strict hierarchies, many-to-many relationships between facts and dimensions, and so on. Moreover, it directly generates vendor-specific metadata by adapting the conceptual model to the target system. The solution presented here does not have this drawback, since it takes advantage of a vendor-neutral development architecture aligned with MDA. In our architecture, an intermediate metadata-interchange standard is employed, namely the common warehouse metamodel (CWM) [14], which permits the abstraction of the underlying software technologies whilst the conceptual multidimensional modelling provides expressive and intuitive primitives for data-warehouse design.

To sum up, current approaches solely focus on obtaining database schemata from conceptual multidimensional models, and do not consider OLAP metadata in an integrated manner. However, database schemata in OLAP systems may not be sufficiently expressive to preserve all the information captured by conceptual multidimensional models. OLAP metadata is thus required to facilitate the querying of data warehouses by OLAP tools. The proposed solution solves this problem, since it takes advantage of MDA for the derivation of both the database and OLAP metadata in a systematic, integrated, and automatic manner: whereas the derivation of database schemata was investigated in [11,13], the derivation of OLAP metadata was preliminarily studied in [15]. This article therefore presents an extension of that study by providing comprehensive descriptions of the following parts of our proposal:

• Implementation of the most meaningful and complex MDA mappings.
• Code generation process by means of model-to-code mappings.
• Development environment as a proof-of-concept.


2 http://www.oracle.com/technology/products/index.html (July 2011).

Moreover, the readability of this extended version has been improved by means of the addition of:

• A technological overview of the design method.
• The supporting modelling theory and technologies.
• Numerous examples to illustrate the many mappings involved in the transformation process.
• A list of acronyms as shown in Appendix A.

3. MDA for the design of OLAP metadata

The model-driven architecture (MDA) [14] is an Object Management Group (OMG) standard that addresses the complete life cycle of designing, deploying, integrating, and managing applications by using models in software development. MDA separates the specification of system functionality from the implementation of that functionality on a specific technology platform by means of defining several viewpoints on a system. A viewpoint on a system is a technique for abstracting away details in order to focus on particular concerns within that system and to establish a simplified model. MDA defines the following viewpoints: computation-independent, platform-independent, and platform-specific. Therefore, MDA encourages specifying a platform-independent model (PIM), which contains no information specific to the platform or the technology that is used to realise it. This PIM can be transformed into a platform-specific model (PSM) in order to include information about a specific technology. Afterwards, each PSM is transformed into code to obtain the final implementation. On top of these models, MDA also presents a computation-independent model (CIM) to specify user requirements.

These MDA models can be developed using any modelling language. However, languages that are compliant with the meta-object facility (MOF) [14] metamodel are typically used. Moreover, they may be extended to define specialised languages for certain domains (i.e., via metamodel extensibility or profiles).

The formal specification and deployment of multidimensional models for data warehouses in our approach is based on MDA. Fig. 3 shows an overview of the approach for data warehousing: a conceptual multidimensional model of the data warehouse (as a PIM) is developed (by following the approach presented in [10]) from an information requirements model (as a CIM) obtained from decision makers [12]. Various metadata can be derived from this PIM as PSMs, by taking into account different deployment platforms (relational, multidimensional, etc.). Finally, the code for the implementation of the multidimensional model according to each PSM is also obtained. Interestingly, the CIM is modelled with a visual language for goal-oriented requirement elicitation. Its aim is to manage the volatile information requirements by means of inferring them from decision makers' goals that should be more stable throughout the data-warehouse life. Nevertheless, our contribution can be reused in other architectures in which requirements are modelled in a different manner.

It is important to note that our PSMs are represented in CWM [14]: the repository structures with the Relational package [13] and the analysis structures with the OLAP package. We therefore employ a vendor-neutral representation, i.e., it is independent of any specific tool, thus focusing solely on the problems involved in conceptual–logical mapping. This is implemented by using the query/view/transformation (QVT) [14] language, which contains a declarative part that can be used to facilitate the design of model-to-model relations.

If we consider the most common data-warehouse scenario, i.e., a relational platform and an OLAP tool, then two kinds of PSMs are derived from a PIM: one for the relational schema that models the persistent storage of data cubes in terms of tables and columns, and another for the end-user OLAP schema which preserves the dimensional modelling metadata (i.e., facts, dimensions, aggregation hierarchies, etc.).
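As a rough illustration of deriving two PSMs from one PIM, the following Python sketch produces a relational view (tables and columns) and an OLAP view (cube, measures, dimensions) from the same source model. The `Pim` class, both functions, the star-like table layout, and the `_id` column naming are assumptions made for the example; they are not the mapping rules of the proposal, which are expressed in QVT over CWM.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Pim:
    """Hypothetical platform-independent model of a single fact."""
    fact: str
    measures: tuple[str, ...]
    dimensions: tuple[str, ...]


def to_relational_psm(pim: Pim) -> dict[str, list[str]]:
    """Derive tables and columns (assumed star-like layout)."""
    # One table per dimension, each with an assumed surrogate key column.
    tables = {dim: [f"{dim.lower()}_id"] for dim in pim.dimensions}
    # The fact table references every dimension and stores the measures.
    keys = [f"{dim.lower()}_id" for dim in pim.dimensions]
    tables[pim.fact] = keys + list(pim.measures)
    return tables


def to_olap_psm(pim: Pim) -> dict[str, object]:
    """Derive the end-user view that keeps the multidimensional vocabulary."""
    return {
        "cube": pim.fact,
        "measures": list(pim.measures),
        "dimensions": list(pim.dimensions),
    }


if __name__ == "__main__":
    pim = Pim("AutoSales", ("quantity", "price", "total"), ("Auto", "Customer"))
    print(to_relational_psm(pim))
    print(to_olap_psm(pim))
```

Both functions read the same `Pim` instance, which is the point of the integrated derivation: the relational and OLAP views cannot drift apart because neither is derived from the other.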

Fig. 3. Model-driven architecture for multidimensional design.

3.1. Conceptual multidimensional model as PIM

Many proposals concerning which multidimensional properties should be modelled at the conceptual level exist (see Section 2). Importantly, conceptual multidimensional approaches must provide a high level of abstraction if they are to expressively describe aspects of both data structure and data analysis, which will thus be the basis of the subsequent implementation according to the chosen target technology. Several approaches have therefore been developed to use a widely-known language such as UML [14] in multidimensional modelling [10,1,17].

Our approach for the design of a conceptual multidimensional model in a PIM is based on the approach presented in [10]. From here on, it is referred to as MD@UML for convenience, since it is indeed multidimensional modelling (MD) at UML. This approach allows a conceptual multidimensional model to define a collection of data types in order to cope with OLAP queries:

• fact is the focus of analysis and acts as a container of measures.
• dimension is the context through which facts are analysed and in which they are aggregated.
• measure is the actual data to be aggregated, usually numerical values.
• level is the grouping criteria with which measures are aggregated, resulting in coarser data cubes. In practice, they are containers of descriptors from which hierarchies are formed.
• cube is an analysis structure in which measures are aggregated in several dimensions. In practice, facts are storage structures for querying data cubes, facts being no more than the finest cube for that dimensionality.
• descriptor is the actual data being used in practice as grouping criteria to aggregate measures, which serve as an alternative means to represent the levels involved.
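The relationship between facts, measures, descriptors, and cubes can be illustrated with a minimal Python sketch, in which a fact is treated as the finest-grained cube and a coarser cube is obtained by grouping on a level descriptor. The row layout, the sample values, and the `aggregate` helper are illustrative assumptions and are not part of MD@UML.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest granularity.
# Each row: (customer descriptor, city descriptor, quantity measure).
FACT_ROWS = [
    ("Ann", "Alicante", 2),
    ("Bob", "Alicante", 1),
    ("Eve", "Madrid", 4),
]


def aggregate(rows, descriptor_index, measure_index):
    """Sum one measure grouped by one descriptor, yielding a coarser cube."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[descriptor_index]] += row[measure_index]
    return dict(totals)


# Group the quantity measure by the city descriptor.
by_city = aggregate(FACT_ROWS, descriptor_index=1, measure_index=2)
print(by_city)
```

Grouping by the customer descriptor instead returns one total per fact row, which matches the statement that a fact is simply the finest cube for its dimensionality.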

MD@UML is implemented as a UML profile in which each of the aforementioned data types is mapped into a specific UML metaclass according to Table 1 (only relevant data types are listed). This mapping translates multidimensional concepts into UML class diagrams, i.e., classes, properties, and associations concerning the object-oriented semantics of UML. Moreover, MD@UML takes advantage of other UML metaclasses (without stereotyping) such as those that model inheritance hierarchies and packaging for the sake of completeness.

Table 1. Mapping from multidimensional concepts to UML metaclasses.

Multidimensional | Iconography | UML
Fact, dimension, base | | Class
Fact attribute, descriptor, dimension attribute | FA, D, DA | Property
Roll up | | Association

In order to illustrate this, let us consider the following running example, which is inspired by that presented in [3]. Let us assume that the multidimensional conceptual model for the automobile-sales domain shown in Fig. 4 has been specified with MD@UML in accordance with the guidelines of [10]. This diagram represents a conceptual model of the auto-sale fact and is used to analyse automobile sales by means of the following measures: the quantity sold, the price, and the total amount. The sales are analysed around several dimensions that describe a sale: auto, time (tagged as a temporal dimension with {isTime}), dealership, salesperson, and customer. Each dimension contains several aggregation hierarchies to enable roll-up and drill-down OLAP operations [2] on sale data cubes. For instance, a customer may be viewed from different aggregation levels: customer data may be used to aggregate automobile sales by cities, and cities by regions, and regions by states.³ In order to provide shortcuts, sales may also be aggregated by two alternative paths, i.e., (i) without considering regions (city alternative), or (ii) without considering cities (region alternative). In addition, when representing the dimensions themselves in an OLAP analysis, any aggregation level exposes many dimension attributes such as the customer's date of birth or identifiers such as the customer's name.
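The three aggregation paths of the customer dimension can be written down as ordered lists of levels, as in the following Python sketch. The dictionary `HIERARCHIES`, the level identifiers, and the `roll_up_path` helper are hypothetical names chosen for illustration; only the level ordering is taken from the running example.

```python
# Aggregation paths of the customer dimension, ordered from finest to coarsest.
HIERARCHIES = {
    "Standard": ["CustomerData", "City", "Region", "State"],
    "CityAlternate": ["CustomerData", "City", "State"],      # skips regions
    "RegionAlternate": ["CustomerData", "Region", "State"],  # skips cities
}


def roll_up_path(hierarchy, start, end):
    """Return the levels traversed when rolling up from start to end."""
    levels = HIERARCHIES[hierarchy]
    i, j = levels.index(start), levels.index(end)
    if i > j:
        raise ValueError("a roll-up must move towards a coarser level")
    return levels[i:j + 1]


if __name__ == "__main__":
    for name in HIERARCHIES:
        print(name, roll_up_path(name, "CustomerData", "State"))
```

Reversing a returned path gives the corresponding drill-down direction, so a single ordered list per hierarchy is enough to describe both operations.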


3.2. OLAP schemata as PSM

CWM [14] is a standard for information management which is specifically oriented towards metadata interchange, regardless of specific vendors. Moreover, CWM provides constructs for almost any kind of metadata required by data warehousing projects. Since these may be extremely complex, this complexity is also reflected by CWM, which is decomposed into several packages. In particular, CWM contains the OLAP package (shown in Fig. 5) which models the end-user OLAP metadata for querying data warehouses.

In the OLAP package, metadata are stored in schemata. Each schema is organised by means of cubes and dimensions. Each cube also has cube regions and member selection groups that configure its granularity and dimensionality. Moreover, cubes have cube dimension associations to relate them to the required dimensions. Each OLAP dimension also has aggregation levels that define aggregation hierarchies which are related to each other by means of hierarchy level associations. CWM therefore provides metadata that may be interpreted by compliant OLAP tools or smoothly adapted in order to embrace particular vendor solutions.

3 In these diagrams, an entire aggregation path is identified by a unique name whose ends are labelled with r (d) to specify the roll-up (drill-down) direction.

Fig. 4. The conceptual multidimensional model of automobile sales with MD@UML.

Fig. 5. Excerpt of the OLAP package of CWM to manage vendor-neutral metadata.

4 They are available at http://lib.jesuspardillo.com/2011/csi/MDProfile2OLAP.qvt (July 2011).

In addition, CWM contains other packages which support OLAP-metadata definition, in particular fact measures and dimension descriptors: the core package in which primitive data types (attributes, integers, etc.) are defined and the key indexes package which models data constraints. As an example of this dependency, let us suppose that the measures in CWM specialise CWM core attributes whilst these attributes also model level descriptors. Figs. 6 and 7 serve as examples of this. These figures represent the OLAP metadata counterpart of the auto-sales fact in Fig. 4.

The auto-sales OLAP schema is shown in Fig. 6. It has only one cube for recording auto-sales, characterised by the quantity, price, and total attributes acting as cube measures. With regard to its dimensions, it also has a unique key with four attributes (one for each dimension) and four cube-dimension associations. Since OLAP cubes are specified at a particular level of granularity, the auto-sales cube has a cube region defined for the lowest granularity modelled in Fig. 4, i.e., auto, dealership, time (meaning hours), and customer data. Note that CWM also requires that these cube regions replicate the related cube attributes (by reference) in order to assist in their management by certain OLAP tools.

As an example, Fig. 7 shows the CWM metadata used to describe the customer OLAP dimension. There is one level for each conceptual base: customer data, city, region, and state, each one of which contains its corresponding features, e.g., city has an attribute name as its unique key and population as a regular attribute. The customer OLAP dimension also replicates the attributes of all its aggregation levels. In this way, CWM enables OLAP tools to manage a flattened description of OLAP dimensions. There are also level-based hierarchies for each conceptual aggregation path, denominated as standard, city alternate, and region alternate. Similarly to OLAP dimensions, each hierarchy also owns a reference to all the attributes of its aggregation levels. Finally, the hierarchies contain a sequence of associations with their hierarchy levels in which level attributes are again replicated (this scatter of attributes is discussed in greater detail below).
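The replication of level attributes at the dimension and hierarchy levels can be sketched in Python as follows. This is an illustration under assumptions: the dictionaries, the `replicated_attributes` helper, and the prefix-based naming scheme (e.g., `CityPopulation`) are inferred from the element labels of the customer dimension and are not taken from the CWM specification.

```python
# Attributes owned by each aggregation level of the customer dimension.
LEVEL_ATTRIBUTES = {
    "CustomerData": ["Name", "BornDate"],
    "City": ["Name", "Population"],
    "Region": ["Name"],
    "State": ["Name"],
}

# Level-based hierarchies, ordered from finest to coarsest level.
HIERARCHIES = {
    "Standard": ["CustomerData", "City", "Region", "State"],
    "CityAlternate": ["CustomerData", "City", "State"],
    "RegionAlternate": ["CustomerData", "Region", "State"],
}


def replicated_attributes(levels):
    """Flatten the attributes of the given levels, prefixed by level name."""
    return [level + attribute
            for level in levels
            for attribute in LEVEL_ATTRIBUTES[level]]


# The dimension replicates the attributes of all of its levels.
dimension_attributes = replicated_attributes(list(LEVEL_ATTRIBUTES))

# Each hierarchy replicates only the attributes of the levels it traverses.
hierarchy_attributes = {name: replicated_attributes(levels)
                        for name, levels in HIERARCHIES.items()}

if __name__ == "__main__":
    print(dimension_attributes)
    print(hierarchy_attributes["RegionAlternate"])
```

Prefixing each replicated attribute with its level name keeps the flattened list unambiguous, since several levels own an attribute called `Name`.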

The following section presents the mapping of MD@UML and CWM in order to articulate the derivation of end-user OLAP metadata.

4. Obtaining OLAP metadata from conceptual models

A set of mappings has been developed to derive an OLAP PSM fromthe multidimensional PIM. In these mappings, V denotes {v1,v2,…,vn}for any symbol. When the element names of the two domains in-volved collide, vc stands for the conceptual source element and vofor the OLAP target element.

The mappings are also accompanied by various examples. They are concisely presented as patterns ‘vc↦vo’, denoting a metadata-mapping example of vc into vo, or ‘vc↦ⁿ vo’ when referring to the mapping shown in Fig. n. Both metadata are described as ‘v∈T’, where T is the metadata type. In addition, T may be described by a property p of some v′ as ‘p(v′)’. In these cases, the type T′ of p(v′) is also specified as ‘p(v′)pT′’.

These mappings have additionally been formalised and implemented in QVT.⁴ QVT consists of two parts: declarative and imperative. The declarative part provides mechanisms to define relations that must hold between the model elements of a set of candidate models (source and target models). A set of these relations (or transformation rules) defines a transformation between models. The declarative part of QVT can be split into two layers according to the level of abstraction: the relational layer, which provides graphical and textual notation for a declarative specification of relations, and the core layer, which provides a simpler, but verbose, way of defining relations. The imperative part defines operational mappings that extend the declarative part with imperative implementations when it is difficult to provide a purely declarative specification of a relation.

Fig. 6. OLAP metadata in CWM for the Autosales cube.

Fig. 7. Details on OLAP metadata in CWM for the Customer dimension.

J. Pardillo, J.-N. Mazón / Computer Standards & Interfaces 34 (2012) 189–202

Fig. 8. Example of a graphical QVT relation.

Herein, we focus on the relational layer of QVT. This layer supports the specification of relationships that must hold between MOF models by means of a relations language. A QVT relation (see Fig. 8) is defined by the following elements:

Two or more domains. Each domain is a distinguished set of elements of a candidate model (source or target model). This set of elements (denoted by a «domain» label, see Fig. 8) must be matched in that model by means of patterns. A domain pattern can be considered as a template for elements, their properties and their associations that must be located, modified, or created in a candidate model in order to satisfy the relation. A relation between domains can be marked as check-only (labelled as C) or as enforced (labelled as E). When a relation is executed in the direction of a check-only domain, it is only checked whether there exists a valid match in the model that satisfies the relationship (without modifying any model if the domains do not match); whereas for a domain that is enforced, when the domains do not match, model elements are created, deleted, or modified in the target model in order to satisfy the relationship. Moreover, for each domain the name of its underlying metamodel is specified (labels M1 and M2 in Fig. 8).

When clause. This clause specifies the condition under which the relation needs to hold (i.e., it forms a precondition). This clause may contain arbitrary object-constraint language (OCL) [14] expressions in addition to the relation invocation expressions (see Fig. 8).

Where clause. This clause specifies the condition that must be satisfied by all model elements participating in the relation (i.e., it forms a postcondition). This clause may also contain OCL expressions or relation invocation expressions.
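The difference between check-only and enforced domains can be illustrated with a minimal Python sketch. It is neither QVT nor the API of any QVT engine: the `Relation` class, its `execute` method, and the dictionary-based models are hypothetical names introduced here only for illustration, and the where clause is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = List[Dict[str, str]]  # assumption: a model is simplified to a list of elements


@dataclass
class Relation:
    """Toy stand-in (hypothetical) for a relation between a source and a target domain."""
    source_type: str
    target_type: str
    when: Callable[[Dict[str, str]], bool] = lambda element: True  # precondition

    def execute(self, source: Model, target: Model, enforce: bool) -> bool:
        """Return True if the relation holds; modify the target only when enforced."""
        holds = True
        for element in source:
            if element["type"] != self.source_type or not self.when(element):
                continue  # the domain pattern or the when clause does not match
            expected = {"type": self.target_type, "name": element["name"]}
            if expected in target:
                continue  # a valid match already exists in the target model
            if enforce:
                target.append(expected)  # enforced: create the missing element
            else:
                holds = False  # check-only: report the mismatch, never modify
        return holds


pim = [{"type": "Fact", "name": "Autosale"}]
psm: Model = []
fact2cube = Relation("Fact", "Cube")

checked = fact2cube.execute(pim, psm, enforce=False)   # mismatch is only reported
psm_after_check = list(psm)                            # target is left untouched
enforced = fact2cube.execute(pim, psm, enforce=True)   # target is repaired
```

Executing the relation in check-only mode leaves the target model unchanged, whereas the enforced execution creates the missing cube so that a subsequent check succeeds.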

The designed QVT relations and the dependencies among them are shown in Fig. 9. Each relation is named after the modelling elements that it maps, e.g., Fact2Cube maps a conceptual Fact in the PIM into a logical Cube in the PSM. However, each relation depends on many others to complete the overall mapping. For instance, in the case of mapping facts into cubes, it is necessary to map conceptual models into OLAP schemata by applying the Model2OLAPSchema relation. In addition, QVT allows us to specify which relations are the entry points that start the transformation process by marking them as top relations. Hence, during the transformation process, each top relation calls the non-top relations on which it depends to complete the remaining mappings involved. Each non-top relation also recursively calls the non-top relations on which it depends. As a result of the transformation, the OLAP metadata in our running example were automatically derived in CWM, as shown in Figs. 6 and 7. Please refer to these figures during the examples provided below.
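The way in which top relations trigger their dependent non-top relations can be sketched in Python as follows. The relation names are taken from Fig. 9, but the edges in `WHERE_CALLS` are assumptions made for illustration, since the exact dependencies are only visible in the figure; the function names are likewise hypothetical.

```python
from typing import Dict, List

# Assumed excerpt of the dependency graph: the non-top relations that each
# relation invokes (illustrative edges, not read from Fig. 9).
WHERE_CALLS: Dict[str, List[str]] = {
    "Fact2Cube": ["FactAttribute2CubeRelatedMeasure"],
    "Base2Level": ["Descriptor2OLAPDimensionAttribute",
                   "Descriptor2LevelRelatedAttribute"],
    "UpperRollsUpTo2LevelBasedHierarchy": ["RollsUpTo2HierarchyLevelAssociation"],
    "RollsUpTo2HierarchyLevelAssociation": ["Descriptor2LevelBasedHierarchyAttribute"],
}

# A subset of the relations marked as «top» in Fig. 9 (entry points).
TOP_RELATIONS = ["Model2OLAPSchema", "Fact2Cube", "Base2Level",
                 "UpperRollsUpTo2LevelBasedHierarchy"]


def invocation_trace(relation: str, depth: int = 0) -> List[str]:
    """List a relation followed by the non-top relations it calls recursively."""
    trace = ["  " * depth + relation]
    for callee in WHERE_CALLS.get(relation, []):
        trace.extend(invocation_trace(callee, depth + 1))
    return trace


def run(top_relations: List[str]) -> List[str]:
    """Start the transformation from every entry point, in the given order."""
    trace: List[str] = []
    for relation in top_relations:
        trace.extend(invocation_trace(relation))
    return trace


trace = run(TOP_RELATIONS)
```

Printing `"\n".join(trace)` gives an indented call tree in which only top relations appear at the outermost level.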

4.1. Mapping multidimensional models and OLAP schemata

An entire conceptual model m is mapped into an OLAP schema s.

Example 4.1. Autosales∈Model↦Autosales∈Schema.

From this mapping, the multidimensional modelling elements contained in m are then mapped into their OLAP-metadata counterparts in s.

4.2. Mapping facts and measures into OLAP cubes

Given the previous mapping, every conceptual fact in m is mapped into an OLAP cube in s.

Example 4.2. Autosale∈Fact↦Autosale∈Cube.

Mapping facts also implies another mapping to provide the unique keys of the cube, which identify the data cells based on the mapped dimensions Do.

Example 4.3. Autosale∈Fact↦AutosaleKey∈UniqueKey; Auto∈Dimension↦AutoKey∈(feature(AutosaleKey)pAttribute).

For each conceptual fact attribute, a cube measure is created in CWM.

Example 4.4. Quantity∈FactAttribute↦Quantity∈Measure.

However, metadata for OLAP cubes also require the specification of their cube regions, defining Do and the member selection groups which define their granularity. Therefore, a LowestLevels region would be mapped by selecting the lowest aggregation level of each do∈Do, e.g., the CustomerData level in the Customer dimension of Fig. 4. Finally, each dimension employed in a cube is linked by means of the corresponding cube dimension association.
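The mappings of this subsection can be summarised in a minimal Python sketch. The `Fact` and `Cube` classes and the `fact_to_cube` function are hypothetical simplifications of the MD@UML and CWM metaclasses, and the `AutoData` level name is an assumption (only `CustomerData` is named in the text).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Fact:
    name: str
    attributes: List[str]
    lowest_levels: Dict[str, str]  # dimension name -> finest aggregation level


@dataclass
class Cube:
    name: str
    measures: List[str]
    unique_key: Dict[str, List[str]]
    dimension_associations: List[str]
    regions: Dict[str, Dict[str, str]] = field(default_factory=dict)


def fact_to_cube(fact: Fact) -> Cube:
    """Map a conceptual fact into a simplified cube description."""
    dimensions = list(fact.lowest_levels)
    cube = Cube(
        name=fact.name,
        # One measure per conceptual fact attribute (Example 4.4).
        measures=list(fact.attributes),
        # One key attribute per mapped dimension (Example 4.3).
        unique_key={f"{fact.name}Key": [f"{d}Key" for d in dimensions]},
        # One cube-dimension association per dimension employed.
        dimension_associations=dimensions,
    )
    # The cube region fixes the granularity at the lowest level of each dimension.
    cube.regions["LowestLevels"] = dict(fact.lowest_levels)
    return cube


fact = Fact("Autosale", ["Quantity", "Price", "Total"],
            {"Auto": "AutoData", "Customer": "CustomerData"})
cube = fact_to_cube(fact)
```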

4.3. Mapping dimensions and aggregation levels

For each conceptual dimension dc in m, its OLAP metadata do is mapped into s. Also, every conceptual aggregation level lc in dc is mapped into its corresponding OLAP metadata lo in do.

Fig. 9. Dependency graph of the QVT relations implementing the OLAP mappings. Top relations: Model2OLAPSchema, Fact2Cube, Fact2CubeRegion, Fact2MemberSelectionGroup, Fact2CubeUniqueKey, Dimension2OLAPDimension, Dimension2OLAPDimensionUniqueKey, Base2Level, Aggregation2CubeDimensionAssociation, Aggregation2CubeAttribute, Aggregation2CubeRegionAttribute, Aggregation2MemberSelectionGroupLevelLink, UpperRollsUpTo2LevelBasedHierarchy. Non-top relations: FactAttribute2CubeRelatedMeasure, Descriptor2OLAPDimensionAttribute, Descriptor2LevelBasedHierarchyAttribute, Descriptor2LevelRelatedAttribute, DimensionAttribute2OLAPDimensionRelatedAttribute, RollsUpTo2HierarchyLevelAssociation.

Example 4.5. Customer∈Dimension↦Customer∈OLAPDimension; City∈Base↦City∈Level.

On the one hand, each dimension attribute a in lc is mapped into a CWM attribute belonging to lo and into another attribute related to do.

Example 4.6. Population∈DimensionAttribute↦Population∈(feature(City)pAttribute); Population∈DimensionAttribute↦CityPopulation∈(feature(Customer)pAttribute).

On the other hand, each identifier ic in lc is also mapped into an attribute io within unique keys for both lo and do.

Example 4.7. Customer∈Dimension↦CustomerKey∈UniqueKey; CityName∈(feature(CustomerKey)pAttribute); CityPopulation∉(feature(CustomerKey)pAttribute).

This mapping also establishes a CWM attribute (isTime) for temporal dimensions, in order to facilitate time-dimension management by OLAP tools.
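Examples 4.6 and 4.7 can be reproduced with the following Python sketch. The `map_level` function and its dictionary output are hypothetical, and the convention of prefixing dimension-related attributes with the level name is inferred from the examples rather than stated as a rule.

```python
from typing import Dict, List


def map_level(dimension: str, level: str,
              attributes: List[str], identifiers: List[str]) -> Dict[str, List[str]]:
    """Derive level and dimension features for one conceptual aggregation level."""
    features = identifiers + attributes
    return {
        # Attributes owned by the OLAP level keep their conceptual names.
        f"feature({level})": list(features),
        # The dimension receives a replica of each one, prefixed with the level name.
        f"feature({dimension})": [level + name for name in features],
        # Only identifiers take part in the unique keys of the level and the dimension.
        f"feature({level}Key)": list(identifiers),
        f"feature({dimension}Key)": [level + name for name in identifiers],
    }


city = map_level("Customer", "City", attributes=["Population"], identifiers=["Name"])
```

Here `city["feature(CustomerKey)"]` contains `CityName` but not `CityPopulation`, in agreement with Example 4.7.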

Fig. 10. Base2Level relation.

The QVT relation Base2Level has been defined for aggregation levels (see Fig. 10). It transforms one Base class in the source model into a Level in the target model which already belongs to an OLAPDimension, since it has previously been mapped by the Dimension2OLAPDimension relation of the when clause. Once this relation holds, several other relations must be executed according to the where clause in order to obtain the Attributes of the Level.

Example 4.8. State∈Base↦¹⁰ State∈Level.

One of the QVT relations used to obtain these Attributes is the Descriptor2OLAPDimensionAttribute relation (see Fig. 11). This checks a Descriptor property of a Base class in order to enforce an Attribute within an OLAPDimension which takes part in a UniqueKey. This UniqueKey should have been created previously according to the precondition of the when clause.

Example 4.9. Name∈(ownedAttribute(City)pDescriptor)↦¹¹ CityName∈(feature(Customer)pAttribute).

Fig. 11. Descriptor2OLAPDimensionAttribute relation.

4.4. Mapping aggregation hierarchies

Every conceptual hierarchy is mapped into a level-based hierarchy in which all attributes Ac∪Ic of each aggregation level lo are also mapped in the same manner as dc (together with their corresponding unique keys from Ic). Moreover, OLAP metadata store a hierarchy level association hla, which links and sorts every lo into the aggregation-hierarchy metadata. Each hla also includes Ac∪Ic and a unique key from the related lo.

Example 4.10. RegionAlternate∈(hierarchy(Customer)pLevelBasedHierarchy) where (CustomerData, Region, State)=(hierarchyLevelAssociation(Standard)pHierarchyLevelAssociation).

In order to map aggregation hierarchies we have defined the QVT relation UpperRollsUpTo2LevelBasedHierarchy (see Fig. 12). This holds if a Rolls-upTo association between two Base classes is matched in the multidimensional PIM. According to the when clause, this Rolls-upTo association is connected to a Base class which is a terminal dimension level (i.e., it represents the finest aggregation level in the hierarchy). A LevelBasedHierarchy is then created in the PSM together with other elements (see Fig. 12) as the corresponding metadata used for data analysis. This relation is executed once the remaining conditions of the when clause have been satisfied. Once the relation has been executed, the conditions of the where clause are checked. A set of QVT relations is then executed to match several kinds of attributes and also to match the remaining hierarchy levels, if necessary.

Example 4.11. Standard∈RollsUpTo↦¹² Standard∈LevelBasedHierarchy; Standard∈RollsUpTo↦¹² CountryHLA∈HierarchyLevelAssociation where Country∈(type(memberEnd(Standard))pBase).

Fig. 12. UpperRollsUpTo2LevelBasedHierarchy relation.

Fig. 13. RollsUpTo2HierarchyLevelAssociation relation.


Relation RollsUpTo2HierarchyLevelAssociation (see Fig. 13) checks this last issue. This is outlined as follows: the source data model consists of a set of elements from the PIM that represent a Rolls-upTo association between two Base classes: one at the end of the role r (rb) and the other at the end of the role d (db). This set of elements must be matched with the following elements according to the CWM OLAP metamodel: a LevelBasedHierarchy which owns a UniqueKey and is related to a HierarchyLevelAssociation dhla. This dhla is related to a Level.

Example 4.12. Standard∈RollsUpTo↦¹³ StateHLA∈HierarchyLevelAssociation where State∈(type(memberEnd(Standard))pBase).
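The ordering of hierarchy level associations described in this subsection can be sketched in Python by following the Rolls-upTo chain from the terminal level. The function name and the dictionary encoding of the associations are hypothetical; the level and HLA names follow Example 4.10 and Fig. 7.

```python
from typing import Dict, List, Optional


def hierarchy_level_associations(terminal_level: str,
                                 rolls_up_to: Dict[str, str]) -> List[str]:
    """Return the HLA names of a hierarchy, sorted from the finest level upwards."""
    ordered: List[str] = []
    level: Optional[str] = terminal_level
    # Stop at the coarsest level, or if a malformed (cyclic) path is revisited.
    while level is not None and f"{level}HLA" not in ordered:
        ordered.append(f"{level}HLA")
        level = rolls_up_to.get(level)
    return ordered


# Aggregation path of the RegionAlternate hierarchy of the Customer dimension.
region_alternate = hierarchy_level_associations(
    "CustomerData", {"CustomerData": "Region", "Region": "State"})
```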

4.5. Mapping dimension attributes into level-based attributes

Dimension-related attributes Ac∪Ic that are involved in the previous mappings also require an explicit mapping into their level-based metadata counterparts Ao∪Io. Specifically, a dimension attribute ac is mapped into a CWM attribute ao for the corresponding metadata container, i.e., levels, hierarchies, and so on. In addition, a level identifier ic is mapped into an io within the unique key of the related container in order to identify its entries. Since these mappings are spread over many others, Table 2 shows an example that lists the mappings for each of the CWM data types involved in the Customer dimension. Finally, with regard to data-type mappings, each conceptual type is directly mapped into its CWM counterpart (note that these are omitted in Fig. 4 for the sake of clarity).

Table 2. Mapping the Name attribute of the Region and State levels of the Customer dimension.

Logical OLAP metadata                                 Conceptual
Attribute    Container         CWM Type               Level
RegionName   Customer          Dimension              Region
StateName    Customer          Dimension              State
RegionName   RegionAlternate   LevelBasedHierarchy    Region
StateName    RegionAlternate   LevelBasedHierarchy    State
Name         RegionHLA         HierarchyLevelAssoc.   Region
Name         StateHLA          HierarchyLevelAssoc.   State
Name         Region            Level                  Region
Name         State             Level                  State

In short, the designed mappings can be used to obtain the end-user OLAP metadata to query the database schema. This database schema may be derived by applying the model mappings presented in [13], which are also supported by CWM and aligned with MDA.

It is worth noting that OLAP metadata must refer to the (relational) database schema, e.g., in order to identify which columns contain measures. However, for the sake of simplicity, the solution presented here assumes an implicit mapping between both metadata based on the matching of element names. For instance, if there is OLAP metadata for an Auto-sale cube in CWM, it is assumed that there is also an Auto-sale table in which the former is deployed. Interestingly, this is easy to assume, since both kinds of metadata are generated from the same conceptual multidimensional model.
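The implicit name-based mapping can be sketched in Python as a simple lookup. The `match_by_name` function, the case-insensitive comparison, and the `Promotion` element (added only to show an unmatched case) are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def match_by_name(olap_elements: Iterable[str],
                  tables: Iterable[str]) -> Dict[str, Optional[str]]:
    """Link each OLAP element to the table with the same name, if there is one."""
    by_name = {table.casefold(): table for table in tables}
    return {element: by_name.get(element.casefold()) for element in olap_elements}


links = match_by_name(["Autosale", "Customer", "Promotion"],
                      ["autosale", "customer", "dealership"])
# Elements without a table of the same name would need an explicit mapping.
unmatched = [element for element, table in links.items() if table is None]
```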

5. Obtaining code for OLAP metadata

CWM plays the role of an intermediate language for the purposes of interchange. It facilitates reuse, acting as a metadata hub in the modelling architecture. However, some tools are not yet fully CWM compliant (adding, removing or re-interpreting metaclasses). An additional customisation step should be carried out to deal with these. Since these variations are usually codified in text files (rather than UML-like models), additional languages with which to derive the final code are needed. MDA proposes Mof2Text [14] for these cases.

Mof2Text is a template language used to define mappings from MOF models, such as UML or CWM, into code.⁵ Mof2Text models the target code as a template: a textual specification parameterised by the data of the source metamodel. Thus, a Mof2Text template is related to certain source metaclasses by which the target code is parameterised. We have specifically selected Mondrian code to illustrate the whole metadata derivation process. Mondrian is an open-source OLAP server which is part of the Pentaho Business Intelligence Suite.⁶ It is therefore a suitable candidate to exemplify a mapping between standard vs. specific OLAP metadata.

⁵ MOF is the metamodel used to define all OMG modelling languages.
⁶ http://www.pentaho.com (July 2011).

The following is an example of a Mof2Text template, which codifies the (CWM-to-Mondrian) Level2Level mapping:

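cwm.Level::toLevel() {
  // Identifying attribute of the level: the one that takes part in a unique key
  var a : cwm.Attribute = self.feature
    ->select(f : cwm.Attribute | not f.uniqueKey.isEmpty()).first();
  // Remaining (non-key) attributes, rendered as Mondrian level properties
  var asp : List = self.feature
    ->select(f : cwm.Attribute | f.uniqueKey.isEmpty());
  tab(3) '<Level name="' a.name '" column="' a.name '" uniqueMembers="true"'
  if (asp.isEmpty()) { '/>' newline(1) }
  else { '>' newline(1) }
  asp->forEach(p : cwm.Attribute) { p.toProperty() }
  // Close the Level element when it has nested properties
  if (not asp.isEmpty()) { tab(3) '</Level>' newline(1) }
}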

This mapping selects the identifying attribute of the source level, i.e., the CWM counterpart of the conceptual descriptor, and promotes it to a Mondrian level. This is because Mondrian levels are conceived as descriptors, whereas CWM levels are obviated and their remaining attributes are rendered as Mondrian level properties (by means of the Attribute2Property template).
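For readers unfamiliar with Mof2Text, the same Level2Level idea can be sketched in Python with the standard `xml.etree.ElementTree` module. This is not the authors' template: the `level_to_mondrian` function and the tuple encoding of CWM features are hypothetical, and rendering each remaining attribute as a `Property` element with `name` and `column` attributes is an assumption about what the Attribute2Property template emits.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def level_to_mondrian(features: List[Tuple[str, bool]]) -> ET.Element:
    """features: (attribute name, takes part in a unique key) pairs of a CWM level."""
    # The identifying attribute is promoted to the Mondrian level itself.
    key = next(name for name, in_key in features if in_key)
    level = ET.Element("Level", {"name": key, "column": key, "uniqueMembers": "true"})
    # Every remaining attribute becomes a property nested in the level.
    for name, in_key in features:
        if not in_key:
            ET.SubElement(level, "Property", {"name": name, "column": name})
    return level


# City level of the Customer dimension: Name is its key, Population a regular attribute.
city = level_to_mondrian([("Name", True), ("Population", False)])
xml_text = ET.tostring(city, encoding="unicode")
```

Building the element tree instead of concatenating strings means that the element is always closed correctly, whether or not it has nested properties.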

The entire mapping is decomposed into the templates shown in Fig. 14, in which their dependencies are also represented.⁷ Note how simple this mapping is (in contrast to that of Fig. 9): the dependency graph is actually quite a simple tree in which the templates are almost one-to-one mappings (e.g., CWM measures to Mondrian measures), owing to the inherent complexity of CWM metadata. Since CWM abstracts the best practices that almost any OLAP model contains, CWM explicitly renders every OLAP concept in order to cope with this diversity. For example, whereas conceptual models represent descriptors in their own aggregation level, CWM metadata store them as attributes of levels along with dimensions and aggregation levels, with the consequent increase in overhead.

6. Development environment

The proposed model-transformation architecture has been implemented in the Eclipse development platform.⁸ Eclipse currently has a large open-source community, which supports many different projects related to software development. What began as a Java integrated development environment now contains over 60 projects organised across many development technologies. What also makes Eclipse powerful is its capability to be easily extended, since it was modularly designed around the notion of a plug-in, which has been thoughtfully applied to decompose the main components of the Eclipse development environment.

Many projects have been added to the Eclipse core in order to support model-driven engineering, and particularly, MDA standards: e.g., the model-development tools (MDT) codifying UML, the Eclipse Modelling Framework (EMF) codifying CWM along with any other MOF-based metamodel, mediniQVT codifying declarative QVT relations,⁹ or MOFScript codifying Mof2Text templates, just to mention a few widely-used examples. A prototype for the design of OLAP metadata according to MDA may therefore be supported by the Eclipse platform through these projects.

Our prototype provides a specific development project for data-warehouse design (including database design [13]). This project guides designers through each design phase, in which a set of model mappings is incrementally applied in order to eventually obtain the platform-specific metadata.

⁷ Its implementation is available at http://lib.jesuspardillo.com/2011/csi/MD2Mondrian.m2t (July 2011).
⁸ http://www.eclipse.org (July 2011).
⁹ http://projects.ikv.de/qvt (July 2011).

Eclipse provides designers with the corresponding editors for both UML-based and CWM models. The MD@UML editor has been designed to render the intended notation on UML diagrams. These are supported by specific palette tools which provide designers with multidimensional modelling elements to be instantiated. The CWM editor has been designed by codifying the CWM metamodel with EMF, which results in a versatile CWM editor that EMF can automatically derive from the metamodel specifications. The model transformations have been included as menu options that can be invoked whenever a model is ready to be translated. There is thus one transformation for each set of mappings designed, whether the target schema be a database or an OLAP schema.

To return to our running example, Fig. 15 shows two diagrams of the conceptual model of the example scenario (represented in Fig. 4) in Eclipse. The procedure consists of configuring a data-warehousing project (left-hand side) with one folder for each MDA layer, namely CIM, PIM, PSM, and code. Specifically, this figure shows that the PIM is divided into several views for dimensions (auto, customer, etc.) and facts (auto-sale). Eclipse codifies them into two kinds of metadata: diagrams (marked with the extension md) and abstract syntax (UML). Complex models are therefore easy to manage in large data-warehousing projects.

Fig. 16 shows the specification of the transformations involved (see Section 3), namely, PIM-to-PSM (centre) and PSM-to-code (right-hand side). These specifications serve as input for the corresponding transformation engines (mediniQVT for the first, and MOFScript for the second), which also require the source models in order to derive the metadata. It is worth noting that the output of the QVT engine is also a model of the transformation trace (shown in the folder engine). This is useful because it supports differential code derivation by transforming only the changes in source models rather than entire models.

Finally, Fig. 17 presents the end-user OLAP metadata obtained by applying the previous mappings in order to transform the models involved. Both kinds of metadata (for CWM and Mondrian) are codified in XML. Whereas CWM models (stored in the PSM folder, shown in the centre) are accessed by the built-in tree editor (as Fig. 17 shows), Mondrian metadata (stored in code, shown on the right-hand side) is directly rendered as text files. Nevertheless, since they are designed to be managed by OLAP tools, the utility of their actual codification is limited to the marginal cases in which direct access may be necessary.

7. Conclusions

Nowadays, OLAP metadata is not correctly integrated into the data-warehouse development process, even though both end-user and database metadata should be managed simultaneously in OLAP systems. Moreover, whenever this occurs, the design method is focused on a vendor-specific solution. This involves tedious tasks to deal with metadata integration and technology heterogeneity, both of which increase development costs.

To overcome this situation, data-warehouse developers should be allowed to focus on the high-level description of the system rather than low-level and tool-dependent details. In this article the model-transformation architecture presented in [13] has been extended in order to consider the automatic generation of end-user OLAP metadata in a manner which is integrated with the underlying database schema. The mappings involved are developed and aligned with MDA by using CWM as a standard for the metadata interchange for data warehousing. OLAP metadata can therefore be designed in an automatic, integrated, and vendor-neutral manner. As a proof of concept, our research has been implemented in an open-source development platform, thus demonstrating the feasibility of the automatic end-user metadata derivation for OLAP tools from conceptual multidimensional models.

Fig. 14. Dependency graph of Mof2Text templates from CWM to Mondrian. Templates shown: OLAPSchema2Schema, Cube2Cube, OLAPDimension2Dimension, OLAPDimension2DimensionUsage, LevelBasedHierarchy2Hierarchy, Measure2Measure, Level2Level, Attribute2Property.

Various research lines have arisen from this work. Importantly, OLAP metadata can be made more complex by extending the solution to (i) emerging business-intelligence architectures such as those related to spatial or real-time data warehouses [18], and (ii) personalised OLAP databases in which end-user metadata must be adapted to contextual factors. In addition to metadata for OLAP, other data-analysis capabilities may also benefit from similar solutions to those presented here. For example, mapping conceptual data-mining models [21] into metadata related to data-mining tools may be equally helpful for advanced decision makers.

Fig. 15. Conceptual multidimensional modelling within Eclipse.

Fig. 16. Specification of the QVT and Mof2Text relations in mediniQVT and MOFScript.

Fig. 17. Derived end-user OLAP metadata for both CWM and Mondrian within Eclipse.

Appendix A. List of acronyms

CIM       Computation-independent model
CWM       Common warehouse metamodel
EMF       Eclipse Modelling Framework
ETL       Extract, transform, and load (process)
MD        Multidimensional modelling
MDA       Model-driven architecture
MDT       Model-development tools
MD@UML    Multidimensional-modelling UML profile
MOF       Meta-object facility metamodel
Mof2Text  MOF models to text transformation language
OCL       Object-constraint language
OLAP      On-line analytical processing
OMG       Object-management group
PIM       Platform-independent model
PSM       Platform-specific model
QVT       Query/view/transformation (language)
UML       Unified modelling language
XML       Extensible markup language
YAM2      Yet another multidimensional metamodel

Acknowledgements

The work of Jesús Pardillo is funded by the Spanish Ministry of Education and Science under FPU grant AP2006-00332. This research has also been supported by the Spanish projects ESPIA (TIN2007-67078) from the Spanish Ministry of Education and Science, DEMETER (GVPRE/2008/063) from the Valencia Government, and QUASIMODO (PAC08-0157-0668) from the Castilla-La Mancha Ministry of Education and Science. Thanks also to Juan Trujillo for his comments on the draft.

References

[1] A. Abelló, J. Samos, F. Saltor, YAM2: a multidimensional conceptual model extending UML, Information Systems 31 (6) (2006) 541–567.

[2] S. Chaudhuri, U. Dayal, An overview of data warehousing and OLAP technology, SIGMOD Record 26 (1) (1997) 65–74.

[3] W.A. Giovinazzo, Object-Oriented Data Warehouse Design: Building A Star Schema, Prentice Hall, 2000.

[4] M. Golfarelli, D. Maio, S. Rizzi, The dimensional fact model: a conceptual model for data warehouses, International Journal of Cooperative Information Systems 7 (2–3) (1998) 215–247.

[5] K. Hahn, C. Sapia, M. Blaschka, Automatically Generating OLAP Schemata from Conceptual Graphical Models, DOLAP, 2000, pp. 9–16.

[6] B. Hüsemann, J. Lechtenbörger, G. Vossen, Conceptual Data Warehouse Modeling, DMDW, 2000, p. 6.

[7] W.H. Inmon, Building the Data Warehouse, Wiley, 2005.

[8] M. Jarke, M. Lenzerini, Y. Vassiliou, P. Vassiliadis, Fundamentals of Data Warehouses, Springer, 2000.

[9] R. Kimball, M. Ross, The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, Wiley, 2002.

[10] S. Luján-Mora, J. Trujillo, I.-Y. Song, A UML profile for multidimensional modeling in data warehouses, Data & Knowledge Engineering 59 (3) (2006) 725–769.

[11] J.-N. Mazón, A Model-Driven Approach for the Multidimensional Design of Data Warehouses. PhD thesis, University of Alicante, 2008.

[12] J.-N. Mazón, J. Pardillo, J. Trujillo, A Model-Driven Goal-Oriented Requirement Engineering Approach for Data Warehouses, ER Workshops, 2007, pp. 255–264.

[13] J.-N. Mazón, J. Trujillo, An MDA approach for the development of data warehouses, Decision Support Systems 45 (1) (2008) 41–58.

[14] Object Management Group (OMG), Catalogue of Specifications, http://www.omg.org/technology/documents/spec_catalog.htm, April, 2009.

[15] J. Pardillo, J.-N. Mazón, J. Trujillo, Model-Driven Metadata for OLAP Cubes from the Conceptual Modelling of Data Warehouses, DaWaK, 2008, pp. 13–22.

[16] T.B. Pedersen, How is BI Used in Industry?: Report from a Knowledge Exchange Network, DaWaK, 2004, pp. 179–188.

[17] N. Prat, J. Akoka, I. Comyn-Wattiau, A UML-based data warehouse design method, Decision Support Systems 42 (3) (2006) 1449–1473.

[18] S. Rizzi, A. Abelló, J. Lechtenbörger, J. Trujillo, Research in Data Warehouse Modeling and Design: Dead or Alive? DOLAP, 2006, pp. 3–10.

[19] A. Sen, A. Sinha, Toward developing data warehousing process standards: an ontology-based review of existing methodologies, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 37 (1) (2007) 17–31.

[20] D. Tsichritzis, A.C. Klug, The ANSI/X3/SPARC DBMS framework report of the study group on database management systems, Information Systems 3 (3) (1978) 173–191.

[21] J. Zubcoff, J. Pardillo, J. Trujillo, A UML profile for the conceptual modelling of data-mining with time-series in data warehouses, Information and Software Technology 51 (6) (2009) 977–992.

Jesús Pardillo graduated with honours in Computer Science in 2006 and obtained his Ph.D. in 2010 from the University of Alicante (Spain). He has been involved in applied research in the fields of conceptual modelling, software visualisation, and code generation. His current research interests revolve around the foundations of mathematics and the application of formal methods to software development. You can follow him at www.jesuspardillo.com.

Jose-Norberto Mazón obtained his Ph.D. in Computer Science from the University of Alicante (Spain). He has published several papers about data warehouses in national and international workshops and conferences (such as DAWAK, ER, DOLAP, BNCOD, JISBD, and so on) and in several journals such as Decision Support Systems (DSS) or Data and Knowledge Engineering (DKE). His research interests are: business intelligence, design of data warehouses, multidimensional databases, and model driven development.