the metadata layer asroni ver. 01...
Part IV Business Intelligence Applications
In This Part
Chapter 12: The Metadata Layer
Chapter 13: Using the Pentaho Reporting Tools
Chapter 14: Scheduling, Subscription, and Bursting
Chapter 15: OLAP Solutions Using Pentaho Analysis Services
Chapter 16: Data Mining with Weka
Chapter 17: Building Dashboards
Chapter 12 The Metadata Layer
A. Metadata Overview
◦ What Is Metadata?
◦ The Advantages of the Metadata Layer
Using Metadata to Make a More User-Friendly Interface
Adding Flexibility and Schema Independence
Refining Access Privileges
Handling Localization
Enforcing Consistent Formatting and Behavior
◦ Scope and Usage of the Metadata Layer
B. Pentaho Metadata Features
◦ Database and Query Abstraction
Report Definition: A Business User’s Point of View
Report Implementation: A SQL Developer’s Point of View
Mechanics of Abstraction: The Metadata Layer
◦ Properties, Concepts, and Inheritance in the Metadata Layer
Properties
Concepts
Inheritance
Localization of Properties
C. Creation and Maintenance of Metadata
◦ The Pentaho Metadata Editor
◦ The Metadata Repository
◦ Metadata Domains
◦ The Sublayers of the Metadata Layer
The Physical Layer
The Logical Layer
The Delivery Layer
◦ Deploying and Using Metadata
Exporting and Importing XMI files
Publishing the Metadata to the Server
Refreshing the Metadata
D. Summary
The Metadata Layer
Many of the topics related to business intelligence, such as data integration and data warehousing, can be understood as solutions to problems concerning abstraction, accessibility, and delivery of data.
In previous chapters, you learned that the data warehouse provides a substantial degree of abstraction from the raw data accumulated in various data sources.
Although establishing a data warehouse solves some of the data abstraction and accessibility issues, it is still not ideal for delivering data to reporting tools.
A. Metadata Overview
In this first section, we explain briefly
what kinds of things we are talking about
when we use the term “metadata,” and
what problems it solves.
Later in this chapter, we take a closer
look at using Pentaho metadata.
What Is Metadata?
The term metadata is a bit overused. In a
general sense, it means “data about data.”
Depending upon the context, there are a lot
of different things to say “about” data, and
technically this all qualifies as metadata.
For example, most RDBMSes support listing
all available databases and schema objects.
This is a typical example of metadata as it
describes the available types and forms of
data stored in the database.
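As a small illustration of this kind of catalog metadata, the sketch below lists the tables and columns of a database. It uses Python's built-in sqlite3 module rather than any Pentaho component, and the dim_website table is a hypothetical example created only for the demonstration.

```python
import sqlite3


def list_schema_objects(conn):
    """Return {table_name: [(column_name, declared_type), ...]} from SQLite's catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


# Hypothetical table, created only so there is something to describe.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_website (website_key INTEGER PRIMARY KEY, website_title TEXT)"
)

schema = list_schema_objects(conn)
print(schema)
```

Other RDBMSes expose the same kind of information through their own catalogs, so the queries would differ while the idea stays the same.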
The Advantages of the Metadata Layer
As mentioned earlier, the data warehouse
does not solve all problems in delivering
the data to reporting tools.
In this section, we take a closer look at
these unresolved problems and show
how a metadata layer can help to solve
them.
Using Metadata to Make a More User-Friendly Interface
From the standpoint of reporting and visualization tools, the data warehouse is “just” a relational database.
Using it still requires considerable knowledge of and experience with the database query language (which is usually some dialect of the Structured Query Language, SQL).
In most cases, this puts report design out of reach of the typical business user.
The Pentaho metadata layer can alleviate this problem to some extent.
Scope and Usage of the Metadata Layer
The following list offers a brief overview of how Pentaho uses the metadata layer in practice. These points are illustrated in Figure 1.
Metadata input from the database, as well as user-defined metadata, is defined using the Pentaho Metadata Editor (PME) and stored in the metadata repository.
Metadata can be exported from the repository and stored in the form of .xmi files, or in a database. The metadata is associated with a Pentaho solution on the Pentaho server, where it can be used as a resource for metadata-based reporting services.
Using the Pentaho report design tools, end users can create reports on the metadata. This allows reports to be built without knowledge of the physical details of the underlying database, and without any knowledge of SQL. Instead, the report contains a high-level specification of the query result, which is defined using a graphical user interface.
When running reports based on Pentaho metadata, the reporting engine interprets the report. Query specifications are stored in the report in a format called Metadata Query Language (MQL), which is resolved against the metadata. At this point, the corresponding SQL is generated and sent to the database. Beyond this point, report processing is quite similar to “normal” SQL-based reporting. The database responds to the query by sending a data result, which is rendered as report output.
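The resolution step can be pictured with a toy sketch: a list of business item names is looked up in a metadata dictionary and turned into SQL. This is not the actual MQL format, nor Pentaho's SQL generator. The dictionary layout, the generate_sql function, and all table and column names except customer_order_id are assumptions made for illustration.

```python
# Hypothetical metadata: business item -> (table, SQL expression, is_aggregate).
BUSINESS_COLUMNS = {
    "website_title": ("dim_website", "dim_website.website_title", False),
    "month_number": ("dim_date", "dim_date.month_number", False),
    "order_count": (
        "fact_order",
        "COUNT(DISTINCT fact_order.customer_order_id)",
        True,
    ),
}

# Hypothetical relationships: how each dimension table joins to the fact table.
FACT_TABLE = "fact_order"
JOINS = {
    "dim_website": "fact_order.website_key = dim_website.website_key",
    "dim_date": "fact_order.date_key = dim_date.date_key",
}


def generate_sql(selected_items):
    """Resolve a list of business item names into one SQL statement."""
    select_parts, group_parts, joined_tables = [], [], []
    has_aggregate = False
    for item in selected_items:
        table, expression, is_aggregate = BUSINESS_COLUMNS[item]
        select_parts.append(f"{expression} AS {item}")
        if is_aggregate:
            has_aggregate = True
        else:
            group_parts.append(expression)
        if table != FACT_TABLE and table not in joined_tables:
            joined_tables.append(table)

    sql = "SELECT " + ", ".join(select_parts) + f" FROM {FACT_TABLE}"
    for table in joined_tables:
        sql += f" JOIN {table} ON {JOINS[table]}"
    # Non-aggregated items become grouping columns when an aggregate is selected.
    if has_aggregate and group_parts:
        sql += " GROUP BY " + ", ".join(group_parts)
    return sql


sql = generate_sql(["website_title", "order_count"])
print(sql)
```

The point of the sketch is the division of labor: the user only names items, while joins and aggregation come from definitions made once in the metadata.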
Figure 1: High-level overview of the scope and usage of Pentaho Metadata
B. Pentaho Metadata Features
In this section, we briefly describe the key
features of the Pentaho metadata layer.
B.1. Database and Query Abstraction
The Pentaho metadata layer can contain many distinct types of structural components, and it is easy to lose track of the big picture. Therefore, we will first examine the metadata layer at a high level before diving into the details.
Report Definition: A Business User’s Point of View
◦ Consider the requirements of a typical business user—say, the manager of Sales and Rentals at World Class Movies.
Report Implementation: A SQL Developer’s Point of View
◦ Suppose you want to retrieve the same report data directly from the World Class Movies data warehouse using SQL. If you have the SQL skills, it certainly isn’t hard (although it may be somewhat tedious). Even so, this section walks you through the process step by step to illustrate a few concepts about Pentaho metadata.
Figure 2: Deriving report items from joined tables
Listing 1: The SQL statement to retrieve the number of orders, grouped by website title and month
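A minimal sketch of a query of the kind Listing 1 describes, run against an in-memory SQLite database from Python. The table and column names (fact_order, dim_website, dim_date, website_key, date_key, year4, month_number) and the sample rows are assumptions made for illustration. Only customer_order_id appears in the text, and the real World Class Movies schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_website (website_key INTEGER PRIMARY KEY, website_title TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year4 INTEGER, month_number INTEGER);
    -- One row per order line, so a single order can appear more than once.
    CREATE TABLE fact_order (customer_order_id INTEGER, website_key INTEGER, date_key INTEGER);
    """
)
conn.executemany("INSERT INTO dim_website VALUES (?, ?)", [(1, "Site A"), (2, "Site B")])
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20080101, 2008, 1), (20080115, 2008, 1), (20080201, 2008, 2)],
)
conn.executemany(
    "INSERT INTO fact_order VALUES (?, ?, ?)",
    [
        (100, 1, 20080101),
        (100, 1, 20080101),  # second line of the same order
        (101, 1, 20080115),
        (102, 2, 20080115),
        (103, 1, 20080201),
    ],
)

ORDERS_BY_WEBSITE_AND_MONTH = """
    SELECT w.website_title,
           d.year4,
           d.month_number,
           COUNT(DISTINCT f.customer_order_id) AS order_count
    FROM fact_order AS f
    JOIN dim_website AS w ON f.website_key = w.website_key
    JOIN dim_date AS d ON f.date_key = d.date_key
    GROUP BY w.website_title, d.year4, d.month_number
    ORDER BY w.website_title, d.year4, d.month_number
"""

rows = conn.execute(ORDERS_BY_WEBSITE_AND_MONTH).fetchall()
for row in rows:
    print(row)
```

COUNT(DISTINCT ...) matters here because the sample fact table holds one row per order line: order 100 has two lines but should be counted once.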
Mechanics of Abstraction: The Metadata Layer
◦ It is likely that the steps you just went through are beyond the technical skills of most business users, and certainly beyond their job descriptions.
◦ But what if the details of joining the tables had been taken care of in advance?
◦ What if you present users with just one set of items from which they can pick whatever they happen to find interesting?
◦ What if the customer_order_id item was designed to directly represent the COUNT DISTINCT operation?
B.2 Properties, Concepts, and Inheritance in the Metadata Layer
In this section, we discuss concepts and properties, which are fundamental building
blocks of the Pentaho metadata layer. In addition, we describe how concepts can
inherit properties from one another.
◦ Properties
Objects in the metadata layer can have a number of properties.
◦ Concepts
In the context of Pentaho metadata, a concept is a collection of properties
that can be applied as a whole to a metadata object.
◦ Inheritance
Properties can be managed using a feature called inheritance.
◦ Localization of Properties
General properties such as name and description can be localized so they
can be displayed in multiple languages.
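The relationship between properties, concepts, inheritance, and localization can be sketched in a few lines of Python. This is a conceptual model only, not Pentaho's implementation. The Concept class, the localized helper, and the property names and values (font, alignment, mask, the locale codes) are all hypothetical.

```python
class Concept:
    """A named collection of properties that may inherit from a parent concept."""

    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.properties = dict(properties)

    def get(self, key):
        """Look the property up locally first, then walk up the parent chain."""
        concept = self
        while concept is not None:
            if key in concept.properties:
                return concept.properties[key]
            concept = concept.parent
        raise KeyError(key)


def localized(value_by_locale, locale, default_locale="en_US"):
    """Pick the text for a locale, falling back to the default locale."""
    return value_by_locale.get(locale, value_by_locale[default_locale])


# Hypothetical concepts: "Number" overrides alignment and adds a format mask,
# but inherits the font from "Base".
base = Concept("Base", font="Arial", alignment="left")
number = Concept("Number", parent=base, alignment="right", mask="#,##0")

# A localized name property for some metadata object (sample values).
order_count_name = {"en_US": "Order count", "nl_NL": "Aantal orders"}

print(number.get("font"), number.get("alignment"), number.get("mask"))
print(localized(order_count_name, "nl_NL"), localized(order_count_name, "fr_FR"))
```

Changing the font on the base concept would change it for every concept that does not override it, which is the maintenance benefit inheritance is meant to give.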
C. Creation and Maintenance of Metadata
This section briefly explains the components that make
up the metadata layer as well as the relationships that
connect them. In the remainder of this chapter, we
describe these components in more detail, and explain
how to create them using the Pentaho Metadata Editor.
C.1 The Pentaho Metadata Editor
Pentaho offers the Pentaho Metadata Editor
to create and edit metadata. You can
download this tool from the Pentaho
project page at sourceforge.net.
C.2 The Metadata Repository
Pentaho metadata is stored in its own repository, which is distinct from both the Pentaho solution repository and the Pentaho data integration repository. Currently, the Pentaho Metadata Editor is the only application that is intended to edit the contents of the metadata repository.
By default, the PME uses binary files for storing metadata. These files, called mdr.btx and mdr.btd, are found in the home directory of the metadata editor.
C.3 Metadata Domains
The Pentaho metadata layer as a whole is organized into one or more metadata domains. A metadata domain is a container for a collection of metadata objects that can be used together as a source of metadata for one Pentaho solution. (In this context, we use the term “Pentaho solution” as defined in Chapter 4: a collection of resources such as reports and action sequences that reside in a single folder in the pentaho-solutions directory.)
C.4 The Sublayers of the Metadata Layer
The following sections describe the components of the physical, logical, and delivery layers that are included within the metadata layer.
The Physical Layer
◦ The objects that reside in the physical layer of a metadata domain are descriptors that correspond more or less one-to-one with database objects.
The Logical Layer
◦ The logical layer sits between the physical layer and the delivery layer.
The Delivery Layer
◦ The delivery layer contains the metadata objects that are visible to the end user, such as Business Views and Business Categories.
D. Deploying and Using Metadata
After creating the Business Model(s), you
must deploy the metadata layer before you can
use it to create reports. In this section, we
describe how to publish the metadata. In
the next chapter, you will learn how you can
actually build reports on an already
deployed metadata layer.
D.1 Exporting and Importing XMI files
You can build reports on metadata using the metadata data source. This is explained in detail in Chapter 13. To create a report based on metadata, you must tell the Report Designer where the metadata is.
The Report Designer consumes metadata in the XML Metadata Interchange (XMI) format. To create an XMI file for your metadata, choose File > Export to XMI File from the main menu. Similarly, you can use File > Import from XMI File to load existing metadata into the metadata layer.
D.2 Publishing the Metadata to the Server
If reports are to be run on the server, the metadata must be available to the server. Metadata is stored on the server side as XMI files. You can have one XMI file per Pentaho solution. This file must be called metadata.xmi.
You can export the metadata to an XMI file and then copy that file to the appropriate solution directory on the server. However, on a production server it is unlikely that every BI developer has direct access to the server’s file system. Therefore, the Pentaho BI server provides a service that allows you to publish metadata from the metadata editor.
You can publish metadata to the Pentaho BI server from the main menu by choosing File > Publish to Server. This opens the Publish To Server dialog, shown in Figure 7.
Figure 7: The Publish To Server dialog
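For the manual route described above (export, then copy), the copy step might be scripted roughly as follows. This is a sketch under stated assumptions: deploy_xmi is a hypothetical helper, the paths in the demonstration are temporary stand-ins rather than real server locations, and only the required target file name metadata.xmi comes from the text.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_xmi(exported_xmi, solution_dir):
    """Copy an exported XMI file into a solution directory as metadata.xmi."""
    solution_dir = Path(solution_dir)
    if not solution_dir.is_dir():
        raise FileNotFoundError(f"Solution directory not found: {solution_dir}")
    # The server expects exactly one file with this name per solution.
    target = solution_dir / "metadata.xmi"
    shutil.copyfile(exported_xmi, target)
    return target


# Demonstration with temporary stand-ins for the export and the solution folder.
with tempfile.TemporaryDirectory() as tmp:
    export_path = Path(tmp) / "my_export.xmi"
    export_path.write_text("<placeholder/>")  # stand-in content, not real XMI
    solution_path = Path(tmp) / "my-solution"  # hypothetical solution folder
    solution_path.mkdir()

    deployed = deploy_xmi(export_path, solution_path)
    deployed_name = deployed.name
    deployed_text = deployed.read_text()

print(deployed_name)
```

Copying the file only places it on disk. The server still has to reload the metadata before reports can use it.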
D.3 Refreshing the Metadata
After publishing or copying the XMI file
to the server, you must tell the server to
reload the metadata. This can be done
from the user console through the menu
by choosing Tools > Refresh > Reporting
Metadata, as shown in Figure 8.
Figure 8: Refreshing the metadata with the user console
Alternatively, you can refresh the
metadata using the Server
Administration Console. To refresh
the metadata from the Server
Administration Console, press the
Metadata Models button in the
Refresh BI Server panel on the
Administration tab page, shown in
Figure 9.
Figure 9: Refreshing the metadata with the Server Administration Console
Summary
This chapter introduced the Pentaho metadata layer.
The Pentaho metadata layer allows you to present your database or data warehouse in a way that is more understandable to business users.
This allows them to make reports without directly writing SQL.
The following chapter describes how you can actually use the metadata layer to build reports.