TRANSCRIPT
TopQuadrant Webcast with Malcolm Chisholm
March 18, 2015
“The Foundations of Successful
Reference Data Management”
Today’s Program
Introduction of Agenda and Speakers
I. Foundations of Successful Ref. Data Management
• What is reference data and why is it important?
• Challenges of reference data management
• What are some best practices for governance and management
• What capabilities should you look for in a reference data solution?
II. Introducing TopQuadrant’s new offering for RDM
• TopBraid Reference Data Manager
• Designed to address core capabilities identified by Malcolm to support a modern RDM solution
• Short walkthrough of tasks that support the life-cycle of a reference dataset
III. Questions and Answers
Robert Coyne
Malcolm Chisholm
Bob DuCharme
Slide 2 © Copyright 2015 TopQuadrant Inc
Slide 3
TopQuadrant Company
Focus:
TopQuadrant was founded in 2001
Our focus is to harness emerging technology to build practical but innovative business applications.
Foundation:
We continue our strong commitment to standards-based approaches to data semantics
Our Mission:
Empower people—by making enterprise information meaningful
Slide 4
TopBraid Solutions
TopBraid Reference Data Manager™ supports the governance
and provisioning of reference data including the curation of
reference datasets (code-lists) with comprehensive metadata.
TopBraid Enterprise Vocabulary Net™ supports collaborative
management of enterprise metadata, business glossaries and
taxonomies used in search, content navigation and data integration.
TopBraid Insight™ is a semantic virtual data warehouse that
enables federated querying of data across diverse data sources
as if they were in one place.
Reference Data in Context
[Figure: six layers of data: Metadata, Reference Data, Transaction Structure Data, Enterprise Structure Data, Transaction Activity Data, Transaction Audit Data. One axis is labeled "Increasing: Volume of Data, Population Later in Time, Shorter Life Span"; the other is labeled "Increasing: Per Value Data Quality Importance, Semantic Content". Further labels: Most Relevant To Design, Most Relevant To Outside World, Most Relevant To Business, Most Relevant To Technology.]
© AskGet.com Inc., 2015. All rights reserved Slide 5
Layers of Data
• Metadata: The data that describes all aspects of an enterprise’s information assets, and enables the enterprise to effectively use and manage these assets. Here it is confined to the structure of databases. Found in a database’s system catalog. Sometimes included in database tables.
• Reference Data: Any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise. Codes and descriptions. Tables containing this data usually have just a few rows and columns.
• Transaction Structure Data: Data that represents the direct participants in a transaction, and which must be present before a transaction fires. The parties to the transactions of the enterprise. E.g. Customer, Product.
• Enterprise Structure Data: Data that permits business activity to be reported and/or analyzed by business responsibility. Typically, data that describes the structure of the enterprise. E.g. organizational or financial structure.
• Transaction Activity Data: Data that represents the operations an enterprise carries out. Traditional focus of IT – in many enterprises the only focus.
• Transaction Audit Data: Data that tracks the life cycle of individual transactions. Includes application logs, database logs, web server logs.
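As a rough illustration of the Reference Data layer, the sketch below uses a small code table to categorize transaction activity rows. The table name, the codes, and the order records are hypothetical and are not taken from the webcast.

```python
# Hypothetical reference data: a small code table (code -> description).
CUSTOMER_TYPE = {
    "RB": "Retail Bank",
    "IB": "Investment Bank",
    "AM": "Asset Manager",
}

# Hypothetical transaction activity data that carries only the code.
orders = [
    {"order_id": 1, "customer_type": "RB", "amount": 250.0},
    {"order_id": 2, "customer_type": "AM", "amount": 900.0},
    {"order_id": 3, "customer_type": "RB", "amount": 125.0},
]


def total_by_description(rows, code_table):
    """Sum amounts per category, reporting the description instead of the code."""
    totals = {}
    for row in rows:
        description = code_table[row["customer_type"]]
        totals[description] = totals.get(description, 0.0) + row["amount"]
    return totals


totals = total_by_description(orders, CUSTOMER_TYPE)
```

The code table has only a few rows, yet every report built on the orders depends on its descriptions being correct.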
Slide 6
Importance of Reference Data
• Data Quality problems have widespread impact
• Lack of understanding leads to bad business decisions
• The same table occurs in many different applications
• 20-50% of the tables in a database are Reference Data
• Unnecessary and difficult mappings for data integration
• Need it to understand the world outside the enterprise
• Need it to turn data into business information
Slide 7
Central Reference Data Management Unit
Why Specialized?
Reference Data has unique properties
E.g. It has meaning, and is added to production environments
Reference Data has unique challenges
E.g. It has to be synchronized across many applications
Reference Data has unique risks
E.g. It is often misunderstood leading to “miscodings” etc.
Why Centralized?
Need for standardization
E.g. Which Country Code will we use: GENC, ISO Alpha-2, ISO Alpha-3…
Need for one place in enterprise to deal with external authorities
E.g. Who ensures we get the NAICS updates?
Need to set up governance for internal reference data mgmt.
E.g. How are Customer Type, Product Line managed?
There are a number of reasons why enterprises should set up a central unit for RDM
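To make the standardization point concrete, here is a minimal sketch of a crosswalk that normalizes country codes to a single enterprise standard. It assumes the enterprise has chosen ISO 3166-1 alpha-3; the function name and the five-country sample are illustrative only.

```python
# Assumed sample: ISO 3166-1 alpha-2 -> alpha-3 for five countries.
ALPHA2_TO_ALPHA3 = {
    "US": "USA",
    "DE": "DEU",
    "JP": "JPN",
    "FR": "FRA",
    "GB": "GBR",
}


def to_standard(code, crosswalk=ALPHA2_TO_ALPHA3):
    """Return the alpha-3 code for an alpha-2 or alpha-3 input, else raise."""
    normalized = code.strip().upper()
    if normalized in crosswalk:
        return crosswalk[normalized]
    if normalized in crosswalk.values():
        return normalized
    # Unknown codes are surfaced rather than passed through silently.
    raise ValueError(f"Unmapped country code: {code!r}")
```

For example, `to_standard("us")` and `to_standard("USA")` both return `"USA"`, while an unmapped code raises an error that a central unit could route to a data steward.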
Slide 8
Governing and Managing External Reference Data
External World
• There are many tasks that a central Reference Data Unit (RDU) must perform for External Reference Data.
• Some of these tasks could be performed outside the RDU but all must be governed by the RDU.
[Figure: two external Authorities, each with four Ref Data Standards, shown alongside the Enterprise and its Central Reference Data Unit, whose tasks are listed as:]
• Discovery
• Authority Profile
• Source Data Profile
• Dataset Onboarding
• Subscription Management
• Periodic Reconciliation
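Periodic reconciliation can be pictured as a comparison between the enterprise copy of a code list and the authority's latest release. The sketch below is one possible shape for that comparison; the function name, the codes, and the descriptions are made up for illustration and do not come from any real standard.

```python
def reconcile(local, authority):
    """Compare the enterprise copy of a code list with the authority's release."""
    added = {c: authority[c] for c in authority.keys() - local.keys()}
    removed = {c: local[c] for c in local.keys() - authority.keys()}
    changed = {
        c: (local[c], authority[c])
        for c in local.keys() & authority.keys()
        if local[c] != authority[c]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical enterprise copy and hypothetical new release.
local_copy = {"100": "Widgets", "200": "Gadgets", "300": "Gizmos"}
new_release = {"100": "Widgets", "200": "Gadgets and Devices", "400": "Sprockets"}

report = reconcile(local_copy, new_release)
```

Here the report lists code 400 as added, code 300 as removed, and code 200 as changed, which is the kind of summary a steward would review before onboarding the new release.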
Slide 9
Governing and Managing Internal Reference Data
• Typically Internal Reference Data tables are managed poorly and have no governance
• Governance is needed to assign accountabilities and enforce standard processes that drive up quality
[Figure: the Enterprise, containing the Central Reference Data Unit and six Reference Data Tables, each paired with its Internal Reference Data Producers. Labels: GOVERNANCE, MANAGEMENT.]
Slide 10
Governing Reference Data in Operational Environments
• Producers of Internal Reference Data may be well governed, but both Internal and External Reference Data can be misunderstood, misused, and abused in operational environments
• This impacts downstream use and data integrity across the enterprise
• Governance is required
[Figure: Example: Customer Type. Within the Enterprise, a Reference Data Table from Internal Reference Data Producers is used in OPERATIONS by Accts Recvbl, Order Entry, Sales, Data Warehouse, and Treasury. The Central Reference Data Unit is shown with two "Govern" labels. Comments from operational users:]
• “Why isn’t Corporate Customer in this table?”
• “I think Goldman Sachs is a Retail Bank…”
• “I’ll use the code for Asset Manager to book Private Equity”
• “Hmm…no code for Hedge Fund – I’ll put one in”
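One small part of governing operational use is refusing codes that are not in the governed table, as in the Hedge Fund comment in the Customer Type example. The sketch below assumes a governed code list and some bookings; all codes, customer names, and the function name are hypothetical.

```python
# Hypothetical governed code list for Customer Type.
GOVERNED_CUSTOMER_TYPES = {
    "RB": "Retail Bank",
    "AM": "Asset Manager",
    "PE": "Private Equity",
}


def check_bookings(bookings, governed):
    """Split bookings into accepted ones and ones using ungoverned codes."""
    accepted, rejected = [], []
    for booking in bookings:
        target = accepted if booking["customer_type"] in governed else rejected
        target.append(booking)
    return accepted, rejected


bookings = [
    {"customer": "Acme Capital", "customer_type": "PE"},
    {"customer": "Example Partners", "customer_type": "HF"},  # ad hoc code
]

accepted, rejected = check_bookings(bookings, GOVERNED_CUSTOMER_TYPES)
```

A check like this catches invented codes, but it cannot catch a valid code used with the wrong meaning (for example booking Private Equity under Asset Manager). That kind of misuse still needs definitions, training, and review.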
Slide 11
Traditional Reference Data Hub Approach
[Figure: a REF DATA HUB with the functions Cleanse, Integrate, and Publish, plus a Steward. TRANSACTION APPLICATIONS (App 1, App 2, App 3, App ...) are labeled Produce Master Data and Other Functionality. OTHER CONSUMING APPLICATIONS (App A, App B, App ...) are also shown.]
• This Hub is for both the production and distribution of Reference Data
• Data Stewards typically produce some data in the Hub
• The Hub may source some (usually most) Reference Data from legacy applications (typically transaction applications)
Slide 13
Farm And Market Approach to Reference Data
[Figure: REFERENCE DATA PRODUCTION (“FARMS”) consists of Ref. Data Entity 1 App, Ref. Data Entity 2 App, Ref. Data Entity 3 App, through Ref. Data Entity N App. The REF. DATA HUB (“MARKET”) has the function Publish. TRANSACTION APPLICATIONS (App 1, App 2, App 3, App ...; Other Functionality) and OTHER CONSUMING APPLICATIONS (App A, App B, App ...) are also shown. Annotations: Reduced Cleansing, Reduced Integration.]
• Production of Reference Data is done in specialized environments.
• Only production-ready Reference Data is placed in the Hub.
• All other environments subscribe to Reference Data from the Hub.
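The three points above can be sketched as a tiny publish/subscribe hub. This is only an illustration of the idea, not a description of any product: the class, its method names, and the `production_ready` flag are assumptions made for the example.

```python
class ReferenceDataHub:
    """Hypothetical 'market': holds production-ready datasets, notifies subscribers."""

    def __init__(self):
        self._datasets = {}
        self._subscribers = {}

    def subscribe(self, dataset_name, callback):
        """Register a consuming application's callback for one dataset."""
        self._subscribers.setdefault(dataset_name, []).append(callback)

    def publish(self, dataset_name, rows, production_ready):
        """Accept a dataset from a 'farm' only if it is production-ready."""
        if not production_ready:
            raise ValueError(f"{dataset_name} is not production-ready")
        self._datasets[dataset_name] = dict(rows)
        for callback in self._subscribers.get(dataset_name, []):
            callback(dict(rows))  # each subscriber receives its own copy


hub = ReferenceDataHub()
order_entry_cache = {}  # stands in for a consuming application
hub.subscribe("customer_type", order_entry_cache.update)
hub.publish("customer_type", {"RB": "Retail Bank"}, production_ready=True)
```

Because the hub rejects anything that is not production-ready, cleansing and integration work stays in the producing environments rather than in the hub.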
Slide 14
Summary and Capabilities to Consider in Solutions
• There are other capabilities to consider in Reference Data solutions, but these are fundamental.
• Profile an External Authority
• Profile an External Reference Dataset
• Support Semantic Analysis of each Element in Reference Dataset
• Document Semantic Analysis
• Import Reference Data into a Repository
• Assign Accountabilities for RDM Tasks
• Track Changes to Reference Data
• Support Distribution of Reference Data
Slide 15
A flexible web-based solution for governing and provisioning reference data in the enterprise:
– Governance
– Provisioning
– Comprehensive metadata
– Enrichment
TopBraid Reference Data Manager™ (TopBraid RDM) makes it easy to bring consistency and accuracy to reference data management and use.
Slide 17
Making Reference Data Meaningful
Slide 18
TopBraid RDM enables more meaningful and effective use of reference data by capturing and managing semantic metadata about both reference data and reference datasets.
Who Benefits from RDM?
TopBraid RDM has a variety of users including:
• Data stewards whose primary responsibility is governance of reference data
• Subject matter experts who contribute to identification and development of reference data and advise the data stewards
• Data managers who consult RDM to find reference data suitable for their applications
• Application administrators and database administrators who create and use RDM exports in order to load reference data into the systems they are responsible for maintaining
• Business analysts who want to better understand the meaning of reference data as they design business reports
• End users of reference data who want to make sure that they are using correct codes and contribute knowledge that may enrich the body of information managed by RDM
Slide 19
Today’s Example
Data Steward: dataset governance
• Creating a Reference Dataset
• Importing Reference Data
• Maintaining Reference Dataset Metadata
• Modifying Reference Data
• Exporting Reference Data
• Extending Business Concept Model (Ontology) to support additional properties
(Role = Manager)
Slide 20
Collaboration features
• Change management, versioning and working copies
• Task assignment
• Security and permissions
• RACI support: Responsible, Accountable, Consulted, Informed
Slide 22
Additional features
• Data quality and validation rules
• Hierarchy management
• Crosswalks
• APIs, integration and customization
Slide 24
• Model driven flexibility for present and future needs
• Empowers data stewardship – easy maintenance – minimal IT involvement
• User friendly web-based UI
• Metadata capabilities
• Easy customization
• Governance
For more information: [email protected]
Slide 25
Questions? Want to Learn More?