april 2004 - irm roundtable - part 2 - metadata


Page 1: April 2004 - IRM Roundtable - Part 2 - Metadata

Information Resource Management (IRM) Round Table

April 7, 2004

Hosted by:

Page 2: April 2004 - IRM Roundtable - Part 2 - Metadata

Agenda

• Introductions & Agenda: 9:00 – 9:15
• Metadata Management (Gregg Wyant): 9:15 – 10:00
• Break: 10:00 – 10:15
• Metadata Management (Juanita Mercado): 10:15 – 11:00
• Metadata – The Promise vs Reality (Cass Squire): 11:00 – 11:45
• Wrap Up: 11:45 – 12:00

Page 3: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata Management

Gregg Wyant
Chief Data Architect
Intel

Page 4: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata Management

Strategy / Approach:
• Leverage progress of TQdM program
‒ The TQdM program at Intel has made data issues visible
‒ Metadata management needs to intersect the positive aspects of TQdM
• “Repeatable process, standard deliverables”
‒ Metadata cannot be managed without consistent processes and fixed deliverables

Progress To Date:
• Metadata program established and funded
• Focus on data-oriented metadata management
• First release of Enterprise Metadata Repository completed

Issues / Roadblocks:
• Culture rewards solving each problem anew
‒ This is slowly changing; pushing Reuse Awards to recognize desired behavior
‒ Creating metrics which can be used as carrot and stick
• Consensus management approach
‒ Obtaining commitment to a proposed change is extremely time-consuming

Page 5: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata is the Enabler for Reuse

Metadata must be managed to be of any value
• A repeatable process to identify and define metadata
• Standard deliverables in which to capture metadata
• Repeatable governance process to provide metadata credibility

Enterprise Architecture Framework for data explicitly defined
• Deliverables defined for conceptual, logical, and physical data models

Reuse is tracked by each project

Data Analyst Community trained on deliverables, processes, and criticality of metadata management

Enterprise Metadata Repository (EMR)
• Drives connections between process, data, apps, and tech
• Implemented in phases to connect various metadata

Page 6: April 2004 - IRM Roundtable - Part 2 - Metadata

Enabling Metadata

Page 7: April 2004 - IRM Roundtable - Part 2 - Metadata

Browse Metadata/Run a Report

Page 8: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata Program Major Phases

Phase 1: Tracking and Managing Metadata
Phase 2: Operational Metadata
Phase 3: Strategic Metadata
Phase 4: Enterprise-wide Integration and TQdM Platform
Phase 5: Unstructured Metadata
Phase 6: CMMI Integrations
Phase 7: Legacy Migration
Phase 8: Enterprise-wide Integration and TQdM Platform

[Roadmap diagram; timeline axis is labeled by quarter (Q4, Q1, Q2, Q3, Q4) and by year (2005, 2006)]

Descriptions shown on the roadmap:
• Tracking and managing metadata repository; search, presentation, change management, and metadata exchange with the certified data modeling tool
• Integrate legacy metadata into EMR
• Data monitoring, alerts, health indicators, DPM trending (integration with reporting tool)
• Business process metadata support (exchange with the certified process modeling tool)
• Single metadata repository supporting viewers (portals) for multiple audiences, EAM integration
• Standardized physical database design processes & governance for all environments (OLTP, DSS, XML, EAI, etc.)
• Ongoing deliverable and repository alignment (appears twice on the slide)

Page 9: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata, Standards and Reuse

[Diagram labels:]
• Critical Business Process models
• iPAG Standards
• Process Tool
• Centralized Process Model Repository
• Data models
• iDAG Standards
• Data Tool
• Centralized Data Model Repository
• Data Architecture Review
• Enterprise Metadata Repository
• Certified Deliverables
• Certified Data Objects
• Certified Processes based on Certified Data Objects

Page 10: April 2004 - IRM Roundtable - Part 2 - Metadata

EAF, Metadata and the Zachman Framework

Data

Process

Metadata

Portions of this page include Copyrighted material. See full disclaimer in backup.

Page 11: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata Architecture Framework Vision

• Convergence of DQ, Metadata, BAM and PAF
• Clearly defined architecture (boundaries between) for Metadata and the 3 major areas DQ, BAM and PAF
• Track operational metadata, its monitors and alerts
‒ Monitor the number of ROOs/RORs tied to applications
‒ Metadata-driven capability to enable capture and notification of ROO/ROR defects and data movement excursions
• Monitor Reporting
‒ Real time reporting done from the monitoring tool
‒ Analytics reporting is done from EDW
• Feed DQ Corporate Scorecard
‒ Single governance process
‒ Track data quality issues to the data element level
‒ Track DQ issues, measure effectiveness and drive improvements
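Tracking data quality issues at the data element level and feeding a scorecard can be read as a simple roll-up over issue records. The following is a minimal sketch under that reading, not the implementation described on the slide; the issue records, element names, and the `scorecard` function are hypothetical.

```python
from collections import Counter

# Hypothetical issue log: each record names the data element it was raised against.
issues = [
    {"element": "customer.postal_code", "status": "open"},
    {"element": "customer.postal_code", "status": "closed"},
    {"element": "order.ship_date", "status": "open"},
]

def scorecard(issue_log):
    """Count open and closed issues per data element."""
    open_counts = Counter()
    closed_counts = Counter()
    for issue in issue_log:
        target = open_counts if issue["status"] == "open" else closed_counts
        target[issue["element"]] += 1
    elements = sorted(set(open_counts) | set(closed_counts))
    # Counter returns 0 for elements it has never seen.
    return {e: {"open": open_counts[e], "closed": closed_counts[e]} for e in elements}

card = scorecard(issues)
```

Keying every issue by element is what lets the same log serve both a per-element view and an aggregated scorecard.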

Page 12: April 2004 - IRM Roundtable - Part 2 - Metadata

Information Resource Round Table

Metadata Management

Presented by: Juanita M. Mercado

Lead Data Architect

2004-April

Page 13: April 2004 - IRM Roundtable - Part 2 - Metadata


Types of Metadata

Definition

‒ Formal specification about ways to accurately and unambiguously describe information elements that are critical to the business

‒ Standard definitions for meaning, acceptable content and relationships

System Integration

‒ Formal specification on how to map equivalent data structures to support a federated operating model

‒ Supports a uniform way of accessing various data formats such as flat files and databases

Application Runtime

‒ Formal specification describing parameters for running an application executable. Includes dependency checks and runtime statistics.

Infrastructure

‒ Formal specification for describing the system environment as far as networks, firewall, hardware and software, workstations, servers, etc.

Page 14: April 2004 - IRM Roundtable - Part 2 - Metadata


Roles and Functions

Enterprise Data Architect

‒ Keeper of the formal specifications

Data Steward

‒ Keeper of the business content

Application Data Architect

‒ Applies the formal specifications and related business content to meet particular processing requirements

Page 15: April 2004 - IRM Roundtable - Part 2 - Metadata

Visa Global Business Elements (VGBE) Definition Metadata

What it is

‒ Formal specification of ways to accurately and unambiguously describe information elements that are critical to VisaNet interoperability

‒ Standard definitions for meaning, acceptable content and relationships

What it accomplishes

‒ Share metadata consistently across Visa and with IT partners

‒ Assures consistent characteristics and behavior for all implementations

‒ Extensible and flexible to support evolving business strategies

How it works

‒ A structurally stable yet dynamic document (UML) that integrates into development environments

‒ Automates inheritance of definitions, behaviors and relationships directly into application codes

Page 16: April 2004 - IRM Roundtable - Part 2 - Metadata

The Role of Definition Metadata

[Diagram: DATA MANAGEMENT, containing Definition Metadata, Data Architecture, Data Quality, and Data]

Data Quality is a measure of how the data is tracking to the definitions established by the Data Architecture.

Automation of this measurement is possible with metadata that is constructed using a formal specification.

Definitions are also kept consistent when a specification is used.
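To illustrate how a formal specification makes this measurement automatable, here is a minimal sketch. It assumes a definition reduced to storage lengths and a list of permissible values; the `ElementDefinition` class, the `conforms` and `quality_score` functions, and the sample values are hypothetical, not part of any Visa tooling.

```python
from dataclasses import dataclass, field

@dataclass
class ElementDefinition:
    name: str
    min_length: int
    max_length: int
    # An empty set means any value of valid length is acceptable.
    permissible_values: frozenset = field(default_factory=frozenset)

def conforms(defn, value):
    """Return True if a single value tracks to the definition."""
    if value is None:
        return False
    text = str(value)
    if not (defn.min_length <= len(text) <= defn.max_length):
        return False
    if defn.permissible_values and text not in defn.permissible_values:
        return False
    return True

def quality_score(defn, values):
    """Fraction of values that conform to the definition."""
    values = list(values)
    if not values:
        return 1.0
    return sum(conforms(defn, v) for v in values) / len(values)

currency = ElementDefinition("Currency Code", 3, 3, frozenset({"USD", "EUR", "JPY"}))
score = quality_score(currency, ["USD", "EUR", "usd", None])
```

Because the rule lives in the definition rather than in each program, every consumer measures quality the same way.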

Page 17: April 2004 - IRM Roundtable - Part 2 - Metadata

ISO 11179

An international standard for the specification of data elements
• Framework for the specification and standardization of data elements
• Classification for data elements
• Basic attributes of data elements
• Rules and guidelines for the formulation of data definitions
• Naming and identification principles for data elements

Page 18: April 2004 - IRM Roundtable - Part 2 - Metadata


VGBE Specification

Identifying Attributes

Name

Identifier

Version

Synonymous Name

Abbreviated Name

Representation

ISO Field Number

XML Tag

MDR Attribute

Definition Attributes

Business Definition

Representational Attributes

Minimum Storage Length

Maximum Storage Length

List of Permissible Values

Numeric Precision

Numeric Scale

Administrative Attributes

Responsible Organization

Submitting Organization
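The attribute groups on this slide map naturally onto a typed record. The sketch below is one possible rendering, assuming Python field names derived from the slide's attribute names; the types, the optional/required split, the length check, and the sample element are all assumptions made for illustration, not actual VGBE content.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class BusinessElement:
    # Identifying attributes
    name: str
    identifier: str
    version: str
    # Definition attributes
    business_definition: str
    # Representational attributes
    min_storage_length: int
    max_storage_length: int
    permissible_values: Tuple[str, ...] = ()
    numeric_precision: Optional[int] = None
    numeric_scale: Optional[int] = None
    # Remaining identifying attributes (treated as optional in this sketch)
    synonymous_name: Optional[str] = None
    abbreviated_name: Optional[str] = None
    iso_field_number: Optional[int] = None
    xml_tag: Optional[str] = None
    # Administrative attributes
    responsible_organization: Optional[str] = None
    submitting_organization: Optional[str] = None

    def __post_init__(self):
        # Hypothetical consistency rule: lengths must form a valid range.
        if self.min_storage_length > self.max_storage_length:
            raise ValueError("min_storage_length exceeds max_storage_length")

# Illustrative element only; the values are invented.
amount = BusinessElement(
    name="Transaction Amount",
    identifier="BE-0001",
    version="1.0",
    business_definition="Amount of the transaction in the transaction currency.",
    min_storage_length=1,
    max_storage_length=12,
    numeric_precision=12,
    numeric_scale=2,
    abbreviated_name="TXN_AMT",
)
```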

Page 19: April 2004 - IRM Roundtable - Part 2 - Metadata


VGBE Specification

Business Entity Rule

Business Domain

Subject Area

Business Entity

Business Entity Attribute

Nullability

Default Value

Primary Key

ISO Field Number

Business Element Interaction Rule

Relationship Type

Related Business Element

Cardinality

Optionality

Roled Business Element Rule

Roled Name

Roled Definition

Fundamental Business Element

Identifier

Version

Roled Synonymous Name

Roled Abbreviated Name

Roled Context

Action Assertion Rule

Conditional Business Element

Influenced Business Element

Assertion Rule

Page 20: April 2004 - IRM Roundtable - Part 2 - Metadata

VGBE Registry

[Data model diagram; tables shown:]
VGBE_SPECN_ATTR
BUS_DOM
SUBJ_AREA
SUBJ_AREA_ENTY
BUS_ENTY
BUS_ENTY_ATTR
BUS_ENTY_ELMT
BUS_ENTY_ELMT_ATTR
BUS_ELMT
BUS_ELMT_ATTR
ROLED_BUS_ELMT
BUS_ELMT_INTACT_RULE
BUS_ELMT_INTACT_ATTR
RPRSNT
ORGN
PHYS_TBL
PHYS_TBL_ELMT
PHYS_ELMT
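The slide names the registry tables but not their columns. As a rough sketch of how three of them might relate, the example below builds `BUS_ELMT`, `BUS_ENTY`, and the associative `BUS_ENTY_ELMT` in an in-memory SQLite database. Every column name, key, and sample row is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table names come from the slide; all columns are hypothetical.
CREATE TABLE BUS_ELMT (elmt_id INTEGER PRIMARY KEY, elmt_nm TEXT NOT NULL);
CREATE TABLE BUS_ENTY (enty_id INTEGER PRIMARY KEY, enty_nm TEXT NOT NULL);
CREATE TABLE BUS_ENTY_ELMT (
    enty_id INTEGER NOT NULL REFERENCES BUS_ENTY(enty_id),
    elmt_id INTEGER NOT NULL REFERENCES BUS_ELMT(elmt_id),
    PRIMARY KEY (enty_id, elmt_id)
);
""")
conn.execute("INSERT INTO BUS_ELMT VALUES (1, 'Account Number')")
conn.execute("INSERT INTO BUS_ELMT VALUES (2, 'Currency Code')")
conn.execute("INSERT INTO BUS_ENTY VALUES (10, 'Account')")
conn.executemany("INSERT INTO BUS_ENTY_ELMT VALUES (?, ?)", [(10, 1), (10, 2)])

# Which business elements make up the 'Account' entity?
rows = conn.execute(
    """
    SELECT e.elmt_nm
    FROM BUS_ENTY_ELMT x
    JOIN BUS_ELMT e ON e.elmt_id = x.elmt_id
    JOIN BUS_ENTY n ON n.enty_id = x.enty_id
    WHERE n.enty_nm = ?
    ORDER BY e.elmt_nm
    """,
    ("Account",),
).fetchall()
element_names = [r[0] for r in rows]
```

Keeping the entity-to-element link in its own table lets one element be reused by many entities, which is the reuse the registry is meant to enable.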

Page 21: April 2004 - IRM Roundtable - Part 2 - Metadata

VGBE Services Framework Use Case 1

Use Case 1: Subscribing Developers are notified about changes to VGBE content

[Diagram labels:]
• VGBE REGISTRY
• VGBE Content: Business Element Specification, Business Rule Specification
• Enterprise Data Architect (new role that requires support from GSC): maintains and manages Registry Content & VGBE Specification
• Product Manager / Data Owner (new process that requires support of PDC & GSC)
• collaborate on defining new VGBE or revising existing VGBE
• publishes
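The notification in Use Case 1 follows a publish/subscribe pattern. The sketch below shows that pattern in miniature; the `Registry` class, its method names, and the sample definitions are hypothetical and say nothing about how the actual VGBE services were built.

```python
class Registry:
    """Toy registry that notifies subscribers when an element's content changes."""

    def __init__(self):
        self._content = {}
        self._subscribers = {}

    def subscribe(self, element_name, callback):
        self._subscribers.setdefault(element_name, []).append(callback)

    def publish(self, element_name, definition):
        """Store the definition; return how many subscribers were notified."""
        previous = self._content.get(element_name)
        self._content[element_name] = definition
        if previous == definition:
            return 0  # nothing changed, so nobody is notified
        callbacks = self._subscribers.get(element_name, [])
        for callback in callbacks:
            callback(element_name, previous, definition)
        return len(callbacks)

inbox = []
registry = Registry()
registry.subscribe("Currency Code", lambda name, old, new: inbox.append((name, old, new)))

first = registry.publish("Currency Code", "Three-letter currency code")
repeat = registry.publish("Currency Code", "Three-letter currency code")
other = registry.publish("Account Number", "Identifier of an account")
```

Only the first publication reaches the subscriber: the repeat carries no change, and nobody subscribed to the other element.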

Page 22: April 2004 - IRM Roundtable - Part 2 - Metadata

VGBE Services Framework Use Case 2

Use Case 2: The Application Data Architect Uses VGBE Content

[Diagram labels:]
• VGBE REGISTRY
• select VGBEs using an interface
• calls VGBE UI
• reads VGBE Registry
• run EXE
• create a physical model (1. to verify physical design; 2. to add data elements not represented as a VGBE)
• publish as an ERwin model
• physical data model
• create a data object to incorporate in application code
• data object
• VGBE Content: Business Element Specification, Business Rule Specification
• Enterprise Data Architect (requires okay from GSC): maintains and manages Registry Content & VGBE Specification
• Product Manager / Data Owner (requires okay from PDC & GSC)
• collaborate on defining new VGBE or revising existing VGBE (OpenText)
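The step from selected VGBEs to a physical model can be pictured as generating table DDL from element definitions. This is a minimal sketch under that assumption; the dictionary keys, the type-mapping rule, and the sample elements are hypothetical, and the real flow on the slide publishes an ERwin model rather than raw SQL.

```python
def column_type(element):
    """Map representational attributes to a SQL type (hypothetical rule)."""
    precision = element.get("numeric_precision")
    if precision is not None:
        scale = element.get("numeric_scale") or 0
        return f"DECIMAL({precision},{scale})"
    return f"VARCHAR({element['max_storage_length']})"

def to_ddl(table_name, elements):
    """Render a CREATE TABLE statement from the selected elements."""
    lines = []
    for element in elements:
        null_clause = "" if element.get("nullable", True) else " NOT NULL"
        lines.append(f"    {element['abbreviated_name']} {column_type(element)}{null_clause}")
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(lines) + "\n);"

# Illustrative selection; names and lengths are invented.
selected = [
    {"abbreviated_name": "ACCT_NBR", "max_storage_length": 16, "nullable": False},
    {"abbreviated_name": "TXN_AMT", "max_storage_length": 12,
     "numeric_precision": 12, "numeric_scale": 2},
]
ddl = to_ddl("TXN", selected)
```

Generating the physical design from the definitions, instead of retyping it, is what keeps the model verifiable against the registry.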

Page 23: April 2004 - IRM Roundtable - Part 2 - Metadata

Demo: Use Case 2

[Diagram repeated from Use Case 2: the Application Data Architect uses VGBE content]

Page 24: April 2004 - IRM Roundtable - Part 2 - Metadata

Metadata: The Promise Versus The Reality

Cass Squire
Associate Partner
IBM Business Consulting Services
(650) 520-7247
E-mail: [address obscured in source]@us.ibm.com

Page 25: April 2004 - IRM Roundtable - Part 2 - Metadata

Topics
• The Need
• A Strategy & Approach for Solving it
• Progress To Date
• The Real World: Issues and Road Blocks

Page 26: April 2004 - IRM Roundtable - Part 2 - Metadata

The Need
• An entirely new set of data for a new kind of analytics is being rolled out
• The business community has not been properly engaged
• The business community needs to understand:
‒ What data is available
‒ What it means
‒ What its source is
‒ What its currency is
‒ Who to go to with questions about the data and its meaning
• A new tool for querying (Business Objects) is being rolled out as well

Page 27: April 2004 - IRM Roundtable - Part 2 - Metadata

Strategy & Approach
• A clear need for a mechanism for capturing and sharing metadata surfaces as essential to the successful rollout of the new analytical environment
• AbInitio is the corporate ETL tool: leverage its metadata for technical metadata
• Determine the applicability of its repository, the Enterprise Metadata Environment (EME), for serving as the repository for all metadata
• ERwin contains business and technical names, definitions, and allowable values: use it as the source for this metadata
• Determine Data Stewards in both the business and technical arenas to be the go-to people for questions
• Evaluate other tools for applicability
• Integrate Operational Metadata and SOX auditability
• KISS

Page 28: April 2004 - IRM Roundtable - Part 2 - Metadata

In IBM’s Business Intelligence Reference Architecture, Metadata is one of the components that glues the whole process together.

[Diagram: IBM Business Intelligence Reference Architecture]
• Data Sources: Enterprise, Unstructured, Informational, External
• Data Integration: Extraction, Transformation, Load / Apply, Synchronization, Transport / Messaging, Information Integrity (Data Quality, Balance & Controls)
• Data Repositories: Operational Data Stores, Data Warehouses, Data Marts, Staging Areas, Metadata
• Analytics: Collaboration, Query & Reporting, Data Mining, Modeling, Scorecard, Visualization, Embedded Analytics
• Access: Web Browser, Portals, Devices, Web Services, Business Applications
• Cross-cutting layers: Data flow and Workflow; Metadata; Security and Data Privacy; Systems Management & Administration; Network Connectivity, Protocols & Access Middleware; Hardware & Software Platforms

Page 29: April 2004 - IRM Roundtable - Part 2 - Metadata

Progress to Date
• AbInitio EME determined to be extensible enough to hold all metadata required, at least for initial phases
• Its web-based reporting is deemed acceptable for technical users but not for business users
• The ODBC API into its flat-file based repository is new, and performance is “not ready for prime time” yet
• The decision was made to extract from the EME to relational tables (15 +/-)
• Business Objects (web version) chosen as the user interface to the metadata, since that is what users will use to see the data; a single universe and a dozen or so queries
• Extracts from the EME into the relational tables have been built
• Data lineage simplified to ultimate source to ultimate target (intermediary ETL steps hidden) for business users
• Processes and extracts built for getting data from ERwin into the repository
• Processes established for ensuring data analysts on all projects use the same tools/processes for capturing and publishing metadata for the repository
• Identified Data Stewards and created a cross reference between them and the entities
• Business users have applauded it as very useful in helping them understand and use the new data for analytics
• Common processes for capturing Operational Metadata (Statistics, Error Reporting, Auditing) built and used by every ETL process
• Unicorn evaluated as a possible tool; it received high praise, especially for its ability to speed up the mapping process, but the determination was made to postpone further testing until the above environment has been in production for a while
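Simplifying lineage to "ultimate source to ultimate target" amounts to walking the hop-by-hop ETL graph upstream and keeping only the nodes that have no further inputs. The sketch below shows one way to do that; the `ultimate_sources` function and the dataset names are hypothetical and are not taken from the EME extracts.

```python
def ultimate_sources(target, edges):
    """Collapse hop-by-hop lineage to the original sources of a target.

    edges is a list of (source, target) pairs, one per ETL hop.
    """
    parents = {}
    for src, tgt in edges:
        parents.setdefault(tgt, set()).add(src)

    found, seen, stack = set(), set(), [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against cycles and repeated paths
        seen.add(node)
        upstream = parents.get(node)
        if upstream:
            stack.extend(upstream)
        elif node != target:
            found.add(node)  # no inputs of its own: an ultimate source
    return found

# Hypothetical hops through staging and integration layers.
hops = [
    ("crm.customer", "stg.customer"),
    ("stg.customer", "int.customer_clean"),
    ("web.signup", "stg.signup"),
    ("stg.signup", "int.customer_clean"),
    ("int.customer_clean", "dw.dim_customer"),
]
sources = ultimate_sources("dw.dim_customer", hops)
```

Business users see only the two end points, while the intermediate staging hops remain available to technical users in the full edge list.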

Page 30: April 2004 - IRM Roundtable - Part 2 - Metadata

The Real World: Issues and Road Blocks
• Lots of hype about metadata; little in the way of tools to deliver
• There are many types of metadata; picking the right subset to implement is key
• Ensuring automated maintenance is key
• New data unfamiliar to the users
‒ Demographics data: initial rollout a fiasco; queries produced wildly inaccurate numbers because the data was not well enough understood
‒ Lots of churning while the technical team got enough understanding of the data to identify the problems
‒ User confidence in the data seriously hurt
‒ Education program required on the data and on what queries would generate correct results

Page 31: April 2004 - IRM Roundtable - Part 2 - Metadata

Demographic Data: Conceptual Model

Make sure you know at what level the data applies!

[Entity-relationship diagram]

Entities: Geographic Area, Geographic Area Demographics, Physical Address, Household, Household Demographics, Consumer, Individual Demographics, Name, Household Member, Account, Screen Name, Lifestyle, Customer, Member, Prospect

Relationship labels: has (three times); is part of the key to a / has; consists of; has one or more; last name is part of the key to a; contains one or more; may play a role as more than one; can have up to 10 (7 active); may have more than 1; may have many characteristics of; may have one or more / has a primary member
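A small example shows why the level of the data matters. The sketch assumes a household-level attribute queried through individual-level rows; the consumers, households, and income figures are invented for illustration.

```python
# Three consumers in two households (hypothetical data).
individuals = [
    {"consumer": "A", "household": "H1"},
    {"consumer": "B", "household": "H1"},
    {"consumer": "C", "household": "H2"},
]

# Income is recorded once per household, not once per person.
household_income = {"H1": 80000, "H2": 50000}

# Wrong level: joining to individuals counts household H1 twice.
naive_total = sum(household_income[p["household"]] for p in individuals)

# Right level: aggregate over distinct households.
households = {p["household"] for p in individuals}
correct_total = sum(household_income[h] for h in households)
```

The naive query overstates total income by the full income of every household that has more than one consumer, which is the kind of wildly inaccurate number an unfamiliar user can produce.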

Page 32: April 2004 - IRM Roundtable - Part 2 - Metadata

The Need and Promise is Great.

The delivery isn’t there yet.

The Perfect Repository

• They often take more effort to feed than the benefit derived

• Many repository tools/vendors won’t expose (share) their metadata

• They tend to be passive, and thus can get out of synch with the real world

Page 33: April 2004 - IRM Roundtable - Part 2 - Metadata

How to implement?

“It’s like pinning jello to the wall”

There are no “best practices”

Are there analogies we can use?

Page 34: April 2004 - IRM Roundtable - Part 2 - Metadata

Layers & Perspectives of Data & Metadata

Schema Description Layer
‒ Meta-Entity Types: Entity-Type, Relationship-Type, Attribute-Type and Attribute-Group-Type

Schema Layer
‒ Entity-Types, Relationship-Types, Attribute-Types: ELEMENT, RECORD, USER, RECORD-CONTAINS-ELEMENT, LENGTH, ALLOWABLE-RANGE

Dictionary Layer
‒ Entities: Employee-ID-Number, Social-Security-Number, Personnel-Record, Payroll-Record, Finance-Department, Personnel-Department
‒ Relationships: Payroll-Record CONTAINS Employee-ID
‒ Attributes: 9 (Characters); 1 (Low-of-Range), 10 (High-of-Range)

Data in Production Database
‒ Example instances of entities and relationships, with data describing the real world: 229-21-5941; 0285762; (Personnel record for John Jones); (Payroll record for Pam Smith); Department-24037; Department-74941; (Payroll record for John Jones contains 0285762)
‒ (Attributes do not appear as discrete instances in a production database. They provide information used in the IRDS to represent real-world entities and attributes)

The 1984/5 (+/-) Information Resource Dictionary System (IRDS) standard was an attempt to define a syntax for metadata exchange.

Page 35: April 2004 - IRM Roundtable - Part 2 - Metadata

Simple Metadata Model

[Diagram labels:]
• Data Element
• Organization: Creators, Owners/Proponents, Users
• Physical Environment
• File/Table
• Occurrences & Data Lineage (sources & targets)
• Relational Edits
• Domain/Allowable Values
• Transformation Rules
• Integrity Rules
• Programs
• Triggers
• Time Events
• Location
• Reports

Page 36: April 2004 - IRM Roundtable - Part 2 - Metadata

John Zachman’s Enterprise Architecture Framework also provides us a way to categorize the sources of metadata.

Page 37: April 2004 - IRM Roundtable - Part 2 - Metadata

The Enterprise Architecture defines five views and six aspects of the enterprise.

View | Data | Function | Network | People | Time | Motivation
Planner | Subject Area List | Business Process List | Business Location | Organization List | Significant Events | Business Goals List
Owner | E-R Diagram | Functional Flow Diagram | Logistic Network | Organization Chart | Master Schedule | Business Plan
Designer | Data Model | Data Flow Diagram | Distribution System Architecture | Human Interface Architecture | Processing Structure | Knowledge Architecture
Builder | Data Design | Structure Chart | System Architecture | Human Technology Architecture | Control Structure | Knowledge Design
Sub-Contractor | Database Description | Language Statement | Network Architecture | Security | Timing Definition | Knowledge Definition
Enterprise | Data | Function | Communications | Organization | Schedule | Strategy

• John Zachman compares delivering technology to the enterprise to building an airplane. It is a complicated task involving several stages of design and many builders whose activities must be coordinated. He asks: Who would build an airplane without conceptual drawings? Without detailed sub-assembly charts? His premise is that the IT industry often has these design drawings and sub-assembly charts, but fails to file, cross-reference and maintain them.

• Not only do IT practitioners need to keep multiple views of the enterprise, the relationships between the cells in the framework must also be tracked. It is important to know which functions use which data elements.

• He says that as builders of data warehouses we have many of these “specification sheets” and he states that they contain metadata.
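Knowing which functions use which data elements is a cross-reference between two columns of the framework. The sketch below assumes the simplest possible store, a list of (function, data element) pairs; the function names, element names, and helper functions are hypothetical.

```python
# Hypothetical cross-reference between the Function and Data columns.
usage = [
    ("Bill Customer", "Customer Name"),
    ("Bill Customer", "Invoice Amount"),
    ("Ship Order", "Customer Name"),
    ("Ship Order", "Ship Date"),
]

def functions_using(element, pairs):
    """Impact analysis: which functions touch this data element?"""
    return sorted({function for function, used in pairs if used == element})

def elements_used_by(function, pairs):
    """Reverse lookup: which data elements does this function need?"""
    return sorted({used for name, used in pairs if name == function})

impact = functions_using("Customer Name", usage)
needs = elements_used_by("Ship Order", usage)
```

With the pairs maintained in one place, a proposed change to a data element can be traced to every function it affects before the change is made.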

Page 38: April 2004 - IRM Roundtable - Part 2 - Metadata

Over time, repositories and metadata management tools have changed with, in spite of, or regardless of, the IT industry focus.

• 1980's (Dictionary): IMS Data Dictionary, Cullinet IDD, MSP DataManager, ADR Data Dictionary, CGI
• 1991 (Repository): IBM Repository, Brownstone, Reltech
• 1995 (Client Server): IBM Data Guide, Platinum (bought Brownstone and Reltech), R&O Rochade, Prism Directory Manager
• 1997 (DW): IBM Visual Warehouse, Platinum, Microsoft, Viasoft (bought Rochade), One Meaning, Logic Works, Unisys, Intellidex, Prism Directory Manager, Dovetail
• 1999 (Y2K): IBM Visual Warehouse, Platinum (bought Logic Works), Viasoft, Oracle (bought One Meaning), Sybase (bought Intellidex), Ardent (bought Prism & Dovetail), Blue Angel, Pine Cone
• 2002 (Knowledge?): IBM Information Catalog, CA Repository, Ascential MetaStage, Viasoft (Rochade); Enterprise Information Portals (EIPs): Viador, MyEureka!

Page 39: April 2004 - IRM Roundtable - Part 2 - Metadata

Now Let's Look at This From Another Perspective

All is not lost! There are other tools that help with metadata needs.

[Diagram labels:]
• Operational and External Data
• Extract, Transform, Distribute, Store
• DATA (three times)
• Subject Areas, Cubes
• Display, Analyze, Discover
• Find and Understand
• Analyze / Architect
• Automate and Manage
• Metadata: Elements, Mappings, Business Views, Templates

Page 40: April 2004 - IRM Roundtable - Part 2 - Metadata

From Raw Data to Standardized Information to Usefulness

Discover
‒ The problems: What’s in the source data? Does it mean what you think it should? How is it structured? How might it be structured?
‒ Sample tools: ProfileStage (MetaRecon), Evoke Axio

Assess & Monitor
‒ The problems: Does it contain what you think it should? How complete is it? How clean is it? Does it follow the business rules? How is the quality changing over time?
‒ Sample tools: AuditStage (Quality Manager), Ab Initio Data Profiler

Match & Merge
‒ The problems: Resolve duplicates. Standardize names. Assign unique IDs. Identify households.
‒ Sample tools: QualityStage (Integrity), Trillium, First Data, Innovative Systems

Enrich & Transform
‒ The problems: Correct and improve it. Change to standard values. Transform codes to meaningful terms. Summarize it.
‒ Sample tools: DataStage, Informatica, Ab Initio Warehouse Manager

Deliver
‒ The problems: Deliver new sets of data on a periodic basis. Capture changes as required. Deliver updates/new transactions in as timely a fashion as required.
‒ Sample tools: ETL tools, Propagation, Change Data Capture tools, MQ
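The assessment questions about completeness and business rules can be answered with very little code. The following is a minimal profiling sketch, not a stand-in for any of the tools named on the slide; the `profile` function, its metric names, and the sample rows are hypothetical.

```python
def profile(rows, column, allowed=None):
    """Report completeness, distinct count, and rule violations for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    total = len(values)
    result = {
        "completeness": len(present) / total if total else 0.0,
        "distinct": len(set(present)),
    }
    if allowed is not None:
        # Business rule: every populated value must be in the allowed set.
        result["rule_violations"] = sum(1 for v in present if v not in allowed)
    return result

# Hypothetical source rows with a blank and a non-standard code.
rows = [{"state": "CA"}, {"state": "ca"}, {"state": ""}, {"state": "NY"}]
report = profile(rows, "state", allowed={"CA", "NY"})
```

Running the same profile on each load and storing the results is one simple way to see how quality changes over time.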

Page 41: April 2004 - IRM Roundtable - Part 2 - Metadata

Data Becomes Information If and Only If You:

The Problem | Tools, Techniques, & Processes
1. Have the data, and | Capture Process; Business Process Re-Engineering
2. Know you have it, and | Metadata; Evangelism
3. Can access it, and | BI Environment; Data Structured for Access; End-User Analysis tools
4. Can use it, and | Business Metrics Captured
5. Can trust it! | Data Quality Process; Metadata