slide 1 governa… · ppt file · web viewdata governance maturity: when the business depends on...

Strategies LLC Taxonomy Sept. 10, 2008 Copyright 2008Taxonomy Strategies LLC. All rights reserved. Data Governance Maturity: When the business depends on clear description of fuzzy objects Presented to San Francisco DAMA Sept. 10, 2008 Ron Daniel, Jr.


Taxonomy Strategies LLC

Sept. 10, 2008 Copyright 2008 Taxonomy Strategies LLC. All rights reserved.

Data Governance Maturity: When the business depends on clear description of fuzzy objects

Presented to San Francisco DAMA

Sept. 10, 2008

Ron Daniel, Jr.


Taxonomy Strategies LLC The business of organized information

Bio: Ron Daniel, Jr.

Over 15 years in the business of metadata & automatic classification
  Principal, Taxonomy Strategies
  Standards Architect, Interwoven
  Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000)
  Technical Staff Member, Los Alamos National Laboratory

Metadata and taxonomies community leadership
  Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  Acting chair, XML Linking working group
  Member, RDF working groups
  Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports


Recent & current projects: http://www.taxonomystrategies.com/html/clients.htm

Government Commercial

Not-for-Profit


Goals for this talk

Provide you with background on maturity models.

Provide the results of our surveys of Search, Metadata, & Taxonomy practices and discuss interesting findings.

Review the practices in use at stock photo houses, and compare them to methods that may be used in typical information management projects.

Give you the tools to do a simple self-assessment of your organization’s metadata maturity


Agenda

9:15 Metadata Definitions
9:30 Maturity Models
9:45 Metadata Maturity Model (ca. 2006)
10:15 Break
10:30 Stock Photo Business
10:40 Data Governance Practices in Stock Photo Agencies
11:40 Summary
11:45 Questions
12:00 Adjourn


Metadata Definitions


Taxonomy and metadata definitions

Metadata
  “Data about data”.
  Different communities have very different assumptions about the types of data being described.
  I’m from the Information Science community, not the database, statistics, or massive storage communities.

Taxonomy
  1. The classification of organisms in an ordered system that indicates natural relationships.
  2. The science, laws, or principles of classification; systematics.
  3. Division into ordered groups, categories, or hierarchies.


Examples of taxonomy used to populate metadata fields

Metadata fields: Title; Author; Department; Audience; Topic

Metadata Values (Facets within the overall Taxonomy)

Topics
  Employee Services
    Compensation
    Retirement
    Insurance
    Further Education
  Finance and Budget
  Products and Services
  Support Services
    Infrastructure
    Supplies

Audience
  Internal
    Executives
    Managers
  External
    Suppliers
    Customers
    Partners
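A minimal sketch of how facet values like these might be used to check a tagged record. The facet names and values come from the slide; the dictionary layout, the function name invalid_values, and the sample record are assumptions made only for illustration.

```python
# Hypothetical sketch: facet values are from the slide; the data layout,
# function name, and sample record are illustrative assumptions.
TAXONOMY = {
    "Topic": {
        "Employee Services", "Compensation", "Retirement", "Insurance",
        "Further Education", "Finance and Budget", "Products and Services",
        "Support Services", "Infrastructure", "Supplies",
    },
    "Audience": {
        "Internal", "Executives", "Managers",
        "External", "Suppliers", "Customers", "Partners",
    },
}


def invalid_values(record):
    """Return {field: [values not found in that field's facet]}.

    Only fields that have a facet in TAXONOMY are checked; free-text
    fields such as Title are ignored.
    """
    problems = {}
    for field, allowed in TAXONOMY.items():
        bad = [value for value in record.get(field, []) if value not in allowed]
        if bad:
            problems[field] = bad
    return problems


record = {
    "Title": "Retirement plan overview",   # free text, not controlled
    "Audience": ["Managers", "Everyone"],  # "Everyone" is not in the facet
    "Topic": ["Retirement"],
}
print(invalid_values(record))
```

Keeping the controlled values in one place means the same lists can drive both the tagging form and the check.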


Example faceted taxonomy

ABC Computers.com

Audience: All; Business; ABC Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper; First Time; Experienced; Advanced; Supplier

Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business

Region-Country: All; Asia-Pacific; Canada; ABC EMEA; Japan; Latin America & Caribbean; United States

Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Non-ABC Brands

Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Type

Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency

Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petrochemicals; Retail / Wholesale; Technology; Transportation; Other Industries

Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training


Manually tagged metadata sample

Attribute: Values

Title: Jupiter’s Ring System
URL: http://ringmaster.arc.nasa.gov/jupiter/
Description: Overview of the Jupiter ring system. Many images, animations and references are included for both the scientist and the public.
Content Types: Web Sites; Animations; Images; Reference Sources
Audiences: Educators; Students
Organizations: Ames Research Center
Missions & Projects: Voyager; Galileo; Cassini; Hubble Space Telescope
Locations: Jupiter
Business Functions: Scientific and Technical Information
Disciplines: Planetary and Lunar Science
Time Period: 1979-1999


Other things sometimes called Taxonomy

Type: Remarks

Synonym Ring
  Connects a series of terms together
  Treats them as equivalent for search purposes
  e.g. (Dog, Canine, Pooch, Mutt), (Cat, Feline, Kitty), …

Authority File
  Used to control variant names with a preferred term
  Typically used for names of countries, individuals, organizations
  e.g. (IBM, Big Blue, International Business Machines Inc.)

Classification Scheme
  A hierarchical arrangement of terms
  May or may not follow strict “is-a” hierarchy rules
  Usually enumerated; e.g., LC or Dewey

Thesaurus
  Expresses semantic relationships of:
  • Hierarchy (broader & narrower terms)
  • Equivalence (synonyms)
  • Associative (related terms)
  May include definitions

Ontology
  Resembles faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules
  A model of reality, allowing inferences to be made.
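The difference between a synonym ring and an authority file can be sketched in a few lines. The terms are the slide's examples; the function names are hypothetical, and treating "IBM" as the preferred term is an assumption, since the slide does not say which variant is preferred.

```python
# Hypothetical sketch. Terms come from the slide's examples; choosing "IBM"
# as the preferred name is an assumption made for illustration.

# Synonym ring: every member is equivalent, none is preferred.
SYNONYM_RINGS = [
    {"dog", "canine", "pooch", "mutt"},
    {"cat", "feline", "kitty"},
]

# Authority file: variants map to one preferred term.
AUTHORITY_FILE = {
    "IBM": ["Big Blue", "International Business Machines Inc."],
}

# Lookup table from any lower-cased variant (or the preferred form) to the
# preferred term.
_VARIANT_TO_PREFERRED = {
    name.lower(): preferred
    for preferred, variants in AUTHORITY_FILE.items()
    for name in variants + [preferred]
}


def expand_query(term):
    """Return every term a search should treat as equivalent to `term`."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def preferred_name(name):
    """Return the preferred form of `name`, or `name` itself if unknown."""
    return _VARIANT_TO_PREFERRED.get(name.lower(), name)


print(sorted(expand_query("Pooch")))
print(preferred_name("Big Blue"))
```

The ring expands a query at search time, while the authority file normalizes a value at tagging time.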


Pop Quiz

On a blank piece of paper:

• What question(s) did you want to have answered by coming to today’s talks?

Flag one question to be discussed later.

You do NOT have to provide your name.

Please DO provide your job title, division, and either company name or company type.


What do other people ask about?

How to build a taxonomy? (development)
Definitions of terms. (definitions)
How to govern its use and maintenance? (governance)
What’s the ROI? (ROI)
What are they for? (basic taxo purpose)
How do we put them to use? (usage)
How do we link them to content? (tagging)
How do they help search? (search)
How do I sell management on a taxonomy project? (selling)
How do we maintain them? (maint)
and many more…



Motivation behind the Metadata Maturity Model


Organizational benchmarking

A common goal of organizations is to ‘benchmark’ themselves against other organizations.

Different organizations have:
  Different levels of sophistication in their planning, execution, and follow-up for CMS, Search, Portal, Metadata, and Taxonomy projects.
  Different reasons for pursuing Search, Metadata, and Taxonomy efforts
  Different cultures

Benchmarks should be to similar organizations.


Is unnecessary capability harmful?

Tool vendors continue to provide ever-more capable tools with ever-more sophisticated features.
  But we live in a world where a significant fraction of public, commercial web pages don’t have a <title> tag.
  Organizations that can’t manage <title> tags stand a very poor chance of putting an entity extractor to use, which requires some ongoing management of the lists of entities to be extracted.
  Organizations that can’t create and maintain clean metadata can’t put a faceted search UI to good use.

Unused capability is poor value-for-money.
  Organizations over-spend on tools and under-spend on staff & processes.
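The <title> check mentioned above is simple enough to sketch. This assumes Python's standard html.parser module; the class and function names are hypothetical, and a real audit would also need to fetch pages and cope with malformed markup.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text found inside <title> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def has_title(html):
    """Return True if the page has a <title> with non-blank text."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return bool(finder.title.strip())


print(has_title("<html><head><title>Jupiter</title></head></html>"))
print(has_title("<html><head></head><body>No title here</body></html>"))
```

Counting the pages for which has_title is false gives a quick first measure of basic metadata hygiene before any money goes into more capable tools.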


Towards better benchmarking…

Wanted a method to:
  Generally identify good and bad practices.
  Help clients identify the things they can do, and the things that stand an excellent chance of failing.
  Predict likely sources of problems in engagements.

We have started to develop a Metadata Maturity Model, inspired by Maturity Models from the software industry.

To keep the model tied to reality, we are conducting surveys to determine the actual state of practice around search, metadata, taxonomy, and supporting business functions such as staffing and project management.


A Tale of Two Software Maturity Models

CMMI (Capability Maturity Model Integration)

vs.

The Joel Test


CMMI structure

Source: http://chrguibert.free.fr/cmmi

Maturity Models are collections of Practices.

Main differences in Maturity Models concern:

• Descriptivist or Prescriptivist Purpose
• Degree of Categorization of Practices
• Number of Practices (~400 in CMMI)


22 Process Areas, keyed to 5 Maturity Levels…

Process Areas contain Specific and Generic Practices, organized by Goals and Features, and arranged into Levels

Process Areas cover a broad range of practices beyond simple software development

CMMI Axioms:
  Individual processes at higher levels are AT RISK from supporting processes at lower levels.
  A Maturity Level is not achieved until ALL the Practices in that level are in operation.
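The two axioms can be read as a simple rule for working out a level. The sketch below is an illustration under stated assumptions, not the CMMI appraisal method: the practice names are made up, levels are treated as cumulative, and level 1 is taken as the starting point with no required practices.

```python
# Hypothetical sketch of the axioms above. Practice names are invented, and
# treating level 1 as the default starting level is an assumption.

def achieved_level(practices_by_level, in_operation):
    """Return the highest level whose practices, and those of every lower
    level, are all in operation."""
    level = 1
    for candidate in sorted(practices_by_level):
        if all(p in in_operation for p in practices_by_level[candidate]):
            level = candidate
        else:
            # A gap at this level puts everything above it at risk,
            # so higher levels are not counted.
            break
    return level


practices = {
    2: {"requirements management", "project planning"},
    3: {"peer review"},
    4: {"quantitative management"},
}
in_operation = {
    "requirements management",
    "project planning",
    "quantitative management",  # a level 4 practice, but level 3 is incomplete
}
print(achieved_level(practices, in_operation))
```

Under these assumptions, adopting a level 4 practice does not raise the level while a level 3 practice is still missing.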


CMMI Positives
  Independent audits of an organization’s level of maturity are a common service
  Level 3 certification frequently required in bids

“…compared with an average Level 2 program, Level 3 programs have 3.6 times fewer latent defects, Level 4 programs have 14.5 times fewer latent defects, and Level 5 programs have 16.8 times fewer latent defects”.

Michael Diaz and Jeff King – “How CMM Impacts Quality, Productivity, Rework, and the Bottom Line”

‘If you find yourself involved in product liability litigation you're going to hear terms like "prevailing standard of care" and "what a reasonable member of your profession would have done". Considering the fact that well over a thousand companies world-wide have achieved level 3 or above, and the body of knowledge about the CMM is readily available, you might have some explaining to do if you claim ignorance’.

Linda Zarate in a review of A Guide to the Cmm: Understanding the Capability Maturity Model for Software by Kenneth M. Dymond


CMMI Negatives

Complexity and Expense
  Reading and understanding the materials
  Putting it into action – identifying processes, mapping processes to model, gathering required data, …
  Audits are expensive

CMMI does not scale down well to small shops

Has been accused of restraint of trade


At the other extreme, The Joel Test

Developed by Joel Spolsky as reaction to CMMI complexity

Positives - Quick, easy, and inexpensive to use.

Negatives - Doesn’t scale up well:
  Not a good way to assure the quality of nuclear reactor software.
  Not suitable for scaring away liability lawyers.
  Not a longer-term improvement plan.

The Joel Test
1. Do you use source control?
2. Can you make a build in one step?
3. Do you make daily builds?
4. Do you have a bug database?
5. Do you fix bugs before writing new code?
6. Do you have an up-to-date schedule?
7. Do you have a spec?
8. Do programmers have quiet working conditions?
9. Do you use the best tools money can buy?
10. Do you have testers?
11. Do new candidates write code during their interview?
12. Do you do hallway usability testing?

Scoring: 1 point for each ‘yes’. Scores below 10 indicate serious trouble.
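The scoring rule is small enough to write down directly. The sketch below uses the twelve questions and the "below 10" threshold as they appear on the slide; the function names and the sample answers are assumptions for illustration.

```python
# Hypothetical sketch of the scoring rule as stated on the slide.
JOEL_TEST = [
    "Do you use source control?",
    "Can you make a build in one step?",
    "Do you make daily builds?",
    "Do you have a bug database?",
    "Do you fix bugs before writing new code?",
    "Do you have an up-to-date schedule?",
    "Do you have a spec?",
    "Do programmers have quiet working conditions?",
    "Do you use the best tools money can buy?",
    "Do you have testers?",
    "Do new candidates write code during their interview?",
    "Do you do hallway usability testing?",
]


def joel_score(answers):
    """One point per 'yes'; `answers` is a list of 12 booleans."""
    if len(answers) != len(JOEL_TEST):
        raise ValueError("expected one answer per question")
    return sum(1 for answer in answers if answer)


def in_serious_trouble(score):
    """Threshold as given on the slide: scores below 10."""
    return score < 10


# Made-up answers: 'yes' to the first nine questions only.
answers = [True] * 9 + [False] * 3
score = joel_score(answers)
print(score, in_serious_trouble(score))
```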


What does software development “Maturity” really mean?

A low score on a maturity audit DOES NOT mean that an organization can’t develop good software

It DOES mean that whether the organization will do a good job depends on the specific mix of people assigned to the project

In other words, it sets a floor for how bad an organization is likely to do, not a ceiling on how good they can do
  Probability of failure is a good thing to know before spending a lot of time and money


Towards a Metadata Maturity Model

Caveats:
  Maturity is not a goal, it is a characterization of an organization’s methods for achieving its core goals.
  Mature processes impose expenses which must be justified by consequent cost savings, revenue gains, or service improvements.

Nevertheless, Maturity Models are useful as collections of best practices and stages in which to try to adopt them.


Basis for initial maturity model

CEN study on commercial adoption of Dublin Core

Small-scale phone survey
  Organizations which have world-class search and metadata externally
  Not necessarily the most mature overall processes or the best internal search and metadata

Literature review

Client experiences

Structure from software maturity models


Initial Metadata Maturity Model (ca. May, 2005)

Practice Area: practices across the Maturity Level columns (Basic, Intermediate, Advanced, Bleeding-Edge) and the Limiting column

Search Capabilities: Uniform Search Box; Query Log Exam.; Index Multiple Repos.; Best Bets; Simple Grouping; Intranet Facet Navigation; Improved Ranking

Metadata and taxonomy standards: System MD Stds.; Organization MD Std.; Reuse ERP; Multiple Repos Comply; Taxonomy Roadmap; Highly Abstract Subject Taxos.

Tools and tool selection: Requirements, then Tools; Bakeoff Datasets; Budget for Bakeoffs; Unneeded Capabils.; Tools, then Reqs.

Staff training and hiring: Search Analyst Role; Librarian Expertise; Pre-hire Testing; SME Catalogers

Data creation and QA: CM Introduced; ROT-Elimination; Hybrid Creation Model; Adaptive Qualification; Quality Measures

Project management: Project Plan; Std. Proj. Methodol.; X-Functional Teams; Communication Plan; Multi-Year Plan; Early Termination

Executive support and ROI: External Search ROI; Intranet ROI Model; CEO knows Search ROI; Use it or Lose It Budgets

37 Practices, Categorized by Area, Level, and Importance


Shortcomings of the initial model

No idea of how it corresponds to actual practice across multiple organizations
  Some indications that it over-emphasized the sophisticated practices and under-emphasized beginning practices.

The initial metadata maturity model can be regarded as a hypothesis about how an organization progresses through various practices as it matures
  How to test it? Let’s ask!
  Two surveys to date
  Surveys are being run in stages because of large number of practices.
  Ask about future, current, and former practices to gather information on progression



Survey 1: Search, Metadata, & Taxonomy Practices

The data in this section comes from a survey conducted in the autumn of 2005.


Participants by Organization Size


Participants by Job Role


Participants by Industry


Search Practices

Columns: Not current practice | Being developed | In practice | Former practice | NA or Unknown

Search Box in standard place on all web pages.
  20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites.
  25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking.
  31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching.
  41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score.
  37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined.
  31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top.
  46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document.
  43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search.
  68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal.
  57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)
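Each cell in the table pairs a percentage with a raw count. A short sketch shows how the two relate, on the assumption that each percentage is the count divided by the number of answers to that question, rounded to a whole number; the function name is hypothetical and the counts are the first row of the table.

```python
# Hypothetical sketch: assumes each percentage is count / (answers to that
# question), rounded to the nearest whole number.
COLUMNS = (
    "Not current practice",
    "Being developed",
    "In practice",
    "Former practice",
    "NA or Unknown",
)


def percentages(counts):
    """Convert per-column counts for one question into rounded percentages."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]


# Counts for "Search Box in standard place on all web pages."
search_box_counts = [12, 7, 38, 1, 3]
for column, pct in zip(COLUMNS, percentages(search_box_counts)):
    print(f"{column}: {pct}%")
```

For this row the counts sum to 61 responses. Other rows have slightly different totals, which suggests that not every participant answered every question.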


Metadata Practices

Columns: Not current practice | Being developed | In practice | Former practice | NA or Unknown

Metadata standards are developed for the needs of each system with no overall attempt to unify them.
  22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development.
  37% (22) | 37% (22) | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core.
  52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard.
  52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard.
  48% (29) | 20% (12) | 20% (12) | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically.
  48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources.
  57% (34) | 17% (10) | 17% (10) | 0% (0) | 10% (6)
Metadata is manually entered into web forms.
  15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software.
  38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction.
  48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)

These two questions were the only ones with much correlation to organization size


Taxonomy Practices

Columns: Not current practice | Being developed | In practice | Former practice | NA or Unknown

'Org Chart' Taxonomy - One based primarily on the structure of the organization.
  36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
'Products' Taxonomy - One based primarily on the products and/or services offered by the organization.
  37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
'Content Types' Taxonomy - One based primarily on the different types of documents.
  28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
'Topical' Taxonomy - One based primarily on topics of interest to the site users.
  20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
'Faceted' Taxonomy - One which uses several of the approaches above.
  32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor.
  75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time.
  47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel.
  35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development.
  28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed.
  38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)


Survey 2: Business Drivers, Processes, and Staffing

The data in this section comes from a survey conducted in the spring of 2006.


Participants by Job Role


Participants by Tenure


Participants by Industry


Participants by Organization Size


Business Drivers: Search, Metadata, and Taxonomy (SMT) Applications


Business Drivers: Desired Benefits

Other desired benefits:

1 Innovation
2 Core to our business product
3 Clients do all the above [From a consultant]
4 Better navigation to diverse State web sites
5 Increased knowledge sharing across the corporation
6 Interoperability
7 Dynamic web applications
8 Improved user search experience
9 Improve R&D
10 Higher value to members [From a non-profit membership org.]
11 For organization to have better understanding of their content


ROI: Cost Estimation


Processes

Use of search logs is improving

Surprisingly sophisticated

Basic data quality and communications need improvement

Many solo operators


Team Structures & Staffing


Salary Survey

Experience | 0.6 | Nice to see it really counts.
Geography | 0.5 | California and the Northeast have highest salaries.
Co. Size | 0.5 | Not very reliable, big changes from one datapoint
Education | 0.4 | Many taxonomists have MLS or above.
Industry | 0.4 | Surprisingly, retail has high salaries for taxonomists.
Role | 0.04 | Taxonomists paid about like Information Architects
Time at current job | -0.07 |


Notes from Participants

There is the constant struggle with individual [magazine] titles to hire trained librarians or data specialists instead of trying to save money by hiring an editor who can build articles AND create and assign metadata. This is a governance issue we have been struggling with since we have no monetary stake in the individual publications. We make recommendations, but have no higher level authority to require titles to hire trained staff for metadata.

Reporting metrics have become a new area of confusion as we move to portalized pages consisting of objects in portlets, each with their own metadata.

Key organizational issue is that the "problems" that stem from lack of systematic metadata/taxonomy creation are not "owned" by anyone, and consequently have no budget for their solution.


Interim Conclusions


Observations (1)

• Practices which a single person or a small group can carry out are more commonly used.
  - Not surprising.
  - Very different than ERP/BPR; indicates that information management is not being sold to the “C-level” staff.
• People need to question how inclusive their “Organizational Metadata Standards” and “Taxonomy Roadmaps” actually are.
  - We have found Taxonomy Roadmaps to be an advanced practice, due to a dependence on knowing upcoming IT development schedule.


Observations (2)

• Many of the basics are being skipped.
  - More organizations doing “Spell Checking” than “Query Log Analysis”.
  - 69% have a taxonomy change plan, but only 41% have a plan for revisiting data if the taxonomy changes.
  - 64% have a communications plan, but only 56% have a website.
• This seems to be linked to the previous observation: things that are easy for an individual get done before things that need an organizational effort, despite their level of ‘sophistication’.


Interim Metadata Maturity Model (ca. May, 2006)

Practice Area | Basic | Intermediate | Advanced | Limiting
Search capabilities | Uniform Search Box; Query Log Exam. | Index Multiple Repos.; Best Bets | Facet Navigation UI
Metadata and taxonomy standards | System MD Stds.; Organization MD Std. | Multiple Repos. Comply w/ MD Std.; Reuse ERP Taxos; Taxo Maint. Doc | Taxonomy Roadmap; Highly Abstract Subject Taxos (e.g. “Moods”); Metadata Maint. Doc
Tools and tool selection | Requirements, then Tools | Bakeoff Datasets | Budget for Bakeoffs | Tools, then Reqs.
Staff training and hiring | Librarian or IA Expertise; Search Analyst Role | Cross-Functional Taxonomy Creation | Cross-functional taxonomy maint.; SME Catalogers; Pre-hire Testing
Data creation and QA | CM Introduced | ROT-Elimination; Semi-auto tagging | Quality Measures
Project management | Project Plan; X-Functional Teams | Std. Proj. Methodol.; Multi-Year Plan; Communication Plan; SMT Business Manager, instead of IT Manager | Early Termination
Executive support and ROI | External Search ROI; SMT in separate silos | Intranet ROI Model | CEO knows Search ROI | Use it or Lose It Budgets


Search and Metadata Maturity Quick Quiz

Basic
1) Is there a process in place to examine query logs?
2) Is there a process for adding directories and content to the repository, or do people just do what they want?
3) Is there an organization-wide metadata standard, such as an extension of the Dublin Core, for use by search tools, multiple repositories, etc.?

Intermediate
4) Does the search engine index more than 4 repositories around the organization?
5) Does the search engine integrate with the taxonomy to improve searches and organize results?
6) Are there hiring and training practices especially for metadata and taxonomy positions?
7) Is there an ongoing data cleansing procedure to look for ROT (Redundant, Obsolete, Trivial content)?
8) Are tools only acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money?

Advanced
9) Are there established qualitative and quantitative measures of metadata quality?
10) Can the CEO explain the ROI for search and metadata?


Agenda

9:15 Metadata Definitions
9:30 Maturity Models
9:45 Metadata Maturity Model (ca. 2006)
10:15 Break
10:30 Stock Photo Business
10:40 Data Governance Practices in Stock Photo Agencies
11:40 Summary
11:45 Questions
12:00 Adjourn



Stock Photo Business

Advertising, Editorial Content, Corporate Communications, and many other types of content rely on images to convey information and moods.

When time and/or budget does not allow a commissioned shoot, stock photo houses can supply images.

Fundamental problem for users: How to search for an image that conveys what you want?

Fundamental problem for houses: How to describe images so that users can find them?


How would you search for this image?


Tagging by emotions


“silence”

Conceptual refinement

Objective criteria

Conceptual refinement

Image Rights Criteria


Clarification: Finger on Lips


Scrolling through results…

This is more of the mood I’m looking for…


More like this


Facets at gettyimages.com


Key Questions

Getty Images (and Corbis) have put a lot of effort into their websites for image purchase*.

Internal staff at such organizations tell me that their intranets are nowhere near as easy to use. ROI is the reason why. Recall that retail had high salaries for taxonomists, because the ROI for a better shopping site is so clear.

The front-ends are dependent on data. How is that data governed? How does that differ from how their intranets are governed?

*Licensing, not purchasing, to be pedantic.


Agenda

9:15 Metadata Definitions
9:30 Maturity Models
9:45 Metadata Maturity Model (ca. 2006)
10:15 Break
10:30 Stock Photo Business
10:40 Data Governance Practices in Stock Photo Agencies
11:40 Summary
11:45 Questions
12:00 Adjourn


Pop Quiz

What is the #1 underused source of quantitative information on how to improve your metadata and taxonomy?

Query Logs & Click Trails


Who are the users & what are they looking for?

Only 30-40% of organizations regularly examine their logs.

Sophisticated software available, but don’t wait. 80% of value comes from basic reports


Query log & click trail examination: Click trail packages

• iWebTrack
• NetTracker
• OptimalIQ
• SiteCatalyst
• Visitorville
• WebTrends


Query log & click trail examination: Query log

UltraSeek Reporting
• Top queries
• Queries with no results
• Queries with no click-through
• Most requested documents
• Query trend analysis
• Complete server usage summary
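The first three reports in that list need very little machinery. The sketch below is only an illustration: it assumes each log record is a tuple of (query, number of results, number of clicks), which is a made-up format rather than what UltraSeek or any other engine actually writes, and the sample records are invented.

```python
from collections import Counter

# Hypothetical log records: (query, number_of_results, number_of_clicks).
# A real search engine's log format will differ; these fields are assumed.
LOG = [
    ("vacation policy", 12, 3),
    ("vacation policy", 12, 1),
    ("expense form", 0, 0),
    ("org chart", 40, 0),
    ("expense form", 0, 0),
    ("Vacation Policy", 12, 2),
]

def normalize(query):
    """Lower-case and collapse whitespace so trivially different queries group together."""
    return " ".join(query.lower().split())

def basic_reports(log, top_n=10):
    """Build three basic reports: top queries, no-result queries, no-click queries."""
    totals = Counter()
    no_results = Counter()
    no_clicks = Counter()
    for query, n_results, n_clicks in log:
        q = normalize(query)
        totals[q] += 1
        if n_results == 0:
            no_results[q] += 1      # nothing came back at all
        elif n_clicks == 0:
            no_clicks[q] += 1       # results came back but nobody clicked
    return {
        "top_queries": totals.most_common(top_n),
        "no_results": no_results.most_common(top_n),
        "no_click_through": no_clicks.most_common(top_n),
    }

reports = basic_reports(LOG)
```

Queries with no results tend to point at missing content or missing synonyms, while queries with results but no click-through tend to point at poor titles or poor ranking.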


Examining the Stock Photo Agencies in Light of the Metadata Maturity Model


Maturity Model Recap

Practice Area | Basic | Intermediate | Advanced | Limiting
Search capabilities | Uniform Search Box; Query Log Exam. | Index Multiple Repos.; Best Bets | Facet Navigation UI
Metadata and taxonomy standards | System MD Stds.; Organization MD Std. | Multiple Repositories Comply w/ MD Std.; Reuse ERP Taxos; Taxo Maint. Doc | Taxonomy Roadmap; Highly Abstract Subject Taxos (e.g. “Moods”); Metadata Maint. Doc
Tools and tool selection | Requirements, then Tools | Bakeoff Datasets | Budget for Bakeoffs | Tools, then Reqs.
Staff training and hiring | Librarian or IA Expertise; Search Analyst Role | Cross-Functional Taxonomy Creation | Cross-functional taxonomy maint.; SME Catalogers; Pre-hire Testing
Data creation and QA | CM Introduced | ROT-Elimination; Semi-auto tagging | Quality Measures
Project management | Project Plan; X-Functional Teams | Std. Proj. Methodol.; Multi-Year Plan; Communication Plan; SMT Business Manager, instead of IT Manager | Early Termination
Executive support and ROI | External Search ROI; SMT in separate silos | Intranet ROI Model | CEO knows Search ROI | Use it or Lose It Budgets


Search capabilities

Practice Area | Basic | Intermediate | Advanced | Limiting
Search capabilities | Uniform Search Box; Query Log Exam. | Index Multiple Repos.; Best Bets | Facet Navigation UI

• Uniform Search box: Both provide this.
• Query Log Exam: Both gathered logs but had only semi-formal review processes at time of interviews.
• Index multiple repositories: Both license picture ‘collections’ from disparate sources but bring them together for search and purchase.
• Best Bets: N/A in creative space.
• Facet Navigation UI: Used on gettyimages.com, but not on corbis.com.


Data standards

Practice Area | Basic | Intermediate | Advanced | Limiting
Metadata and taxonomy standards | System MD Stds.; Organization MD Std. | Multiple Repos. Comply w/ MD Std.; Reuse ERP Taxos; Taxo Maint. Doc | Taxonomy Roadmap; Highly Abstract Subject Taxos (e.g. “Moods”); Metadata Maint. Doc

• System MD Stds: Both have moved beyond that level.
• Organization MD Standard: Both define core metadata standards with extensions for specific collections.
• Multiple repositories comply w/ MD standard: Collections are tagged to a common core at both vendors, plus extension elements in different collections.
• Reuse ERP taxonomies: N/A
• Taxonomy Maint. Doc:
• Taxonomy Roadmap: Corbis had plan for facets to be added, but not keyed to other systems.
• Highly abstract vocabularies: Getty shows emotion tagging in action with their moodstream offering.
• Metadata maint. doc: TBD


Image Collections


Editorial rules standard

Rules covered:
• Abbreviations
• Ampersands
• Capitalization
• General…, More…, Other…
• Languages & character sets
• Length limits
• Multiple parents
• Plural vs. singular form
• Scope notes
• Serial comma
• Sources of terms
• Spaces
• Synonyms & acronyms
• Term order (Alphabetic or …)
• Term label order (Direct vs. inverted)
• …

Rule Name | Editorial Rule

Abbreviations | Abbreviations, other than colloquial terms and acronyms, shall not be used in term labels.
  Example: Public Information
  NOT: Public Info.

Ampersands | The ampersand [&] character shall be used instead of the word ‘and’.
  Example: Licensing & Compliance
  NOT: Licensing and Compliance

Capitalization | Title case capitalization shall be used.
  Example: Customer Service
  NOT: CUSTOMER SERVICE
  NOT: Customer service
  NOT: customer service

General…, More…, Other… | The term labels “General…”, “More…”, and “Other…” shall be used for categories which contain content items that are not further classifiable.
  Example: “Other Property”, “Other Services”, “General Information”, “General Audience”

… | …
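Rules written this precisely can be checked mechanically. The sketch below covers only the first three rules in the table and relies on rough heuristics that are assumptions, not part of the standard: a word ending in a period is treated as an abbreviation, and the `ACRONYMS` allow-list is a hypothetical stand-in for whatever acronyms an organization actually permits.

```python
import re

# Hypothetical allow-list of acronyms that may stay fully capitalized.
ACRONYMS = {"ERP", "IT"}

def check_label(label, acronyms=ACRONYMS):
    """Return the names of the editorial rules that a term label appears to break."""
    problems = []
    # Ampersands rule: '&' shall be used instead of the word 'and'.
    if re.search(r"\band\b", label, flags=re.IGNORECASE):
        problems.append("Ampersands")
    # Abbreviations rule (heuristic): a word ending in a period looks abbreviated.
    if re.search(r"\w\.(\s|$)", label):
        problems.append("Abbreviations")
    # Capitalization rule: every word is Title Case unless it is an allowed acronym.
    for word in re.findall(r"[A-Za-z]+", label):
        if word.lower() == "and" or word in acronyms:
            continue  # 'and' is already reported by the Ampersands rule
        if not (word[0].isupper() and word[1:] == word[1:].lower()):
            problems.append("Capitalization")
            break
    return problems
```

Running `check_label` over every label in a draft taxonomy gives the taxonomist a worklist before a walk-through review; the remaining rules (scope notes, term order, and so on) would still need a human reviewer.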


Tools and Tool Selection

Practice Area | Basic | Intermediate | Advanced | Limiting
Tools and tool selection | Requirements, then Tools | Bakeoff Datasets | Budget for Bakeoffs | Tools, then Reqs.

• Requirements, then Tools: Both are well into iterative additions of functionality based on feature requests.
• Bakeoff Datasets: Periodically they look at cataloging tools from outside vendors but none really automate image tagging to a notable degree.
• Budget for Bakeoffs: N/A.
• Tools, then Requirements: Neither susceptible given the amount of custom code.


Normal taxonomy editor functionality requirements

Basic
• Hierarchy Browser
• Term Editing
• Standard and Custom Fields
• Standard and Custom Relations
• Data Typing and Restrictions
• Consistency Enforcement
• Flexible Reporting
• Flexible Importing?

Midrange
• UNICODE
• Multiple Vocabulary Support
• Inter-Vocabulary Relations
• Unique IDs
• ISO Codes not sufficient

Advanced
• Workflow
• Voting
• Change Request Mgmt.
• Stylistic rules enforcement
• Programmability


Staff hiring and training

Practice Area | Basic | Intermediate | Advanced | Limiting
Staff training and hiring | Librarian or IA Expertise; Search Analyst Role | Cross-Functional Taxonomy Creation | Cross-functional taxonomy maint.; SME Catalogers; Pre-hire Testing

• Librarian or IA expertise: Both seek this in their cataloging and taxonomy hires, but seek additional things as well.
• Search Analyst: Was goal for Getty at time of interview. Interviewee thought that would take Getty from a “7” to an “8” in terms of search sophistication.
• Cross-functional taxonomy creation: Not at time of interviews.
• Cross-functional taxonomy maint: Not at time of interviews.
• SME Catalogers: Yes, esp. Getty Images. Corbis had an art history emphasis, Getty looked for people with variety of backgrounds, esp. science, and photographers.
• Pre-hire testing: Getty did some of this with interns.


Data creation and QA

Practice Area | Basic | Intermediate | Advanced | Limiting
Data creation and QA | CM Introduced | ROT-Elimination; Semi-auto tagging | Quality Measures

• CM Introduced: Both use strong database systems for cataloging.
• ROT-Elimination: Image collections rarely removed unless licensing problems occur. Both have error detection and error correction processes.
• Semi-auto tagging: Both evaluate this technology periodically but neither has found it usable on images.
• Cross-functional taxonomy maint: Not at time of interviews.
• Quality measures: Both have quality control processes but neither mentioned analytic models.


Taxonomy testing methods

Method | Process | Who | Requires | Validation
Walk-thru | Show & explain | Taxonomist, SME, Team | Rough taxonomy; Approach | Appropriateness to task
Walk-thru | Check conformance to editorial rules | Taxonomist | Draft taxonomy; Editorial Rules | Consistent look and feel
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Users | Rough taxonomy; Tasks & Answers | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Users | Rough Taxonomy; UI Mockup; Search prototype | Reaction to taxonomy; Reaction to new interface; Reaction to search results
Tagging Samples | Tag sample content with taxonomy | Taxonomist, Team, Indexers | Sample content; Rough taxonomy (or better) | Content ‘fit’; Fills out content inventory; Training materials for people & algorithms; Basis for quantitative methods


Simple method: Closed Card Sort

• Tests how people think about content, good for exposing ambiguity.
• Example from alpha test of a grocery site:
  - 15 testers put each of 71 best-selling product types into one of 10 pre-defined categories
  - Categories where fewer than 14 of 15 testers put product into same category were flagged
• “Cocoa Drinks – Powder” is best categorized in both “Beverages” and “Grocery”.
• How to improve? Allow products in multiple categories. (Results are for minimum size = 4 votes)

% of Testers | Cumulative % of Products | With Poly-Hierarchy
15/15 | 54% | 69%
14/15 | 70% | 83%
13/15 | 77% | 93%
12/15 | 83% | 100%
11/15 | 85% | 100%
<11/15 | 100% | 100%
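Tallying a closed card sort comes down to counting, per product, how many testers agreed on the most popular category. The sketch below shows one way to do that; the votes are invented for illustration with 5 testers rather than 15 and are not the grocery site's data, and `flag_ambiguous` is a hypothetical helper name.

```python
from collections import Counter

# Invented example: each product maps to the category chosen by each of 5 testers.
SORT_RESULTS = {
    "Milk": ["Dairy", "Dairy", "Dairy", "Dairy", "Dairy"],
    "Cocoa Drinks - Powder": ["Beverages", "Grocery", "Beverages", "Grocery", "Beverages"],
    "Bread": ["Bakery", "Bakery", "Bakery", "Bakery", "Grocery"],
}

def flag_ambiguous(sort_results, min_agree):
    """Return products whose most popular category got fewer than min_agree votes.

    The value for each flagged product is its full vote breakdown, most popular first.
    """
    flagged = {}
    for product, votes in sort_results.items():
        breakdown = Counter(votes).most_common()
        top_count = breakdown[0][1]
        if top_count < min_agree:
            flagged[product] = breakdown
    return flagged

flagged = flag_ambiguous(SORT_RESULTS, min_agree=4)
```

A product whose breakdown shows two strong categories is a natural candidate for placement in both, which is the poly-hierarchy remedy the slide suggests.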


User interface survey: Which search UI is ‘better’?

Criteria
• User satisfaction
• Success completing tasks
• Confidence in results
• Fewer dead ends

Methodology
• Design tasks from specific to general
• Time performance
• Calculate success rates
• Survey subjective criteria
• Pay attention to survey hygiene:
  - Participant selection
  - Counterbalancing
  - T-scores

Source: Yee, Swearingen, Li, & Hearst


User interface survey — Results (1)

Which interface would you rather use for these tasks?

Task | Google-like Baseline | Faceted Category
Find images of roses | 15 | 16
Find all works from a certain period | 2 | 30
Find pictures by 2 artists in the same media | 1 | 29

Overall assessment | Google-like Baseline | Faceted Category
More useful for your usual tasks | 4 | 28
Easiest to use | 8 | 23
Most flexible | 6 | 24
More likely to result in dead-ends | 28 | 3
Helped you learn more | 1 | 31
Overall preference | 2 | 29

Source: Yee, Swearingen, Li, & Hearst
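One quick way to judge whether a split such as 2 versus 29 could be chance is an exact two-sided sign (binomial) test. This is a swapped-in illustration, not the analysis the cited study performed (the deck mentions T-scores), and it assumes each participant cast one independent vote.

```python
from math import comb

def sign_test_p(a, b):
    """Two-sided exact binomial (sign) test p-value for an a-versus-b preference split.

    Null hypothesis: each participant is equally likely to prefer either interface.
    """
    n = a + b
    k = min(a, b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_overall = sign_test_p(2, 29)    # "Overall preference" row
p_roses = sign_test_p(15, 16)     # "Find images of roses" row
```

Under these assumptions the overall preference split is far too lopsided to be chance, while the roses task, a simple keyword lookup, shows no detectable preference either way.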


User interface survey — Results (2)

[Figure: bar chart comparing the Faceted Category and Google-like Baseline interfaces, scale 0 to 9]

Source: Yee, Swearingen, Li, & Hearst


Document distribution—How evenly does it divide the content?

• Documents do not distribute uniformly across categories
• Zipf (1/x) distribution is expected behavior
• 80/20 rule in action (actually 70/20 rule)

[Figure: Measured v Expected Distribution of Top 10 Content Types in Library of Congress Database. X axis: Top 10 Content Types (Congresses, Biography, Periodicals, Maps, Fiction, Exhibitions, Juvenile literature, Bibliography, Statistics). Y axis: Number of Records (0 to 350,000). Callouts: “Leading candidate for splitting”, “Leading candidates for merging”.]
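The measured-versus-expected comparison can be sketched as follows, assuming the expected curve is a 1/rank Zipf distribution scaled to the same total. The category names and counts are invented, and the 1.5x ratio used to flag candidates is an arbitrary assumption rather than a threshold from the talk.

```python
def zipf_expected(total, n):
    """Expected counts for ranks 1..n under a 1/rank distribution with the given total."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return [total / (rank * harmonic) for rank in range(1, n + 1)]

def compare_to_zipf(counts, ratio=1.5):
    """Return (split_candidates, merge_candidates) relative to the Zipf expectation."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    expected = zipf_expected(sum(counts.values()), len(ranked))
    split, merge = [], []
    for (name, measured), exp in zip(ranked, expected):
        if measured > ratio * exp:
            split.append(name)      # far above the curve: category may be too broad
        elif measured * ratio < exp:
            merge.append(name)      # far below the curve: category may be too narrow
    return split, merge

# Invented counts for four content types.
COUNTS = {"Type A": 800, "Type B": 100, "Type C": 60, "Type D": 40}
split_candidates, merge_candidates = compare_to_zipf(COUNTS)
```

Flagged categories are only prompts for a taxonomist to look closer; a lopsided category can be perfectly legitimate if users search it that way.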


Document distribution— How evenly does it divide the content?

Methodology: 115 randomly selected URLs from corporate intranet search index were manually categorized. Inaccessible files and ‘junk’ were removed.

Results: Slightly more uniform than Zipf distribution. Above the curve is better than expected.

[Figure: Measured v Expected Intranet Content Type Distribution. X axis: Content Type (People, Groups & Places; News & Events; Manuals & Learning Materials; Operations & Internal Communications; Marketing & Sales; Regulations, Policies, Procedures & Templates; Papers & Presentations; Other & Unclassified; Programs, Proposals, Plans & Schedules). Y axis: # Documents (0 to 25).]


Document distribution: How does taxonomy “shape” match that of content?

• Background: Hierarchical taxonomies allow comparison of “fit” between content and taxonomy areas.
• Methodology:
  - 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)
  - Counts of terms and documents summed within taxonomy hierarchy
• Results:
  - Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)
  - Mismatches between term% and document% flagged

Term Group | % Terms | % Docs
Administrators | 7.8 | 15.8
Community Groups | 2.8 | 1.8
Counselors | 3.4 | 1.4
Federal Funds Recipients and Applicants | 9.5 | 34.4
Librarians | 2.8 | 1.1
News Media | 0.6 | 3.1
Other | 7.3 | 2.0
Parents and Families | 2.8 | 6.0
Policymakers | 4.5 | 11.5
Researchers | 2.2 | 3.6
School Support Staff | 2.2 | 0.2
Student Financial Aid Providers | 1.7 | 0.7
Students | 27.4 | 7.0
Teachers | 25.1 | 11.4

Source: Courtesy Keith Stubbs, US. Dept. of Ed.
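The mismatch check can be sketched directly from the table above. The slide does not say what threshold was used to flag a mismatch, so the factor of 3 below is purely an assumption chosen for illustration.

```python
# (term group, % of taxonomy terms, % of tagged documents), copied from the table above.
ROWS = [
    ("Administrators", 7.8, 15.8),
    ("Community Groups", 2.8, 1.8),
    ("Counselors", 3.4, 1.4),
    ("Federal Funds Recipients and Applicants", 9.5, 34.4),
    ("Librarians", 2.8, 1.1),
    ("News Media", 0.6, 3.1),
    ("Other", 7.3, 2.0),
    ("Parents and Families", 2.8, 6.0),
    ("Policymakers", 4.5, 11.5),
    ("Researchers", 2.2, 3.6),
    ("School Support Staff", 2.2, 0.2),
    ("Student Financial Aid Providers", 1.7, 0.7),
    ("Students", 27.4, 7.0),
    ("Teachers", 25.1, 11.4),
]

def flag_mismatches(rows, factor=3.0):
    """Split groups into those with far more content than terms, and the reverse.

    factor is an assumed threshold: a group is flagged when its share of documents
    is at least `factor` times, or at most 1/factor times, its share of terms.
    """
    content_heavy, term_heavy = [], []
    for group, pct_terms, pct_docs in rows:
        ratio = pct_docs / pct_terms
        if ratio >= factor:
            content_heavy.append(group)
        elif ratio <= 1.0 / factor:
            term_heavy.append(group)
    return content_heavy, term_heavy

content_heavy, term_heavy = flag_mismatches(ROWS)
```

Content-heavy groups may need finer-grained terms, while term-heavy groups may carry more vocabulary than the content justifies.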


Project Management

Practice Area | Basic | Intermediate | Advanced | Limiting
Project management | Project Plan; X-Functional Teams | Std. Proj. Methodol.; Multi-Year Plan; Communication Plan; SMT Business Manager, instead of IT Manager | Early Termination

• Project Plan: Both companies are in a mode where maintaining the cataloging, terminology, and search tools is ongoing enhancement. Neither company discussed project management.
• X-Functional Teams: Very little cross-functional involvement was discussed. Some input from sales and cataloging for taxonomy revisions.
• Std. Project Methodology: Not at time of interviews.
• Multi-year plan: Not at time of interviews.
• Communication Plan: Not discussed.
• SMT Business Manager: Not discussed.
• Early Termination: Not discussed.


Key Governance Aspects

• Roles and Responsibilities
  - Managers
  - Reviewers
• Policies
  - For naming
  - Required Fields
• Procedures
  - For reviewing and approving metadata placement
  - For acting on poor metadata application


Recommended Measure and Improve Mindset

• Measure: Determine current situation and what is wrong.
  - Too many documents in a category? Too many categories? People complaining about not finding material that is on the site? People asking for materials not on the site? Common searches without results?
• Decide: Decide how to change things to fix the problem.
  - Change navigation list? Add new categories? Add synonyms to search? Create new content?
• Confirm: Before rolling out changes, test them to make sure they will improve the problem.
  - Usability tests, Card sorts, Internal functionality tests, …
• Implement: Roll out the changes.
• Repeat: Monitor people’s behavior on the site as well as responding to reported problems.
  - Query log examination, Clicktrail examination, Google search result position, Stakeholder feedback, User surveys, Site analytics, etc.


Taxonomy team: Generic roles

Business Lead

Technical Specialist

Content Specialist

Taxonomy Specialist

Content Owners

Keeps team on track with larger business objectives.

Reality check on process change suggestions.

Balances cost/benefit issues to decide appropriate levels of effort.

Obtains needed resources if those on committee can’t accomplish a particular task.

Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.

Helps obtain data from various systems.

Committee’s liaison to content creators.

Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.

Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.

Makes edits to taxonomy, installs into system with aid of IT specialist.

Stakeholder Committee


Taxonomy governance environment

[Diagram. Labels: source vocabularies (ISO 3166-1, other external; ERP, other internal) and their custodians; Vocabulary Management System holding CVs and other controlled items; change requests & responses; notifications; published facets; consuming applications (intranet search, intranet navigation, Web CMS, DAM, archives, ERMS).]

1: External vocabularies change on their own schedule, with some advance notice.

2: Team decides when to update facets within the taxonomy.

3: Team adds value via mappings, translations, synonyms, training materials, etc.

4: Updated versions of facets are published to consuming applications.

CV (Controlled Vocabulary): the list of values for one facet in the taxonomy.
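Steps 2 and 4 above can be illustrated with a minimal Python sketch, assuming a CV is modeled as a versioned set of values for one facet. The class name, the `updated` and `publish` helpers, and the consumer names are hypothetical; they are not taken from any vocabulary management product.

```python
from dataclasses import dataclass, field


@dataclass
class ControlledVocabulary:
    """The list of values for one facet, plus a version number (hypothetical model)."""
    facet: str
    values: set[str] = field(default_factory=set)
    version: int = 1

    def updated(self, add=(), remove=()):
        """Step 2: return a new version; the published version is not edited in place."""
        new_values = (self.values | set(add)) - set(remove)
        return ControlledVocabulary(self.facet, new_values, self.version + 1)


def publish(cv, consumers):
    """Step 4: give every consuming application the same versioned snapshot."""
    snapshot = (cv.facet, cv.version, sorted(cv.values))
    return {name: snapshot for name in consumers}


countries_v1 = ControlledVocabulary("Country", {"FR", "DE"})
countries_v2 = countries_v1.updated(add=["ME"])  # team decides to take the change
published = publish(countries_v2, ["Web CMS", "DAM"])
```

Keeping each version immutable means a consuming application can always say which version of a facet its content was tagged against.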


Taxonomy maintenance processes

• Different organizations will have different change processes.

• Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.

• Organization 2: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.

• Organization 3: Analysts suggest changes, editors approve, copyeditors verify consistency.


Sample taxonomy maintenance workflow

[Flowchart with swimlanes for Analyst, Editor, Copywriter, and Sys Admin. Steps: suggest new name/category; review new name; copy edit new name; add to enterprise taxonomy. Two "Problem?" decision points (Yes/No) appear in the flow. Other labels: Taxonomy, Taxonomy Tool.]
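The workflow can be sketched in Python as a sequence of stages with a problem check at each one. The stage and role names come from the slide labels; which role owns which stage, and the rule that any problem stops the term and returns it to the analyst, are assumptions, since the extracted slide does not show where the "Yes" branches lead.

```python
# Stage/owner pairing is assumed from the order of the slide labels.
STAGES = [
    ("suggest", "Analyst"),
    ("review", "Editor"),
    ("copy_edit", "Copywriter"),
    ("add_to_taxonomy", "Sys Admin"),
]


def run_workflow(term, checks):
    """Walk a suggested term through the stages.

    `checks` maps a stage name to a function that returns a problem
    description (str) or None. Returns (accepted, history).
    """
    history = []
    for stage, owner in STAGES:
        check = checks.get(stage, lambda t: None)
        problem = check(term)
        history.append((stage, owner, problem))
        if problem is not None:
            # Assumed behavior: a "Yes" at "Problem?" sends the term back.
            return False, history
    return True, history


def title_case_check(term):
    """Example copy-edit rule (hypothetical): labels must be in title case."""
    return None if term.istitle() else "not title case"


accepted, history = run_workflow("Solar Panels", {"copy_edit": title_case_check})
rejected, rejected_history = run_workflow("solar panels", {"copy_edit": title_case_check})
```

The history list doubles as an audit trail of who looked at the term and why it was stopped.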


Where taxonomy change suggestions come from

[Diagram. Labels: End User; Firewall; Application UI; Application Logic; Content; Taxonomy; Tagging UI; Tagging Logic; Tagging Staff; Taxonomy Editor; Taxonomy Team.]

Sources of change suggestions:
• End user experience
• Query log analysis
• Staff notes ‘missing’ concepts
• Requests from other parts of the organization

Team Considerations
1. Business goals.
2. Changes in user experience.
3. Retagging cost.

Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New “best bets” content.
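Query log analysis, one of the sources above, can be sketched in Python as follows. This is a minimal illustration, assuming the log is a plain list of query strings and that a query counts as covered only when it exactly matches a taxonomy label or synonym after lowercasing. The function name, the `min_count` threshold, and the sample data are hypothetical.

```python
from collections import Counter


def missing_concepts(queries, taxonomy_terms, synonyms=None, min_count=2):
    """Return (query, count) pairs for frequent queries with no matching term.

    `synonyms` maps a synonym to its preferred term; only the keys are used here.
    """
    known = {t.lower() for t in taxonomy_terms}
    known |= {s.lower() for s in (synonyms or {})}
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return [
        (query, n)
        for query, n in counts.most_common()
        if query not in known and n >= min_count
    ]


log = ["Drone", "drone ", "aerial", "Sunset", "drone", "aerial"]
candidates = missing_concepts(
    log,
    taxonomy_terms=["Sunset", "Aerial photography"],
    synonyms={"aerial": "Aerial photography"},
)
```

The output is only a list of candidates for the taxonomy editor; the team considerations on the slide (business goals, user experience, retagging cost) still decide whether a term is added.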


Executive support and ROI

Practice area: Executive Support. Maturity columns: Basic, Intermediate, Advanced, Limiting.

Practices assessed: External Search ROI; SMT in separate silos; Intranet ROI Model; CEO knows Search ROI; Use It or Lose It Budgets.

• External Search ROI: Both Corbis and Getty Images have very clear and compelling ROI stories for external search.

• SMT in separate silos: Both Corbis and Getty Images have moved beyond this practice.

• Intranet ROI model: Not at time of interviews.

• CEO knows search ROI: Yes, both Corbis and Getty Images have CEOs who know the ROI story for external search, but there was no ROI analysis for the intranet at the time of the interviews.

• Use it or lose it budgets: Neither Corbis nor Getty Images discussed budget details.


Agenda

9:15 Metadata Definitions
9:30 Maturity Models
9:45 Metadata Maturity Model (ca. 2006)
10:15 Break
10:30 Stock Photo Business
10:40 Data Governance Practices in Stock Photo Agencies
11:40 Summary
11:45 Questions
12:00 Adjourn


Recommended Reading

CMMI: http://chrguibert.free.fr/cmmi (The official site is http://www.sei.cmu.edu/cmmi/, but that is not the most comprehensible.)

Joel Test: http://www.joelonsoftware.com/articles/fog0000000043.html

EIA Roadmap: http://www.louisrosenfeld.com/presentations/031013-KMintranets.ppt

Enterprise Search Report: http://www.cmswatch.com/EntSearch/


Fun Questions

The animals are divided into:
(a) belonging to the emperor,
(b) embalmed,
(c) tame,
(d) sucking pigs,
(e) sirens,
(f) fabulous,
(g) stray dogs,
(h) included in the present classification,
(i) frenzied,
(j) innumerable,
(k) drawn with a very fine camelhair brush,
(l) et cetera,
(m) having just broken the water pitcher,
(n) that from a long way off look like flies.

Jorge Luis Borges, "The Analytical Language of John Wilkins." Works in 3 volumes (in Russian). St. Petersburg, "Polaris", 1994. V. 2: 87.

This was created to be as bad a classification as possible. What makes it so bad?


Contact Info

Ron Daniel, Jr.

925-368-8371

[email protected]