Order Out of Chaos: Creating and Valuing Taxonomies
Information Highways Conference
e-Content Institute
April 6, 2005
Information Highways, April 6, 2005 © Denise Bruno 2
Agenda
  Fun!
  Controlled vocabularies
  Value of taxonomies
  Types of taxonomies
  Taxonomy development
“The value of knowledge is largely tied to the way in which that knowledge is organized. If you can’t find it, it’s not likely to be of much use to you.”
Marc Rapport
Unfolding Knowledge
Knowledge Management E-zine
Exercise
Put the slips in some sort of order so that they are of use to you.
Taxonomies, Metadata and Classification
8-week course
Professional Learning Centre
Faculty of Information Studies
University of Toronto
http://plc.fis.utoronto.ca/coursedescription.asp?courseid=165
Bonus: Intranet Taxonomy Resource Centre
Controlled Vocabularies
Definitions: Controlled Vocabulary
  An indexing language, i.e., a standardized set of terms and phrases authorized for use in an indexing system to describe a subject area or information domain.
  A collection of preferred and non-preferred terms that are used to assist in more precise retrieval of content.
Purpose
  Translation
    From the natural language of authors and users into a vocabulary used for indexing and retrieval
  Consistency
    In the assignment of index terms
  Indication of Relationships
    Semantic relationships among terms
  Retrieval
    A searching aid in the retrieval of documents
(source: ANSI/NISO Z39.19-2003, Guidelines for the Construction, Format, and Management of Monolingual Thesauri)
Types of Controlled Vocabularies
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Pick List
  A list of words
  Most basic of controlled vocabularies
  No synonyms identified
  No guidance provided
Synonym Ring
  A list of words to be treated as equivalent in meaning for the purposes of searching
  Every term in the ring is synonymous with the others
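A minimal sketch of how a synonym ring might drive search: a query on any member of a ring is expanded to every member. The rings and the function name `expand_query` are hypothetical, chosen only for illustration.

```python
# Hypothetical synonym rings; the terms are illustrative, not from the slides.
SYNONYM_RINGS = [
    {"car", "automobile", "auto"},
    {"film", "movie", "motion picture"},
]


def expand_query(term):
    """Return every term that should be searched when `term` is entered."""
    normalized = term.strip().lower()
    for ring in SYNONYM_RINGS:
        if normalized in ring:
            # Every member of a ring is equivalent; none is preferred.
            return set(ring)
    # The term belongs to no ring, so it is searched unchanged.
    return {normalized}
```

Note that no term is marked as preferred here, which is the difference between a synonym ring and the more controlled vocabulary types that follow.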
Synonym Ring
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Authority File
  Provides higher level of control than a synonym ring
  Designates one term as being preferred
  Includes references from synonyms, abbreviations, and acronyms to the preferred term
  AKA a subject list
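As a rough sketch, an authority file can be modelled as a lookup from each variant (synonym, abbreviation, acronym) to the one preferred term. The entries and the function name `preferred_term` below are assumptions made for the example.

```python
# Hypothetical authority file: each variant points to one preferred term.
AUTHORITY_FILE = {
    "un": "United Nations",
    "u.n.": "United Nations",
    "united nations organization": "United Nations",
}

# The preferred terms themselves, indexed by lowercase form for lookup.
PREFERRED = {term.lower(): term for term in AUTHORITY_FILE.values()}


def preferred_term(entry):
    """Map a user's entry to the preferred term, or None if it is unknown."""
    key = entry.strip().lower()
    if key in AUTHORITY_FILE:
        return AUTHORITY_FILE[key]
    # An entry that is already the preferred term maps to itself.
    return PREFERRED.get(key)
```

An indexer would store only the value returned by `preferred_term`, so all documents on the concept end up under one heading.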
Authority File
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Taxonomy
  Defines hierarchical relationships between the terms
  Goes from the general to the specific
  Strict taxonomy is a genus/species relationship, i.e. an “is a” relationship
    e.g. russet “is a” type of potato
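The “is a” relationship can be sketched as child-to-parent links that are walked from the specific term up to the general one. Only "russet is a potato" comes from the slide; the other terms and the function names are illustrative assumptions.

```python
# Child -> parent links. Only "russet is a potato" comes from the slide;
# the other terms are illustrative assumptions.
PARENT = {
    "russet": "potato",
    "yukon gold": "potato",
    "potato": "vegetable",
    "carrot": "vegetable",
}


def ancestors(term):
    """Walk from a specific term up to the most general one."""
    path = []
    while term in PARENT:
        term = PARENT[term]
        path.append(term)
    return path


def is_a(term, category):
    """True if `term` sits anywhere below `category` in the hierarchy."""
    return category in ancestors(term)
```

Because each term has exactly one parent, this models a strict hierarchy: a russet is a potato, and therefore also a vegetable.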
Taxonomy
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Taxonomy
  “Taxis” – arrange, put in order
  “Onoma” – name
  Is the end result of the science, laws, or principles of classification
Taxonomy (from Greek “taxis” meaning arrangement or division and “nomos” meaning law) is the science of classification according to a pre-determined system, with the resulting catalog used to provide a conceptual framework for discussion, analysis, or information retrieval.
In theory, the development of a good taxonomy takes into account the importance of separating elements of a group (“taxon”) into subgroups (“taxa”) that are mutually exclusive, unambiguous, and taken together, include all possibilities.
In practice, a good taxonomy should be simple, easy to remember, and easy to use.
(source: www.whatis.com)
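The two testable properties in the definition above, subgroups that are mutually exclusive and that together include all possibilities, can be checked mechanically. This is a sketch under the assumption that each subgroup is stored as a set of items; the function name `check_partition` is hypothetical.

```python
def check_partition(items, subgroups):
    """Return (overlapping, missing) for a proposed set of subgroups.

    overlapping: items placed in more than one subgroup
                 (the subgroups are not mutually exclusive)
    missing:     items placed in no subgroup
                 (the subgroups do not include all possibilities)
    """
    seen = set()
    overlapping = set()
    for members in subgroups.values():
        overlapping |= seen & members
        seen |= members
    missing = set(items) - seen
    return overlapping, missing
```

Two empty sets mean the draft grouping passes both checks for the items tested. Whether the labels are unambiguous and easy to use still needs human review.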
Taxonomy
“Structures that provide a way of classifying things – living organisms, products, books – into a series of hierarchical groups to make them easier to identify, study, or locate. Taxonomies consist of two parts – structures and applications. Structures consist of the categories (or terms) themselves and the relationships that link them together. Applications are the navigation tools available to help users find information.”
(source: Jean Graef, Montague Institute)
Thesaurus
  A type of controlled vocabulary that shows the following relationships among terms:
    hierarchical (e.g. parent-child BT, NT)
    associative (e.g. related RT)
    equivalent (e.g. synonymous U, UF)
  Also includes scope notes (definitions)
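One possible way to hold a thesaurus entry in code is a record with one field per relationship type. The class name, field names, and sample terms are assumptions for illustration; BT, NT, RT, and UF are the abbreviations from the slide, and SN is the conventional abbreviation for a scope note.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    term: str
    scope_note: str = ""                          # SN: definition or usage note
    used_for: list = field(default_factory=list)  # UF: non-preferred synonyms
    broader: list = field(default_factory=list)   # BT: parent terms
    narrower: list = field(default_factory=list)  # NT: child terms
    related: list = field(default_factory=list)   # RT: associated terms


def format_entry(entry):
    """Render an entry in the familiar printed-thesaurus layout."""
    lines = [entry.term]
    if entry.scope_note:
        lines.append(f"  SN {entry.scope_note}")
    for label, terms in (("UF", entry.used_for), ("BT", entry.broader),
                         ("NT", entry.narrower), ("RT", entry.related)):
        lines.extend(f"  {label} {t}" for t in terms)
    return "\n".join(lines)


# Illustrative entry; the terms are made up for the example.
potatoes = ThesaurusEntry(
    term="potatoes",
    scope_note="Edible tubers",
    used_for=["spuds"],
    broader=["vegetables"],
    narrower=["russet potatoes"],
    related=["potato farming"],
)
print(format_entry(potatoes))
```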
Thesaurus
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
User Warrant
  Justification for the representation of a concept in an indexing language or for the selection of a preferred term because of frequent requests for information on the concept or free-text searches on the term by users of an information storage and retrieval system
(source: ANSI/NISO Z39.19-2003, Guidelines for the Construction, Format, and Management of Monolingual Thesauri)
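As a rough illustration of user warrant, the preferred term among several candidates could be picked by counting how often users actually search for each one. The search log and the function name `choose_preferred` are hypothetical, and a real project would weigh this evidence alongside other sources.

```python
from collections import Counter


def choose_preferred(candidates, search_log):
    """Pick the candidate term that users search for most often.

    On a tie, the candidate listed first wins.
    """
    counts = Counter(query.strip().lower() for query in search_log)
    return max(candidates, key=lambda term: counts[term.lower()])
```

For example, given the candidates "automobile" and "car" and a log in which "car" appears more often, the function selects "car" as the preferred term.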
Classification
Classification refers to the systematic grouping of like things or objects into classes or categories according to some shared quality or characteristic.
Implies the separation of things according to their degree of unlikeness.
The term “classification” can refer either to the process of defining the categories and structure of a classification scheme or to the process of assigning documents to their appropriate categories.
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Classification Scheme
A scheme for arranging a collection of information in a hierarchical order using a controlled vocabulary to express the categories.
Frequently referred to as a “taxonomy”.
Also known as a file plan.
(source: U of T, Professional Learning Centre, Intranet Taxonomy Resource Centre)
Metadata
  Data about data
  “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.”
(source: ITRC)
Important
  A taxonomy describes the domain (e.g. subject) being used for classification, but is not itself metadata
    However, it can be used in metadata
  Does not address naming conventions for individual files (records)
    Separate policy/procedure
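To illustrate the distinction, the sketch below keeps the taxonomy (the set of authorized terms) separate from a metadata record, and uses the taxonomy only to check the record's subject field. The field names, terms, and function name are assumptions for the example.

```python
# Hypothetical taxonomy terms and metadata record, for illustration only.
TAXONOMY_TERMS = {"Financial Management", "Financial Reporting and Auditing"}

record = {
    "title": "Audit summary",
    "creator": "Finance Department",
    "subject": ["Financial Management", "Budgets"],
}


def unauthorized_subjects(record, taxonomy_terms):
    """Return subject values that are not terms in the taxonomy."""
    return [s for s in record.get("subject", []) if s not in taxonomy_terms]
```

Here "Budgets" would be flagged because it is not an authorized term. The title and file name are untouched, since naming conventions belong to a separate policy.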
Value of Taxonomies
“…the primary motives for developing an internal taxonomy were to improve information access and to save time by streamlining the search process.”
Taxonomies for Business: Access and Connectivity in a Wired World, TFPL Ltd.
Information Environment
  Paper
  Facsimiles
  Electronic docs
  Email
  Chat boards
  White boards
  Legacy databases
  Instant messaging
  Intranet materials
  Internet materials
  Workflow
  Video
  Audio
  Microforms
Information Environment
  No standards for info design or else too vague or incapable of being enforced
  Separate offices/divisions, many with own IT shops, build separate info systems
  Cultures of competitiveness or mistrust
  Legacy systems difficult to change
  Managers still looking for silver bullet
Value of Taxonomies
  Identification – Controls the glut of information by filtering, categorizing and labeling information
  Navigation – Reduces the likelihood of becoming lost by moving along logical paths; facilitates browsing
  Discovery – Aids the serendipitous find, new associations via inference
  Searching – Provides context, reduces search time, improves search engine performance
  Delivery – Improves retrieval, for both browsing and free text searches
Types of Taxonomies
Structural Model - Hierarchies
  Generic (Genus/Species)
    “is-a” kind of relationship
    Mutual exclusivity
    Strictest of hierarchies
(source: Barbara Kwasnik, The Role of Classification in Knowledge Representation and Discovery, Library Trends, Summer 1999, pp.22-47)
Example (from MeSH):
  Eye Diseases
    Conjunctival Diseases
      Conjunctival Neoplasm
      Conjunctivitis
        Keratoconjunctivitis
    Corneal Diseases
Structural Model - Hierarchies
  Whole-Part
    Does not assume genus/species
    One-way flow of information
    Websites/directories
Example:
  Automobile
    Body
    Engine
      Block
      Pistons
      Valves
    Interior
      Upholstery
Structural Model - Hierarchies
  Polyhierarchical
    Concepts belong to more than one category
Example:
  Musical Instruments
    Stringed Instruments
      Pianos
    Percussion Instruments
      Pianos
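A polyhierarchy can be sketched by letting each term list several parents instead of one. The piano example is taken from the slide; the dictionary layout and the function name `paths_to_root` are assumptions.

```python
# Each term maps to a list of parents, so a term may have more than one.
PARENTS = {
    "Pianos": ["Stringed Instruments", "Percussion Instruments"],
    "Stringed Instruments": ["Musical Instruments"],
    "Percussion Instruments": ["Musical Instruments"],
}


def paths_to_root(term):
    """Return every path from the top of the hierarchy down to `term`."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]  # a term with no parents is a top-level category
    return [path + [term]
            for parent in parents
            for path in paths_to_root(parent)]


for path in paths_to_root("Pianos"):
    print(" > ".join(path))
```

A user browsing from either branch reaches the same "Pianos" category, which is the practical benefit of allowing more than one parent.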
Emphasis of Taxonomy
  Department
  Subject/Topic
    For a discrete body of knowledge
    Familiar to most users
  Product/Services
    Internal or external focus
  Audience
    User-centric
  Geography/Location
Emphasis of Taxonomy
  Function
    Functions represent the major responsibilities that are managed by the organization to fulfill its goals
    Source of information
      Government of Canada, Information Management Services, BASCS (Business Activity Structure Classification System)
      (http://www.collectionscanada.ca/information-management/0630_e.html)
Function Taxonomy Example
Levels: Collection > Part > Section > Primary > Secondary
  Collection 2: ABC Company Management
    Part 3: Financial Management
      Section 05: Financial Reporting and Auditing
        Primary 03: Audit Working Papers (2-3-05-03)
          Secondary 01: Audit Confirmations (2-3-05-03-01)
Whole-Part Example: Function-based
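The classification numbers in the example above appear to be built by joining the number at each level with hyphens. The sketch below reproduces that pattern; the level names and numbers come from the slide, while the function names and data layout are assumptions and do not describe how BASCS itself is implemented.

```python
LEVELS = ("Collection", "Part", "Section", "Primary", "Secondary")

# (level, number, label) for each step of the example on the slide.
PATH = [
    ("Collection", "2", "ABC Company Management"),
    ("Part", "3", "Financial Management"),
    ("Section", "05", "Financial Reporting and Auditing"),
    ("Primary", "03", "Audit Working Papers"),
    ("Secondary", "01", "Audit Confirmations"),
]


def build_code(path, depth=None):
    """Join the level numbers, optionally stopping at a given depth."""
    return "-".join(number for _, number, _ in path[:depth])


def parse_code(code):
    """Split a code back into its level numbers, keyed by level name."""
    return dict(zip(LEVELS, code.split("-")))


print(build_code(PATH, depth=4))  # code for Audit Working Papers
print(build_code(PATH))           # code for Audit Confirmations
```

Numbers are kept as strings so that leading zeros such as "05" survive a round trip through `parse_code`.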
“Though figuring out where to start can be frustrating, a good taxonomy is recognized as a central part of a knowledge management system.”
Thomas Trimmer
President, GrapeVine Technologies
Taxonomy Development
High-level Overview
  Domain and Purpose
  Rules
  Data Gathering
  Develop Draft Taxonomy
  Consult & Test
  Refine & Finalize
  Document
  Train & Educate Users
  Ensure Continued Development
IMPORTANT!
Project Process
There is no “end”. A taxonomy is never “finished”.
Denise Bruno
Associate
CONDAR Consulting
[email protected]
905-642-5596