Source: dlis.du.ac.in/eresources/controlled...
CONTROLLED VOCABULARY ONE THING LEADS TO ANOTHER
Content Layout
Part One
1. Information Storage and Retrieval
2. Indexing
Part Two
1. Introduction (Vocabulary)
2. Controlled Vocabulary
3. Purpose of Controlled Vocabulary
4. With Controlled Vocabulary / Without Controlled Vocabulary
5. Characteristics of Controlled Vocabulary
6. Controlled Vocabulary Category and Types
7. Flat Controlled Vocabulary
8. Multi-Level Controlled Vocabulary
9. Relational Controlled Vocabulary
Part Three
1. Controlled Vocabulary Vs Natural Language
2. Conclusion
3. References
PART ONE
Information Retrieval System
1. Introduction:
The term information retrieval system was coined in 1952 and gained popularity in the research
community from 1961 onwards. Since then, the organizing functions of information retrieval have been
visible in libraries, which were no longer just storehouses of books but also places where cataloguing
and indexing were done. Subsequently, with the introduction of computers, a number of databases
appeared containing bibliographic details of documents, often coupled with documents, keywords,
etc. Consequently, the concept of information retrieval came to mean the retrieval of bibliographic
information from stored document databases. Lancaster comments that an information retrieval system
does not inform the user about the subject of their query. It merely informs them of the existence
(or non-existence) and whereabouts of documents relating to their request.
2. IRS subsystems:
F.W. Lancaster stated that an information retrieval system comprises six major subsystems:
a) The document subsystem
b) The indexing subsystem
c) The vocabulary subsystem
d) The searching subsystem
e) The user-system interface
f) The matching subsystem
3. The broad outline of an information retrieval system (figure):
Information source -> Analysis and representation -> Organized information
User query -> Analysis -> Analyzed queries (search statements)
Organized information + Analyzed queries -> Matching -> Retrieved information
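The outline above can be sketched in a few lines of code. This is only an illustration under simple assumptions: the "analysis" step is a crude word split, the organized information is an inverted index, and matching requires every query term to be present. The function names and sample documents are hypothetical and are not part of the original slides.

```python
from collections import defaultdict


def analyze(text):
    """Reduce text to a set of lowercase terms (a deliberately crude analysis step)."""
    words = (word.strip(".,;:()").lower() for word in text.split())
    return {word for word in words if word}


def build_index(documents):
    """Organized information: map each term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def match(index, query):
    """Matching: return the ids of documents that contain every analyzed query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = None
    for term in terms:
        postings = index.get(term, set())
        result = set(postings) if result is None else result & postings
    return result


# Hypothetical sample collection.
documents = {
    "d1": "Indexing and retrieval of library documents",
    "d2": "Thesaurus construction for subject indexing",
    "d3": "Library catalogue design",
}
index = build_index(documents)
print(sorted(match(index, "library indexing")))
```

Note that the sketch only reports which documents exist for a request, in line with Lancaster's remark that such a system informs the user of the existence and whereabouts of documents rather than answering the query itself.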
Indexing
2. Indexing in brief:
One of the major functions of an information retrieval system is to match the content of
documents with user queries. Thus the content of each input document in the collection has to be
analyzed and represented in such a way that it becomes convenient for matching. In other words,
the system personnel have to prepare a surrogate for every document, and all such surrogates must be
maintained in an organized manner. The process of constructing document surrogates by assigning
identifiers to text items is known as indexing. When the task of indexing is based on the conceptual
analysis of the subject of a document, it is called subject indexing.
In subject classification, the basic objective is to arrange documents according to their
subject contents.
In subject indexing, the basic objective is to match the content of documents with user
queries, and so the product of the conceptual analysis of the subject is represented in a natural
language form. A number of systems, such as Chain Indexing, PRECIS, POPSI, and Relational Indexing,
have been developed over the years for preparing subject index entries of documents. The basic problem
in subject indexing is the choice of appropriate keywords through which the index entries are to be
represented. The indexer prefers keywords that not only represent the subject clearly but are also
likely to be chosen by a user looking for that subject.
In order to standardize the task of choosing appropriate keywords for generating index
entries, a number of vocabulary control devices have been developed. Examples of such devices include
the Thesaurus, Classaurus, and Thesaurofacet. These tools help the indexer choose the most
appropriate terms to represent the subject at the indexing stage, and also help the user pick the
most appropriate terms while formulating a query. However, all these tools and techniques depend on
the intellectual capabilities of the human indexer, which makes them inefficient in many situations.
To avoid total dependency on human intellect, researchers have attempted to automate the whole
process of subject indexing and classification.
PART TWO
Vocabulary
1. Introduction:
A vocabulary is a set of terms (words, codes, etc.) that are used in a specific community.
Vocabularies provide a mechanism for communication, be it written, oral or electronic,
because the meanings of the terms are known and agreed upon by the community members.
When a vocabulary is formally managed, it becomes a controlled vocabulary. In this case,
"managed" means the terms are stored and maintained using agreed-upon procedures.
Procedures should exist for adding terms, modifying terms and, more rarely, deprecating terms
from a controlled vocabulary.
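A minimal sketch of what "managed" could look like in practice is given below. It assumes a vocabulary held in memory with one procedure each for adding, modifying and deprecating terms. The class name, method names and sample term are hypothetical and chosen only for illustration.

```python
class ControlledVocabulary:
    """A toy managed vocabulary: terms change only through explicit procedures."""

    def __init__(self):
        # term -> {"definition": str, "deprecated": bool}
        self._terms = {}

    def add_term(self, term, definition):
        """Add a new term; adding an existing term is rejected."""
        if term in self._terms:
            raise ValueError(f"term already exists: {term}")
        self._terms[term] = {"definition": definition, "deprecated": False}

    def modify_term(self, term, definition):
        """Replace the definition of an existing term."""
        self._require(term)
        self._terms[term]["definition"] = definition

    def deprecate_term(self, term):
        """Mark a term as deprecated; it is kept on record but no longer permitted."""
        self._require(term)
        self._terms[term]["deprecated"] = True

    def is_permitted(self, term):
        """A term is permitted if it exists and has not been deprecated."""
        entry = self._terms.get(term)
        return entry is not None and not entry["deprecated"]

    def _require(self, term):
        if term not in self._terms:
            raise KeyError(f"unknown term: {term}")


cv = ControlledVocabulary()
cv.add_term("automobiles", "Self-propelled road vehicles")
print(cv.is_permitted("automobiles"))
cv.deprecate_term("automobiles")
print(cv.is_permitted("automobiles"))
```

Deprecated terms are flagged rather than deleted in this sketch, so that records tagged with an old term can still be interpreted.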
Controlled Vocabulary
2. What is Controlled Vocabulary?
A CV is a carefully selected list of words and phrases that are used to tag units of
information so that they may be more easily retrieved by a search. The terms are chosen
and organized by trained professionals (including librarians and information scientists) who
possess expertise in the subject area. CV terms can accurately describe what a given document
is actually about, even if the terms themselves do not occur within the text. Fully developed
controlled vocabulary systems include LCSH, the Sears List, thesauri, etc.
In other words a controlled vocabulary is a collection of terms that are:
a) Accepted: The term must adhere to community practices.
b) Defined: The terms are precisely characterized. Typically, this means the terms have rigorous
definitions.
c) Managed: In general, there will be a body of experts that create and maintain the controlled
vocabulary. The controlled vocabulary maintenance will involve periodic review, addition of
new terms, modification of terms, and occasionally deprecation of terms.
For example, in LCSH (a subject heading system that uses a controlled vocabulary),
authorized terms (subject headings in this case) have to be chosen to settle choices between
variant spellings of the same word, between scientific and popular terms, and between
synonyms, among other issues.
Purpose of Controlled Vocabulary
3. Controlled vocabularies can serve several different purposes:
a) For example, a controlled vocabulary might help users find data (also known as a "discovery vocabulary"), or
b) Assist in the interpretation of data (also known as a "usage vocabulary").
c) The controlled vocabulary might provide human-understandable meaning (also known as a "semantic vocabulary") or
d) Machine-readable format information (also known as a "syntactic vocabulary").
Controlled vocabularies provide these abilities by:
a) Establishing the permissible terms to be used;
b) Maintaining the proper and agreed-upon spelling of the terms;
c) Clarifying terms for those who are new to the community; and
d) Eliminating the use of arbitrary terms that can cause inconsistencies and confusion.
From the above we can say that a CV ensures consistency in indexing, tagging or categorizing, and guides the user to where the desired information is.
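The abilities listed above (permissible terms, agreed spelling, no arbitrary terms) can be illustrated with a small tag checker. This is a sketch under stated assumptions: the list of permitted terms is hypothetical, and spelling suggestions come from Python's standard `difflib.get_close_matches`, which is just one convenient way to find near matches.

```python
import difflib

# Hypothetical list of permissible terms for illustration only.
PERMITTED_TERMS = ["automobiles", "bicycles", "motorcycles"]


def check_term(term, permitted=PERMITTED_TERMS):
    """Accept a permitted term, suggest the closest permitted spelling, or reject."""
    normalized = term.strip().lower()
    if normalized in permitted:
        return ("accepted", normalized)
    # n=1 keeps only the single best candidate above difflib's default cutoff.
    suggestions = difflib.get_close_matches(normalized, permitted, n=1)
    if suggestions:
        return ("suggest", suggestions[0])
    return ("rejected", None)


print(check_term("Automobiles "))
print(check_term("automobles"))
print(check_term("xyz"))
```

An indexing or tagging tool could run such a check before saving a record, so that misspelt or arbitrary terms never enter the index.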
With controlled and without controlled Vocabulary
4. Without Control
Users are:
a) incorrectly utilizing search terms
b) failing to find significant resources
c) suffering from information overload
d) doing almost as well using Alta Vista
Creators are:
a) cataloguing inconsistently
b) unable to convey hierarchical concepts (Scotland is in United Kingdom is in Europe is in ...)
c) perpetuating localised terminology
d) unable to assess, let alone undertake, integration projects.
5. With Control
Users might:
a) gain more effective access to a resource
b) gain far more effective access across resources
c) reduce the number of ‘false hits’
d) find what they are looking for
e) even learn to think and express themselves in a structured manner.
Creators might:
a) produce more valuable resources
b) convey complex semantic and structural concepts
c) move towards disciplinary, national, international or global terminologies
d) effectively integrate both new and existing resources.
Characteristics of Controlled Vocabulary
6. Characteristics of CV:
The most important characteristic of a CV is relationship. The terms in a CV are related in certain
ways:
Equivalence Relationship: The most basic term relationship is synonymy, that is, having the same
meaning. It is important to note that context matters in determining synonyms. For example,
you are probably aware of certain categories or items on your site that might go by multiple names.
You realize that if you said “automobiles” on your homepage and “cars” on the next page, users might
get confused. Users will start to wonder if there is a difference between the two terms. Instead you
choose “automobiles” and don’t use “cars” at all. In this case “automobiles” is the term you prefer to
use throughout your site. We call this the “preferred term.” “Cars” is a variant term, a different word
representing the same concept.
There are many examples of the situations that alternate terms cover. Here are a few:
a) Synonyms (two words with the same meaning, like “jeans” and “dungarees”)
b) Homonyms (words that sound the same, but have different meanings, like “bank” the financial institution and “bank” the side of a stream or river)
c) Common misspellings
d) Changes in content (e.g., countries that change their name or have multiple spellings)
e) Identifying “Best Bets” or the most popular pages associated with a certain term (http://www.BBC.com is great at this)
f) Connecting a woman’s married name to her maiden name
g) Connecting abbreviations to the full word (e.g., NY and New York, the chemical symbol Si with the element Silicon)
Types of Equivalence Relationship:
There are two types of synonym equivalence lists: synonym rings and authority files. Synonym rings are generally
used behind the scenes in searching, as a way to connect the various terms for a concept. A synonym ring can be used
to say, “when someone searches for ‘Si,’ give them all documents with either ‘Si’ or ‘Silicon.’” However, what happens
when you want to display one of these terms in your navigation? Then you need to pick one to be your preferred term,
and you have an authority file.
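The difference between the two lists can be sketched as follows. In this illustration a synonym ring is an unordered set in which no term is privileged, while an authority file maps one preferred term to its variants. The data structures, function names and sample documents are assumptions made for the example.

```python
# A synonym ring: all members are equal; used only to widen a search.
SYNONYM_RINGS = [{"Si", "Silicon"}]

# An authority file: one preferred term plus its variant terms.
AUTHORITY_FILE = {"Silicon": {"Si"}}

# Hypothetical tagged documents.
DOCUMENTS = {"doc1": {"Si"}, "doc2": {"Silicon"}, "doc3": {"Germanium"}}


def expand_query(term, rings=SYNONYM_RINGS):
    """Return every term in the same ring as the query term (or just the term itself)."""
    for ring in rings:
        if term in ring:
            return set(ring)
    return {term}


def search(term, documents=DOCUMENTS):
    """Find documents tagged with the term or any of its ring members."""
    wanted = expand_query(term)
    return sorted(doc_id for doc_id, tags in documents.items() if tags & wanted)


def preferred_term(term, authority=AUTHORITY_FILE):
    """Return the preferred term to display, or None if the term is not controlled."""
    for preferred, variants in authority.items():
        if term == preferred or term in variants:
            return preferred
    return None


print(search("Si"))
print(preferred_term("Si"))
```

The ring is enough for retrieval, but only the authority file can answer the question of which label to show in navigation.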
Hierarchical Relationship: Terms display a hierarchical relationship when one term is broader in meaning than its child
terms (which have narrower meanings). Pairs of terms are represented in their superordinate or subordinate status, the
superordinate term representing the whole and the subordinate term representing a member or part. The superordinate
term is marked by BT and the subordinate term by NT.
For example:
CAPITAL MARKETS
BT Financial markets
FINANCIAL MARKETS
NT Capital markets
The standard CV notations used to express hierarchical relationships are NT (narrower term) and BT (broader term). Using this notation, the term “Women’s Pants” would be expressed like this:
Women’s Pants
BT Pants
NT Casual Pants
NT Dress Pants
NT Sports Pants
There is a lot you can do with this hierarchical arrangement. It can help you formulate your homepage navigation. It could improve your searching and browsing. It can help users broaden and narrow their search results quickly by showing them where each set of results fits into the site’s hierarchy.
Here “Pants” is the broader term, and the kinds of pants refer to subsets of the whole universe of pants.
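One way to picture how BT and NT support broadening and narrowing is to store only the BT statements and derive the NT side from them. The sketch below does this for the pants example. The dictionary layout and function names are illustrative assumptions, and the hierarchy is assumed to contain no cycles.

```python
from collections import defaultdict

# Each term points to its broader term (BT), following the example above.
BROADER = {
    "Women's Pants": "Pants",
    "Casual Pants": "Women's Pants",
    "Dress Pants": "Women's Pants",
    "Sports Pants": "Women's Pants",
}


def narrower_terms(broader):
    """Derive the reciprocal NT statements from the BT statements."""
    narrower = defaultdict(list)
    for term, parent in broader.items():
        narrower[parent].append(term)
    return dict(narrower)


def broaden(term, broader=BROADER):
    """List the successively broader terms above a term, nearest first."""
    path = []
    while term in broader:
        term = broader[term]
        path.append(term)
    return path


def narrow(term, narrower):
    """Collect every term below a term, at any depth."""
    found = []
    stack = list(narrower.get(term, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(narrower.get(current, []))
    return found


NARROWER = narrower_terms(BROADER)
print(broaden("Dress Pants"))
print(sorted(narrow("Pants", NARROWER)))
```

A search interface could use `broaden` to offer a wider search when there are too few hits and `narrow` to include all subordinate terms in a query.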
Associative Relationship:
An associative relationship is a relationship between terms that is neither hierarchical nor
equivalence, yet the terms are mentally associated to such an extent that the link between them should be made
explicit in the controlled vocabulary, where it reveals alternative terms that could be used for indexing or retrieval.
It is difficult to define this relationship precisely, so some guidelines are followed to determine whether it
holds between a pair of terms.
Associative terms are divided into two categories:
i. Terms belonging to the same category, for example sibling terms with overlapping meanings, such as ‘ships’ and
‘boats’.
ii. Terms belonging to different categories, where one term is strongly implied when the other is used in indexing,
for example an action and the product of that action, such as programming/software.
Controlled Vocabulary Category and Types
7. CV Types and Category:
To many people, the English language is a well-known vocabulary. We have many ways of
representing the terms in the English language. For example, if we want to figure out what a
specific word means we might consult a glossary; if we want to know the origin of a term we
might consult a dictionary; and if we want to know how the term relates to other terms we
might consult a thesaurus. We also need to recognize that the meaning of terms may change
through time. Generations use terms in different ways (cool in one generation means a low
temperature, while cool in another is a positive adjective).
To enable formal management, a controlled vocabulary can be organized in several ways.
There are three broad categories of controlled vocabularies:
Controlled Vocabulary Category
1. Flat Controlled Vocabulary
All flat vocabularies contain a label and a value. Some flat vocabularies build upon this
foundation by adding a definition or additional information about each value.
No relationships are established, no hierarchies are set up, and no complicated matrices are
necessary.
2. Multi-Level Controlled Vocabulary
A multilevel vocabulary is essentially a way to group terms into classes with hierarchy.
A classification tells more about the terms by placing them into well thought-out
subcategories.
Think of a classification as a tree with a trunk, limbs, branches, and leaves. If you look
at an individual leaf on the tree, you can backtrack to the branch, to the limb, and
eventually to the trunk.
In a multilevel vocabulary, you can examine in which subcategory a term belongs, and
you can examine the relationships between subcategories as well.
3. Relational Controlled Vocabulary
Relational vocabularies, also called relationship lists, contain a mechanism to connect
terms. The relations are described by various standards and include broader
than/narrower than, used for, and related. In other words, a relational vocabulary provides a
set of terms and captures how they are associated with each other.
Classification by Functionality
Broad, form-based category            | Functionality-based types
1. Flat Controlled Vocabulary         | 1) Authority File; 2) Glossary; 3) Dictionary; 4) Gazetteer; 5) Code List
2. Multi-Level Controlled Vocabulary  | 1) Taxonomy; 2) Subject Heading
3. Relational Controlled Vocabulary   | 1) Thesaurus; 2) Semantic Network; 3) Ontology
Flat Controlled Vocabulary
8. Flat Controlled Vocabulary types:
Authority File: Authority files are lists of terms that are used to control the variant names for an entity or the domain value for a particular field. Examples include names for countries, individuals, and organizations.
Sometimes a catalog contains different names or spellings for one person or subject. This can cause confusion, since researchers may miss some information. Authority control is used by cataloguers to collocate materials that logically belong together but present themselves differently. Authority records are used to establish uniform titles, which collocate all versions of a given work under one unique heading even when those versions are issued under different titles, for example with different spellings or pen names. The unique heading can guide users to all relevant information, including related or collocated subjects. Authority records can be combined into a database called an authority file; maintaining and updating these files, as well as the "logical linkages" to other files within them, is the work of librarians and other information cataloguers.
For example, in Wikipedia, the subject of Princess Diana is described by an article Diana, Princess of Wales as well as numerous other descriptors, but both Princess Diana and Diana, Princess of Wales describe the same person; an authority record would choose one title as the preferred one for consistency. In an online library catalog, various entries might look like the following
Authority File                        | Heading
Wikipedia                             | Diana, Princess of Wales
U.S. Library of Congress              | Diana, Princess of Wales, 1961-1997
Biblioteca Nacional de España         | Windsor, Diana, Princess of Wales
Getty Union List of Artist Names      | Diana, Princess of Wales English noble and patron, 1961-1997
Integrated Authority File (GND)       | GND ID: 118525123
Virtual International Authority File  | VIAF ID: 107032638
Different libraries in different countries generally use different authority file headings and
identifiers, which invites confusion, but there are international efforts to lessen it. One such
effort is the Virtual International Authority File, a collaborative attempt to provide a single
heading for a particular subject.
The Virtual International Authority File is a way to standardize information from different
authority files around the world, such as the Integrated Authority File (GND), maintained and used
cooperatively by many libraries in German-speaking countries, and that of the United States Library
of Congress. The idea is to create a single worldwide virtual authority file. For example, the ID for
Princess Diana in the GND is 118525123 (preferred name: Diana <Wales, Prinzessin>), while the United
States Library of Congress uses the term Diana, Princess of Wales, 1961-1997; other authority files
have other choices. The Virtual International Authority File choice for all of these variations is
VIAF ID: 107032638, that is, a common number representing all of these variations.
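The idea of one common number standing for many local headings can be sketched as a simple cluster record. The headings and the identifier below are the ones quoted in the text. The record layout and function names are assumptions for illustration and do not reflect how the Virtual International Authority File is actually stored or queried.

```python
# One cluster: a shared identifier plus the heading each source file prefers.
CLUSTERS = [
    {
        "viaf_id": "107032638",
        "headings": {
            "Wikipedia": "Diana, Princess of Wales",
            "U.S. Library of Congress": "Diana, Princess of Wales, 1961-1997",
            "Biblioteca Nacional de España": "Windsor, Diana, Princess of Wales",
            "Integrated Authority File (GND)": "Diana <Wales, Prinzessin>",
        },
    },
]


def find_cluster_id(heading, clusters=CLUSTERS):
    """Return the shared identifier for any local heading, or None if unknown."""
    for cluster in clusters:
        if heading in cluster["headings"].values():
            return cluster["viaf_id"]
    return None


def heading_for(viaf_id, source, clusters=CLUSTERS):
    """Return the heading a given source file uses for a shared identifier."""
    for cluster in clusters:
        if cluster["viaf_id"] == viaf_id:
            return cluster["headings"].get(source)
    return None


shared = find_cluster_id("Windsor, Diana, Princess of Wales")
print(shared)
print(heading_for(shared, "U.S. Library of Congress"))
```

With such a mapping, a catalogue can keep its own local heading for display while still linking to records that other libraries describe under a different form of the name.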
Glossaries: A glossary is a list of terms, usually with definitions. The terms may be from a specific
subject field or from a particular work. The terms are defined within a specific environment and
rarely include variant meanings. Examples include the Environmental Protection Agency (EPA)
Terms of the Environment, Glossary of Library and Information Science.
Dictionaries: Dictionaries are alphabetical lists of words and their definitions. Variant senses are
provided where applicable. Dictionaries are more general in scope than are glossaries. They may
also provide information about the origin of a word, variants (by spelling and morphology), and
multiple meanings across disciplines. While a dictionary may also provide synonyms and, through
the definitions, related words, there is no explicit hierarchical structure or attempt to group them by
concept. Examples: Oxford English Dictionary, subject dictionaries.
Gazetteers: A gazetteer is a list of place names. Traditional gazetteers have been published as
books or have appeared as indexes to atlases. Each entry may also be identified by feature type,
such as river, city, or school. An example is the U.S. Code of Geographic Names. Geospatially
referenced gazetteers provide coordinates for locating the place on the earth's surface. The
term gazetteer has several other meanings, including an announcement publication such as a
patent or legal gazetteer. These gazetteers are often organized using classification schemes or
subject categories.
Code List: A code list is a type of flat controlled vocabulary consisting of a set of codes and
meanings used in a specific project.
MULTI-LEVEL CONTROLLED VOCABULARY
9. Multi-Level Controlled Vocabulary Types:
Subject Headings: This scheme type provides a set of controlled terms to represent the subjects of
items in a collection. For example, the prime tasks of a library in managing documents systematically
are classification and cataloguing. The main objective of subject cataloguing is to fulfil the
subject-related needs of readers. A subject heading scheme helps the cataloguer or indexer summarize
the thought content of a document in a number of accepted terms.
The subject heading schemes are:
Medical Subject Headings (MeSH)
Library of Congress Subject Headings (LCSH)
Sears List of Subject Headings
MEDICAL SUBJECT HEADING
Multi Level Controlled Vocabulary Types
What is MeSH?
The Medical Subject Headings (MeSH) are a standardized vocabulary of approximately 20,000 terms that describe the biomedical concepts covered in the Medline database.
MeSH consists of a set of terms or subject headings that are arranged in both an alphabetic and a hierarchical structure.
The MeSH thesaurus is produced by the National Library of Medicine (NLM).
When each article is indexed, an indexer at NLM assigns from 5 to 20 headings describing the concepts covered in the article.
MeSH headings are powerful searching tools. They locate documents by assigned controlled vocabulary, not free text words, and are independent of the occurrence of specific words in any other field.
MeSH headings allow you to retrieve all references to a particular topic, even if different terminology was used in the records.
MeSH includes several special features. Four of the most important are:
1) Subheadings (qualifiers):
These are used to qualify MeSH subject headings to pinpoint the specific aspect of the concept
represented by the subject heading. For example, the heading Liver may be qualified with the subheading
drug-effects ("Liver-drug-effects") to indicate that the article is not about the liver in general, but about
the effect of drugs on the liver.
2) Check Tags:
These are special-use descriptors that do not represent subject matter per se but reflect parameters
or aspects of subject content. Special effort in indexing ensures that these are included, or
"checked", each time they appear as aspects of an item being indexed. The following descriptors
must be entered by the indexer for every journal article citation to which they apply:
ANIMAL
HUMANS
MALE
FEMALE
3) Publication Types: These provide an additional means for
classifying the material indexed. Rather than representing the subject content of an article, they characterize the nature of the information or the manner in which it is conveyed, e.g., letter, historical article, retracted publication, clinical conference, etc.
Examples:
BIOGRAPHY
CASE REPORTS
CLINICAL CONFERENCE
CLINICAL TRIAL
CLINICAL TRIAL, PHASE I
CLINICAL TRIAL, PHASE II
CLINICAL TRIAL, PHASE III
CLINICAL TRIAL, PHASE IV
COMPARATIVE STUDY
CONGRESSES
CONSENSUS DEVELOPMENT CONFERENCE
CONSENSUS DEVELOPMENT CONFERENCE, NIH
CONTROLLED CLINICAL TRIAL
EDITORIAL
ENGLISH ABSTRACT
EVALUATION STUDIES
GUIDELINE
HISTORICAL ARTICLE
IN VITRO
JOURNAL ARTICLE
LETTER
META-ANALYSIS
MULTICENTER STUDY
NEWS
PRACTICE GUIDELINE
RANDOMIZED CONTROLLED TRIAL
RESEARCH SUPPORT, N.I.H., EXTRAMURAL
RESEARCH SUPPORT, N.I.H., INTRAMURAL
RESEARCH SUPPORT, NON-U.S. GOV'T
RESEARCH SUPPORT, U.S. GOV'T, NON-P.H.S.
RESEARCH SUPPORT, U.S. GOV'T, P.H.S.
RETRACTED PUBLICATION
REVIEW
TWIN STUDY
4) Age Group Headings:
There is a collection of age group headings in MeSH, which are assigned whenever someone
in that age group is noted in the paper. All age groups listed in the paper are indexed. Thus a
clinical trial involving 50 patients, the youngest of whom was 18 years old, will be
assigned the age heading “Adolescent” as well as all the applicable adult headings. Age
groups are very rarely assigned to the major MeSH field.
Example
Age Groups
Adolescent (age 13-18)
Adult (age 19-44)
Aged (age 65-79)
Aged, 80 and over
Frail Elderly
Middle Aged (age 45-64)
Child (age 6-12)
Child, Preschool (age 2-5)
Infant (age 1-23 months)
Infant, Newborn (birth to 1 month)
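The rule that every age group represented in a paper gets its own heading can be sketched as a range lookup. The year ranges below are copied from the list above. The two infant headings (measured in months) and Frail Elderly (not defined by an age range here) are left out to keep the example simple, and the function name is hypothetical.

```python
# (heading, minimum age in years, maximum age in years or None for open-ended),
# following the ranges given in the list above.
AGE_GROUPS = [
    ("Child, Preschool", 2, 5),
    ("Child", 6, 12),
    ("Adolescent", 13, 18),
    ("Adult", 19, 44),
    ("Middle Aged", 45, 64),
    ("Aged", 65, 79),
    ("Aged, 80 and over", 80, None),
]


def age_headings(ages_in_years):
    """Return every age heading that covers at least one participant's age."""
    headings = []
    for heading, low, high in AGE_GROUPS:
        if any(age >= low and (high is None or age <= high) for age in ages_in_years):
            headings.append(heading)
    return headings


# A trial whose youngest participant is 18 also receives the "Adolescent" heading.
print(age_headings([18, 30, 50]))
```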
Special Features of MeSH
Here is part of a Medline Record from the EBSCOhost system:
----------------------------------------------------------------------------------------------------------------------------- ------------
Title: Diagnosis of ventilator-associated pneumonia.
Author(s): Kollef MH
Source: The New England Journal Of Medicine [N Engl J Med] 2006 Dec 21; Vol. 355 (25), pp. 2691-3.
Publication Type: Comment; Editorial; Research Support, Non-U.S. Gov't
Language: English
MeSH Terms:
Bronchoalveolar Lavage*
Anti-Bacterial Agents/*therapeutic use
Pneumonia, Ventilator-Associated/*diagnosis
Bronchoalveolar Lavage Fluid/microbiology; Drug Resistance, Bacterial; Humans;
Pneumonia, Ventilator-Associated/drug therapy;
Trachea/microbiology
Note that there are two groups of MeSH terms: (a) In the upper group, the MeSH headings are flagged with an
asterisk, *, which indicates that the term represents a major or central focus of the paper. (b) The headings in the
lower group, which are not flagged, represent secondary or minor aspects of the paper. These are subjects or concepts
that are worth noting but are not the paper’s primary focus.
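The notation in the record above can be read mechanically. The sketch below assumes the display format shown in this record: a heading, optionally followed by a slash and a subheading, with an asterisk marking a major topic. Other systems may print MeSH terms differently, and the function name is hypothetical.

```python
def parse_mesh_term(term):
    """Split a displayed MeSH term into heading, subheading and major-topic flag."""
    heading, _, subheading = term.strip().partition("/")
    # The asterisk may follow the heading or precede the subheading.
    major = heading.endswith("*") or subheading.startswith("*")
    return {
        "heading": heading.rstrip("*").strip(),
        "subheading": subheading.lstrip("*").strip() or None,
        "major": major,
    }


terms = [
    "Bronchoalveolar Lavage*",
    "Anti-Bacterial Agents/*therapeutic use",
    "Pneumonia, Ventilator-Associated/*diagnosis",
    "Humans",
    "Trachea/microbiology",
]
for term in terms:
    print(parse_mesh_term(term))
```

Separating the major flag from the heading lets a search be restricted to records where the topic is a central focus of the paper.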
Appendix 1: MeSH Subheadings:
LIBRARY OF CONGRESS SUBJECT HEADING
Multi-Level Controlled Vocabulary Types
What is LCSH?
Library of Congress Subject Headings (LCSH) came into existence in 1898 at the Library of
Congress, USA, which also maintains it.
The LCSH system was originally designed as a controlled vocabulary for representing the
subject and form of the books and serials in the Library of Congress collection.
It is now widely accepted by libraries and information centres around the world.
LCSH is also known as the “Big Red Books”.
It consists of five volumes and is published annually.
Subject authority records are available online.
The current edition at the time of writing is the 30th (2007).
Principles of Heading Construction
1) Fundamental principles of LCSH
2) Structure of subject headings
3) Generation of headings and cross references
4) Main headings
5) Subdivisions
6) Pre-coordination and synthesis
7) Term relationships
8) Scope Notes
9) Class numbers
Fundamental Principle of LCSH
1) The fundamental principles guiding the development of the Library of Congress Subject
Headings system are:
a) Effective response to user needs
b) Uniform heading (one heading per subject - control of synonyms)
c) Unique heading (one subject per heading - control of homographs)
d) Specific and direct entry
e) Stability
f) Consistency
2) Structure of Subject headings
A. Single-concept headings
Automobiles
Botany
Budget deficits
Electric interference
Boards of trade
Clerks of court
B. Pre-coordinated multiple-concept headings
Budget in business
Church and industry
Earth-Rotation
Biology-Scholarships, fellowships,
3) Main headings
The main heading is that part of the subject heading string which represents the main concept
without subdivision. Main headings may be categorized according to their functions: topical
headings, form headings, and different kinds of proper name headings.
Types and functions of main headings
a) Topical Headings
A topical heading represents a concept or object treated in a bibliographic item. It reflects what
the item is about. Examples:
Economy
German language
Soldiers as artists
b) Form Headings
A form heading reflects the form of the material. There are various forms of reading material in
the library, e.g. (a) bibliographic form and (b) artistic and literary form.
Syntax
Topical and form headings:
All main headings consist of single nouns or noun equivalents. Noun equivalents may be in
the form of adjectives or gerunds, or in the form of adjectival phrases, conjunctive phrases, or
prepositional phrases. Qualifiers are added to headings when necessary.
A. Single Noun Headings
B. Phrase Headings
A. Single Noun Headings
(a) Many topical and form headings consist of a single noun or a noun equivalent in the form
of a single adjective or gerund.
(b) Nouns representing concrete objects are normally in the plural form, and nouns
representing abstract concepts appear in the singular.
Examples:
Enzymes
Philosophies
Deaf
Running
Art
Agriculture
Education
Religion
B. Phrase Headings:
Some concepts that involve two areas of knowledge can be expressed only by more or less
complex phrases. Example
Bible as literature
Freedom of information
There are various types of Phrase Headings which are as follows:
1) Adjectival headings
2) Conjunctive phrase headings
3) Prepositional phrase headings
4) Inverted phrase headings
1) Adjectival Headings Examples
Computer architecture
Social classes
2) Conjunctive phrase Headings Examples
Children and politics
Boats and boating
3) Prepositional phrase headings Examples
Directors of Corporations
Doctor of philosophy degree
Proposal writing in educational research
4) Inverted phrase Headings Examples
Children’s literature, Canadian
Education, Higher
Taxation, Exemption from
4) Subdivisions
Subdivisions are extensions of the main heading. They normally represent aspects of the main heading.
1) Topical subdivisions
2) Form subdivisions
3) Geographic subdivisions
4) Chronological subdivisions
1. Topical Subdivisions
Intellectual life
Marketing
Religious aspects
2. Form Subdivisions
Bibliography
Periodicals
Poetry
Tables
3. Geographic Subdivisions
Geographic subdivisions are further divided into two types:
1) Direct Subdivision
Example
Music-Japan
Music-California
2) Indirect Subdivision
Example
Music-France-Paris
Music-Ontario-Toronto
4. Chronological subdivision
Art, Modern-20th century
India-History-1800-1899 (19th century)
France-History-Revolution, 1789-1799
Lebanon-History-1982-1984
Poland-Economic conditions-1945-
United States-History-1945-1953
Bermuda Islands-Description and travel-1979-
5) Pre-coordination and synthesis
A heading may contain a single concept or a
combination of multiple concepts. The
combination may be formed when the
heading is being established or when it is
assigned to a particular bibliographic item.
A. Multiple-concept main headings
B. Headings with subdivisions
A. Multiple-concept main headings
Children and politics
Electricity in art
Religious education of teenage boys
B. Headings with Subdivisions
Birth control-Moral and ethical aspects
Cinematography-Electronic equipment
Philosophy, Ancient-Oriental
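Synthesis, in the sense used above, amounts to joining an established main heading with one or more subdivisions at the time of assignment. A minimal sketch follows. The function name is hypothetical, and the single hyphen is used as the separator only because that is how the examples in this text are printed, so it is left as a parameter.

```python
def synthesize_heading(main_heading, *subdivisions, separator="-"):
    """Join a main heading and its subdivisions into one subject heading string."""
    parts = [main_heading, *subdivisions]
    if any(not part.strip() for part in parts):
        raise ValueError("main heading and subdivisions must be non-empty")
    return separator.join(part.strip() for part in parts)


# Examples taken from the headings shown in this section.
print(synthesize_heading("Birth control", "Moral and ethical aspects"))
print(synthesize_heading("Music", "France", "Paris"))
print(synthesize_heading("Children and politics"))
```

A multiple-concept main heading such as "Children and politics" is already pre-coordinated when it is established, so it passes through unchanged, while headings with subdivisions are assembled at assignment time.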
6) Term Relationship
There are four types of term relationships in LCSH
1) Equivalence relationships
2) Hierarchical relationships
3) Associative relationships
4) General and Specific references
1) Equivalence relationships: USE references are made from unauthorized or non-preferred terms to authorized (valid) headings. Reciprocal UF (Used for) references are made under the valid headings.
Example
Business intelligence
UF Business espionage
UF Corporate intelligence
UF Espionage, Business
UF Espionage, Industrial
UF Industrial espionage
52
Multi-Level Controlled Vocabulary Types
2) Hierarchical relationships Example:
----------------------------------------------------------
Broader Terms (BT)
Apes
BT Primates
Ethnology
BT Anthropology
Novell
BT Fiction
Hydrogen as fuel
BT Fuel
2) Hierarchical relationships Example:
----------------------------------------------------------
Narrower terms (NT)
Fuel
NT Hydrogen as fuel
Literature
NT Fiction
Fiction
NT Novell
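The BT and NT references above are reciprocals, so one set can be derived from the other. A minimal sketch, assuming the BT links are held in a plain dictionary (the names `BROADER` and `narrower_terms` are illustrative and not part of LCSH):

```python
from collections import defaultdict

# Broader-term (BT) links copied from the examples above: narrower -> broader.
BROADER = {
    "Apes": ["Primates"],
    "Ethnology": ["Anthropology"],
    "Hydrogen as fuel": ["Fuel"],
    "Fiction": ["Literature"],
}


def narrower_terms(broader):
    """Invert BT links so that every broader heading lists its NT headings."""
    nt = defaultdict(list)
    for narrow, broads in broader.items():
        for broad in broads:
            nt[broad].append(narrow)
    return {heading: sorted(terms) for heading, terms in nt.items()}


NARROWER = narrower_terms(BROADER)
for heading, terms in sorted(NARROWER.items()):
    for term in terms:
        print(f"{heading}\n  NT {term}")
```

Keeping only one direction in the data and deriving the other is one way to stop the BT and NT references from drifting apart.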
53
Multi-Level Controlled Vocabulary Types
3) Associative relationships Example
Ships
RT Boats and boating
Birds
RT Ornithology
Medicine
RT Physicians
General and specific references Example
General USE references
1) Cards, Playing
USE Card games
2) Playing cards
USE Card games
General SA (See also) references
1) Flowers
SA names of flowers, e.g. Roses; to be added as needed
2) Card games
See also Card tricks; Gambling; Tarot
54
Multi-Level Controlled Vocabulary Types
7) Subject Heading for Special Materials in LCSH.
The assignment of subject heading for
audiovisual and special instructional materials
should follow the same principles that are
applied to books. The heading most
specifically describing the contents of the
material should be used. And the same
headings should be applied to book and non
book material alike.
Example
American poetry-Periodicals
Tuberculosis-Statistics-Periodicals
Jesus Christ-Travel-Palestine-Maps-To 1800
Teleki, Samuel, grof, 1845-1916- Journeys-Maps
Accounting-Periodicals
55
Multi-Level Controlled Vocabulary Types
8) Scope Notes
Notes are provided under some headings in
order to define the scope, to explain the
relationships among headings, and to assist in
the proper application of the headings so that
consistency in assigning headings to documents
on like subjects may be achieved.
A) Definitions
B) Relations to other headings
C) Instructions, explanations
9) Class Numbers
A Library of Congress Classification number is
added to a heading if the caption for the number
is identical or nearly identical in scope,
meaning, and language to the subject heading,
or if the topic is explicitly mentioned in an
"Including" note under the caption for the
number. Multiple class numbers may be added
to a heading when the subject is treated from
more than one perspective. For the heading of a
subject covered by a span of class numbers, the
full span of pertinent class numbers is included.
56
Multi-Level Controlled Vocabulary Types
Criticism
A disadvantage of LCSH is its American bias.
Its terms follow American usage, which is often unfamiliar in the Indian context.
Discrepancies in spelling and terminology can also be seen, such as:
Labor-Labour
Color-Colour
Elevators-Lifts
57
MULTI- LEVEL CONTROLLED VOCABULARY
SEARS LIST OF SUBJECT HEADING
Multi-Level Controlled Vocabulary Types
Revolution in Sears since its inception
The Sears List of Subject Headings (popularly called the Sears List) is a well-known tool for assigning standardized subject headings to all types of documents in small general libraries holding up to 20,000 titles across all subjects.
The Sears List of Subject Headings was first designed in 1923 by Minnie Earl Sears (1873-1933) and still bears her name.
It was designed to give small libraries simpler and broader subject headings.
The first edition contained only 3200 preferred headings.
The 2nd (1926) and 3rd (1933) editions were again edited by her.
59
Multi-Level Controlled Vocabulary Types
The fourth to fourteenth editions appeared between 1939 and 1991, adding new terms, modernizing the terminology of older ones, and so on. The format stayed the same, with some new features such as the addition of abridged DDC numbers.
The orientation to the online environment started with the 13th edition (1986).
The 15th edition is considered the start of an innovative era for Sears. The latest knowledge from information science and information-seeking behavior was deployed to modernize the internal structure and grammar of the Sears List.
Later editions: 16th (1997), 17th (2000), 18th (2004), 19th (2007), 20th (2010); the current edition is the 21st (2014).
Changes in the 15th edition include the adoption of the thesaurus format, using the abbreviations NT, BT, RT, USE and SA instead of X, XX, etc. This gave every page a new look.
Joseph Miller edited the list up to the 19th edition; later editions were prepared with the assistance of associate editors.
Published by H.W. Wilson.
60
Multi-Level Controlled Vocabulary Types
Changes in and a Brief Review of the 19th, 20th and 21st Editions
Changes in the 19th edition
1. About 440 new subject headings were added in the areas of computers, IT, politics, popular culture and psychology.
2. The total number of preferred headings is about 8,000.
3. Interest in Islamic religion and culture grew after the 9/11 attacks, and US schools introduced curricula on these topics. Headings in this area were therefore added to the Sears List.
Changes in the 20th edition
1. The major feature of the 20th edition is the inclusion of more than 300 new subject headings.
2. New headings were added in the area of ecology and environment, such as Rain forest ecology, Grassland ecology, Climate change and Sustainable agriculture.
3. New trends in social networking are represented by new headings such as Twitter (Web site) and Facebook (Web site).
Changes in the 21st edition
1. There is a four-year gap between the 20th and 21st editions. In this period more than 250 subject headings were added, for example Cloud computing, Massive open online courses and Paralympic games.
2. Some headings were changed, for example “Internet forums” rather than “Computer bulletin boards”.
3. A list of 34 cancelled and replaced headings can be found on page xlix.
61
Multi-Level Controlled Vocabulary Types
Changes in the 19th edition (continued)
4. Other new headings were also added, such as Reality shows, Suicide bombers and Stem cell research.
5. Some headings were changed, e.g. Biological diversity became Biodiversity, and Native peoples was replaced by Indigenous peoples.
6. Fictitious characters became Fictional characters.
7. The Principles section was expanded a little to cover the formulation of headings in some areas, namely Native Americans, Government policy, and Mythology and folklore.
Changes in the 20th edition (continued)
4. A number of new headings for arts and crafts were established, such as Acrylic painting and Wire craft.
5. The most significant revision in this edition deals with subject headings relating to Russia and India. Material on Russia was formerly spread among three headings: Russia, Soviet Union and Russia (Federation). There is now the single heading Russia.
6. The heading Indians was re-established to denote the people of India, replacing East Indians. A number of headings relating to the literature and culture of India were similarly established, such as Indian literature and Indian music, replacing forms such as Indic literature.
Changes in the 21st edition (continued)
4. Some headings were changed to conform to RDA terminology, for example “Bible. New Testament” in place of “Bible. N.T.”
5. The rules for filing entries changed in the 21st edition. Main headings are filed alphabetically.
62
Multi-Level Controlled Vocabulary Types
Principles of the SEARS List
The principles of the Sears List of Subject Headings have always been based on those of the Library of Congress Subject Headings. The principles are:
1) Direct and Specific entry
2) Common usage
3) Uniformity and consistency
Let's explain them one by one.
1) Specific entry: A subject should be entered under its most specific heading, not under the class to which it belongs. For example,
a work on roses is entered under “Roses” (and one on the lotus under “Lotus”), not under “Flowers”.
A work on penguins is entered under “Penguins”, not under “Birds” or even “Water birds”.
63
Multi-Level Controlled Vocabulary Types
2) Direct entry: A specific heading should be entered directly as the lead term, not as a subdivision of a broader heading.
For example, “Penguins” instead of “Water birds-Penguins”,
“Barbie dolls” instead of “Dolls-Barbie dolls”,
“Roses” instead of “Flowers-Roses”.
3) Common usage: If a concept has more than one name or spelling, the most popular one is chosen. For example, some subjects have a scientific name, such as “Ornithology” for birds, alongside a popular one.
Instead of “Ornithology” use “Birds”.
Instead of “Banquets” use “Dinner”.
4) Uniformity: Uniformity and consistency are essential for maintaining a standard.
64
Multi-Level Controlled Vocabulary Types
Structure of the Sears List
The Sears List is alphabetical, arranged word by word according to the ALA Filing Rules (1980). It has two parts:
An introductory part, which describes the brief history and principles of the Sears List and includes the list of about 500 (common) subdivisions.
The list of subject headings in alphabetical order, given in two columns on every page.
List of SHs
The core of the system is the word-by-word alphabetical list of SHs. The headings are of two types:
1) Non-preferred headings
2) Preferred headings
In addition, subdivisions are used to subdivide a preferred heading.
65
Multi-Level Controlled Vocabulary Types
1) Non-preferred headings
These are headings that are not to be used. They are printed in light typeface. Each non-preferred heading is invariably followed by the lead “USE”, which directs us to the preferred heading, e.g.
Cyclopedias
Use Encyclopedias and Dictionaries
Cyclotron
Use Cyclotrons
Cytology
Use Cells
Dairy farming
Use Dairying
2) Preferred headings
Preferred headings are the authorized terms, printed in boldface, for use in describing document content. These headings broadly cover ideas, objects, places, processes and relationships, and an entry may include a DDC class number, scope notes and instructions for further subdivision.
The relationships between terms are indicated by NT, BT and RT. Cont…
66
Multi-Level Controlled Vocabulary Types
For Example
Dairying
UF Dairies
Dairy farming
Dairy industry
BT Agriculture
Livestock industry
NT Dairy cattle
Milk
Explanation
The three terms namely Dairies, Dairy farming and
Dairy industry given against the abbreviation UF
(Used for) are non preferred (synonyms) of the
heading used. These are equivalent to Dairying in
meaning. For this, the cataloguers will have to make
see references from these terms to the entry terms,
that is, from the terms not-used to the term used e.g. :
Dairies see Dairying
Dairy farming see Dairying
Dairy industry see Dairying Cont…
67
Multi-Level Controlled Vocabulary Types
BT means (hierarchically) Broader Term. Its
practical implication is to prepare “see also” entry
from broader to narrower term above :
Agriculture
see also Dairying
Livestock industry
see also Dairying
NT means Narrower Term. For this we have to
make see also references from broader to narrower
terms :
Dairying
see also Dairy cattle; Milk
RT means related terms. These are the terms at equal
level of hierarchy but are related with the entry in some
way. Its practical implication is to prepare see also
entries on reciprocal basis. For example in the entry
Diagnosis 616.07
- - - - - - - - - - - -
- - - - - - - - - - -
- - - - - - - - - - - -
RT Pathology
So we will prepare the following two entries for the RT:
Diagnosis
see also Pathology
Pathology
see also Diagnosis
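The rules above for turning UF, BT, NT and RT into catalogue references can be sketched in code. This is only an illustration under stated assumptions: the dictionary layout and the function name `references` are hypothetical, and the terms come from the Dairying and Diagnosis examples.

```python
# One Sears-style entry, transcribed from the Dairying example above.
ENTRY = {
    "heading": "Dairying",
    "UF": ["Dairies", "Dairy farming", "Dairy industry"],
    "BT": ["Agriculture", "Livestock industry"],
    "NT": ["Dairy cattle", "Milk"],
    "RT": [],
}


def references(entry):
    """Return (from_term, kind, to_term) references implied by one entry."""
    head = entry["heading"]
    refs = []
    for term in entry.get("UF", []):   # non-preferred term -> preferred heading
        refs.append((term, "see", head))
    for term in entry.get("BT", []):   # broader term -> this heading
        refs.append((term, "see also", head))
    for term in entry.get("NT", []):   # this heading -> narrower term
        refs.append((head, "see also", term))
    for term in entry.get("RT", []):   # related terms: both directions
        refs.append((head, "see also", term))
        refs.append((term, "see also", head))
    return refs


for source, kind, target in references(ENTRY):
    print(f"{source} {kind} {target}")
```

Calling `references({"heading": "Diagnosis", "RT": ["Pathology"]})` yields the two reciprocal see also entries shown for Diagnosis and Pathology.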
68
Multi-Level Controlled Vocabulary Types
Subdivisions
Some preferred SHs are used both as direct headings and as subdivisions of other headings. For example, Directories is a heading and is also used as a subdivision, e.g.
Directories - History
Mumbai – Directories
Colleges and universities – Directories
Physicians – Directories
Key Headings
In Sears some model (key) headings are given as patterns for coining subject headings. Analogous headings can be coined on the basis of these models. The models are:
Category : Model heading
Author : Shakespeare, William, 1564-1616
Country : United States
State : Ohio
City : Chicago (Ill.)
Language : English language
Literature : English literature
Ethnic group : Native Americans
Public figures : Presidents – United States
Wars : World War, 1939-1945 Cont…
69
Multi-Level Controlled Vocabulary Types
It means, if we have a subject pertaining to any
country we will look under the United States for a
similar SH for that country, and then adapt the
heading accordingly. If our subject is
Geography of India
We will look under the United States, where an
analogous heading is
United States- Geography
So SH will be
India - Geography
Similarly, for Gazetteer of Haryana
In this case we will look under Ohio, and adapt the
heading:
Haryana – Gazetteer
For historical buildings of Delhi, we will look
under Chicago, and form the following heading:
Historic buildings – Delhi
70
Multi-Level Controlled Vocabulary Types
For a book on “Style of Shri Prem Chand” we will look under
Shakespeare, William to get the following SH:
Prem Chand, 1880-1940 – Technique
Similarly Hindi Grammar will get the heading:
Hindi language – Grammar
Sanskrit Ucharan will get the SH
Sanskrit language – Pronunciation
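The key-heading procedure above amounts to finding the model heading and substituting the new name. A rough sketch, assuming headings are plain strings (the category labels, `KEY_HEADINGS` and `adapt` are hypothetical names used only for this illustration):

```python
# Key (model) headings taken from the table above; the category names are
# labels chosen for this sketch.
KEY_HEADINGS = {
    "country": "United States",
    "state": "Ohio",
    "city": "Chicago (Ill.)",
    "author": "Shakespeare, William, 1564-1616",
    "language": "English language",
}


def adapt(category, model_heading, new_name):
    """Swap the key heading inside model_heading for new_name."""
    key = KEY_HEADINGS[category]
    if key not in model_heading:
        raise ValueError(f"{model_heading!r} is not modelled on {key!r}")
    return model_heading.replace(key, new_name, 1)


print(adapt("country", "United States - Geography", "India"))
print(adapt("language", "English language - Grammar", "Hindi language"))
```

The check for the key heading guards against adapting a heading from the wrong model, for example looking under Ohio for a country.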
Subdivisions
Subdivisions are a means of making a heading more specific and of making classes of headings smaller. As said earlier, some terms serve both as a SH and as a subdivision. There are four types of subdivisions:
1) Topical: Birds – Eggs
2) Bibliographical: Sindhi language – Dictionaries
3) Geographical: Trees – India
4) Chronological: India – History – 1857-1947; English literature – 21st century
71
Multi-Level Controlled Vocabulary Types
Criticism
It may be noted that the Sears List is designed for American, Christian and Western culture.
It also uses American spellings, for example, “catalog” instead of “catalogue”.
It has headings which have no relevance in
India at the moment, e.g., Only child
[families], Unmarried fathers, Teenage
fathers.
Sometimes it looks too specialized for a small
library, e.g., Napkin folding. On the other
hand it does not have headings for Asian
subjects and concepts.
There is no heading specific to Caste systems,
Gherao, Mosques, Honour killings, Mehr,
Jihad, Child marriage, Dowry deaths, etc.
72
MULTI-LEVEL CONTROLLED VOCABULARY
Taxonomy
73
Multi-Level Controlled Vocabulary Types
What is Taxonomy?
A taxonomy is a classification system. Normally, the aim of a taxonomy is to group things
according to similarities in some respect such as similarities in structure, role, behavior, etc
The word taxonomy comes from the Greek taxis, meaning arrangement or order, and nomos, meaning law or
science. In the broader sense, a taxonomy may also be referred to as a knowledge organization system or
knowledge organization structure.
In other words, a taxonomy is an orderly classification for a defined domain. It is an organizational structure in which metadata values are grouped according to a subject-specific description, that is, a set of characteristics that each member of a class exhibits.
Taxonomies begin with the broadest classes and continue to narrow until the final class is reached.
Cont…
74
Multi-Level Controlled Vocabulary Types
For example, if we take an animal controlled vocabulary and state that Cat is a broader term for Manx, that Dog is a broader term for Collie and Bulldog, and that Mammal is a broader term for Dog and Cat, we have a simple taxonomy. The "broader" relationships of a taxonomy are often represented visually as a tree:
75
Multi-Level Controlled Vocabulary Types
While this is certainly not enough information for a computer to understand what a collie is, a system can use this little bit of semantics about collies to add value to a data collection so that you can get more out of it. For example, let's say Essess Publications employee Mr. Ansari stores a picture of Lassie in the Essess Digital Asset Management (DAM) system and tags it as "Collie." Several months later his co-worker Arun needs a picture of a dog for an article about bringing pets to hotels.
Lassie
76
Multi-Level Controlled Vocabulary Types
He searches the DAM for "Dog," and although the picture of Lassie is not tagged with this
term, a search engine that's aware of the taxonomy metadata knows that, as a collie, the picture
of Lassie is also a picture of a dog, and returns that picture to Arun. The metadata helped Arun
to more quickly get value out of one of their information assets.
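The Lassie search can be sketched as follows. This is a minimal illustration, not how any particular DAM product works: the file names and the `ASSETS`, `ancestors` and `search` names are assumptions, while the broader-term links come from the animal example.

```python
# Narrower -> broader links from the animal example above.
BROADER = {
    "Manx": "Cat",
    "Collie": "Dog",
    "Bulldog": "Dog",
    "Cat": "Mammal",
    "Dog": "Mammal",
}

# Hypothetical asset store: file name -> tag assigned by the indexer.
ASSETS = {"lassie.jpg": "Collie", "office_cat.jpg": "Manx"}


def ancestors(term):
    """Yield the term itself and every broader term above it."""
    while term is not None:
        yield term
        term = BROADER.get(term)


def search(query):
    """Return assets whose tag is the query term or narrower than it."""
    return sorted(name for name, tag in ASSETS.items() if query in ancestors(tag))


print(search("Dog"))     # ['lassie.jpg']
print(search("Mammal"))  # ['lassie.jpg', 'office_cat.jpg']
```

The picture is tagged only once, with its most specific term; the taxonomy supplies the broader matches at search time.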
A large e-commerce website's menu system is often a taxonomy of their products. If this
taxonomy is designed well, customers can find what they need easily; if not, a customer may
give up and go to a competitor's website, so a well-designed taxonomy can have a direct effect
on a company's revenue. Cont…
77
Multi-Level Controlled Vocabulary Types
classification Subdivisions For any category, each
subcategory is a taxonomy
78
RELATIONAL CONTROLLED VOCABULARY
Thesaurus
79
Relational Controlled Vocabulary Types
What is Thesaurus?
The word thesaurus is derived from the Greek word thesauros, meaning ‘treasury’, that is, a storehouse of knowledge. A thesaurus is a work that contains synonyms and sometimes antonyms, in contrast to a dictionary, which contains definitions and pronunciations. It is a list of terms arranged according to the relationships between their ideas.
80
Relational Controlled Vocabulary Types
Purpose/Functions of Thesaurus
To provide a map of a given field of knowledge, indicating how concepts or ideas are related to each other, which helps the indexer and the searcher to understand the structure of the field.
To provide a standard vocabulary for a given subject.
To provide consistent representation of subject matter, avoiding subject dispersion at input and output by controlling synonyms and quasi-synonyms and by differentiating homographs.
To bring together terms which are semantically related.
To limit the number of terms that are assigned to a document.
To serve as a search aid in retrieval.
81
Relational Controlled Vocabulary Types
Relationship among the terms
The Equivalence Relationship
The hierarchical (or whole-part) relationship and
The associative relationship.
Lets Explain with example:
The equivalence relationship is further divided into three kinds, which are as follows:
Equivalence Relationship
Synonymous
Lexical variant
Quasi Synonymous
82
Relational Controlled Vocabulary Types
Equivalence Relationship: It denotes the relationship between a preferred term and non-preferred terms, where two or more terms are regarded, for indexing purposes, as referring to the same concept. It is denoted by USE and UF. This general relationship covers three kinds of terms.
Synonyms: Terms whose meanings can be regarded as the same in a wide range of contexts, so that they are virtually interchangeable. For example:
Popular name : Scientific name
Spider : Arachnid
Standard name : Slang
High fidelity equipment : Hi-fi equipment
83
Relational Controlled Vocabulary Types
Lexical variants: Different word forms for the same expression, such as spelling variants, grammatical variants and abbreviated forms. For example:
Spelling: Color (Colour), Catalog (Catalogue)
Abbreviation: TQM (Total Quality Management)
Quasi-synonyms: Terms which are not synonyms but are close in meaning and are treated as equivalent for indexing.
For example: Urban areas (Cities)
Hierarchical Relationship: It expresses levels of superordination and subordination among terms. BS 5723 identifies three relational situations representing the hierarchical relationship.
Hierarchical Relationship
Generic Relationship
The Hierarchical whole part
Relationship
Instance Relationship
84
Relational Controlled Vocabulary Types
Generic Relationship: It identifies the link between a class or category and its members. For example:
Teachers (BT)
Adult teachers (NT)
School teachers (NT)
Hierarchical whole-part Relationship: The narrower term names a part of the whole named by the broader term, for example systems and organs of the body.
Ear (BT)
External ear (NT)
Instance Relationship: It links a general category, expressed by a common noun, to an individual instance, expressed by a proper noun. For example:
Sea (BT)
Arabian Sea (NT)
Non-Hierarchical Relationship: The terms are clearly related to each other conceptually but not hierarchically. For example: Library
Librarians, Users, Documents (RT)
85
RELATIONAL CONTROLLED VOCABULARY
Ontology
86
Relational Controlled Vocabulary Types
Semantic Web?
A set of standards and best practices for sharing data, and the semantics of that data, over the web for use by applications.
[A set of standards]
the RDF data model
the RDF Schema and OWL standards for storing vocabularies and ontologies
[best practices for sharing data over the web... for use by applications]
These best practices recommend:
the use of URIs to name things
the use of standards such as RDF
They provide guidelines for creating an infrastructure for the semantic web and for the data shared on it.
[sh98003588#concept]
http://id.loc.gov//authorities/sh98003588#concepts
87
Relational Controlled Vocabulary Types
Resource Description Framework (How it works)
Manir Sells Books
Resource: Anything that has identity, e.g. Manir, Book.
How does something get identity? It is identified by a URI (Uniform Resource Identifier).
A resource can be a government agency, a human or an abstract concept.
Description: A container holding several statements describing the resource.
Ask a friend (or a computer) to describe Manir.
One statement might be: Manir sells books.
Framework: It is needed to enable humans and machines to make and understand statements.
RDF Triplet
Here is RDF in pictorial form.
A statement is built from a triplet: Subject, Predicate, Object. All three are resources identified by unique URIs. Cont…
Subject : Predicate : Object
Manir : Sells : Books
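The statement "Manir sells books" can be written down as a subject, predicate, object triple in a few lines. This is a sketch only: the `http://example.org/` namespace and the `Triple` and `to_ntriples` names are assumptions made for illustration, and the output uses the N-Triples line format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; example.org is reserved for documentation use.
EX = "http://example.org/"

statement = Triple(EX + "Manir", EX + "sells", EX + "Books")


def to_ntriples(t):
    """Serialize one triple of URIs as an N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(to_ntriples(statement))
```

Each of the three parts is a URI, which is what gives every resource in the statement its identity.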
88
Relational Controlled Vocabulary Types
RDF uses XML to encode the data in a machine-readable format:
<rdf:Description rdf:about="[Manir]"
    xmlns:sells="[NS]">
  <sells:myPredicate rdf:resource="[book]"/>
</rdf:Description>
The human brain uses logic:
Mary is a mother
A mother is a parent
Therefore Mary must be a parent
Now the question is how a computer can understand this.
The combination of an RDF model and the associated XML gives the computer enough information to discover the meaning of data. Data about other data is often called metadata. XML and RDF deal with metadata, that is, they deal with the description of the information available on the web. But if machines are expected to interact with each other or share data in the true sense of the word, then semantic interoperability is essential. For this, a formal specification is required to explicitly define the various terms and their relationships. Ontology was thus developed in AI to facilitate knowledge sharing and reuse, and ontologies can be built using XML and RDF.
89
Relational Controlled Vocabulary Types
Emergence of Ontology
Information and communication technology tools such as the internet and the WWW have changed the information scenario and present information in many forms, such as e-books, e-journals and digital objects. These range from a single web page to a pixel-based photograph to a digital piece of music.
It has been observed that the tools for bibliographic control from the print era are not adequate to handle digital materials. For instance, although the MARC21 bibliographic description format exists, it has certain limitations. It cannot handle many aspects of digital resources, such as multiple dates of creation and revision, credit assignment, etc.
New tools have therefore been developed for this purpose, e.g. metadata, ontology and taxonomy.
90
Relational Controlled Vocabulary Types
Ontology as Central Concept in
Philosophy
What does ontology mean, why we use it?
People can not share knowledge if they do
not speak a common language.
What does it mean to speak a common
language?
To speak a common language
Common symbols and concept (Syntax)
Agreement about their meaning (Semantic)
Classification of Concepts (Taxonomy)
Associations and relations of Concepts
(Thesaurus)
Rules and knowledge about which relations
are allowed and make sense. (Ontology)
What is Ontology?
The study of being or existence
Describes the basic categories and
relationships of being to define entities and
types of entities.
91
Relational Controlled Vocabulary Types
Ontology in Computer Science.
Thomas R. Gruber: “An ontology is an explicit formal specification of a shared conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of existence. For AI systems, what ‘exists’ is that which can be represented.”
Conceptualization: An abstract model (domain, identified relevant concepts, relations), that is, a model of a domain within which we try to identify relations.
Explicit: The meaning of all concepts must be defined.
Formal: Machine understandable (a machine can interpret it correctly).
Shared: There is consensus about the ontology.
92
Relational Controlled Vocabulary Types
Example
93
Relational Controlled Vocabulary Types
When you develop an ontology, you can define your own relationships and attributes, as well as classes of things that are categorized by these relationships and attributes.
An ontology can be used by a software system to infer new information, such as class membership. For example, if Jack has a "plays instrument" property with the value "guitar" and the ontology says that anyone with a "plays instrument" value is a musician, we can infer that Jack is a musician even if there is no explicit data saying that he is a member of that class.
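The Jack example can be sketched as a tiny rule engine. This is an illustration under stated assumptions, not an OWL reasoner: the rule format and the names `facts`, `rules` and `infer_types` are hypothetical.

```python
# Facts as (subject, property, value) triples, following the example above.
facts = {("Jack", "playsInstrument", "guitar")}

# Hypothetical rule format: anyone with a value for `prop` belongs to `cls`.
rules = [("playsInstrument", "Musician")]


def infer_types(facts, rules):
    """Return (subject, 'type', class) triples implied by the rules."""
    inferred = set()
    for subject, prop, _value in facts:
        for rule_prop, cls in rules:
            if prop == rule_prop:
                inferred.add((subject, "type", cls))
    return inferred


print(infer_types(facts, rules))  # {('Jack', 'type', 'Musician')}
```

Nothing in `facts` says that Jack is a musician; the class membership comes only from combining the data with the rule. In RDF Schema, a rule of this shape corresponds to declaring a domain for the property.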
Adoption in Library and Information Science:
B.C. Vickery first drew attention to the concept of ontology for organizing knowledge in the wake of its increasing complexity. In the field of information management, an ontology basically defines a common vocabulary for users who need to share information in a domain. Its distinguishing feature is that it includes machine-interpretable definitions of basic concepts and the relations among them. The role of ontology will be in building technologies, standards and tools to create information resources on the web in such a way that computer software can easily read and process information from those documents for search and retrieval on a global scale, e.g. digital libraries and ontology libraries.
94
RELATIONAL CONTROLLED VOCABULARY
Semantic Network
95
Relational Controlled Vocabulary Types
What is Semantic Network?
Semantic networks can be thought of as super-
thesauri. Each network can be represented in a
directed graph of concept nodes connected by
relations as well as some additional relations such
as whole-part, cause-effect, or parent-child
relationships.
A finite list of relations used in semantic networks does not exist. Semantic network relations can extend to provenance, that is, the record of how a particular value or record came to be. Provenance can include things like when, by whom, and how the item was created or modified.
How it works?
Information technology experts tend to use semantic
networks to establish complex search interfaces, which
can help a user locate the most appropriate results based
on the search term. Since semantic networks describe
complex relationships, the search interface can be
programmed to interpret the user entry into various
nodes, which are included in a semantic network. The
resulting search is more exhaustive than that provided by
a multi-level set of values, because the system can be set
up to return results from different levels or categories
based upon relations.
96
Relational Controlled Vocabulary Types
Example
This very simple diagram of a
semantic network illustrates the
directed nature of relationships. For
example, using this diagram, you
can make the statement "A fish is
an animal that lives in the water."
Or, "A bear is a mammal (a type of
animal with a vertebra) that has
fur."
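A semantic network like this one can be stored as a list of labeled, directed edges. The sketch below is reconstructed from the two sentences above rather than from the diagram itself, so the node names, relation labels and function names are all assumptions.

```python
# Labeled, directed edges (source, relation, target). These are reconstructed
# from the two example sentences above, not from the original diagram.
EDGES = [
    ("fish", "is a", "animal"),
    ("fish", "lives in", "water"),
    ("bear", "is a", "mammal"),
    ("bear", "has", "fur"),
    ("mammal", "is a", "animal"),
    ("mammal", "has", "vertebra"),
]


def outgoing(node):
    """Return (relation, target) pairs for edges that start at node."""
    return [(rel, tgt) for src, rel, tgt in EDGES if src == node]


def inherited(node):
    """Collect the node's own edges plus those of everything it 'is a'."""
    found, todo, seen = [], [node], set()
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for rel, tgt in outgoing(current):
            found.append((current, rel, tgt))
            if rel == "is a":
                todo.append(tgt)
    return found


for src, rel, tgt in inherited("bear"):
    print(src, rel, tgt)
```

Because the edges are directed, `outgoing("animal")` is empty: the network says that a fish is an animal, not that an animal is a fish.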
97
PART THREE CONTROLLED VOCABULARY
VS NATURAL LANGUAGE
Controlled Vocabulary Vs Natural Language
There are three types of indexing language:
1) Controlled indexing language: Only authorized terms can be used by the indexer to describe the documents.
2) Natural language: Any term that appears in the title, abstract or text of a document record may be an index term.
3) Free indexing language: Any term (not only from the document) can be used to describe the document.
99
Controlled Vocabulary Vs Natural Language
Controlled Indexing
1) When indexing a document, the indexer also has to choose the level of indexing exhaustivity. For example, with low indexing exhaustivity, minor aspects of a work will not be described with index terms.
2) CVs are often claimed to improve the accuracy of free-text searching, for example by reducing the irrelevant items in the retrieval list.
3) A controlled vocabulary can dramatically increase the performance of an information retrieval system, if the performance is measured by precision.
Natural Indexing
1) Free-text search uses natural language indexing, with indexing exhaustivity set to the maximum (every word in the text is indexed).
2) Retrieving irrelevant items may create confusion.
3) Less precision.
100
Controlled Vocabulary Vs Natural Language
Controlled Indexing (continued)
4) In some cases, controlled vocabulary can enhance recall as well, because unlike natural language schemes, once the correct authorized term is searched, you do not need to worry about searching for other terms that might be synonyms of that term.
5) A controlled vocabulary search may also lead to unsatisfactory recall, in that it will fail to retrieve some documents that are actually relevant to the search question.
6) Not immediately up to date.
7) An artificial language has to be learnt by the searcher.
8) High input cost.
Natural Indexing (continued)
4) In natural language, exact matching is required to increase recall.
5) By contrast, all the text of an article is indexed.
6) Up to date. New terms are immediately available.
7) Natural language words are used by the indexer as well as the searcher.
8) Low input cost, etc.
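Precision and recall, the two measures used in the comparison above, can be computed directly. The sketch below uses an invented three-document collection and the Elevators-Lifts pair mentioned earlier; the documents, index terms and names such as `precision_recall` are assumptions made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: document id -> words in its text.
DOCS = {
    1: {"lift", "maintenance"},
    2: {"elevator", "safety"},
    3: {"lift", "weights", "gym"},  # uses "lift" in a different sense
}
RELEVANT = {1, 2}  # documents actually about elevators

# Hypothetical controlled indexing: authorized terms assigned by an indexer.
INDEX = {1: {"Elevators"}, 2: {"Elevators"}, 3: {"Weight lifting"}}
USE = {"lift": "Elevators", "elevator": "Elevators"}  # entry term -> heading

free_text = {d for d, words in DOCS.items() if "lift" in words}
controlled = {d for d, terms in INDEX.items() if USE["lift"] in terms}

print(precision_recall(free_text, RELEVANT))   # (0.5, 0.5)
print(precision_recall(controlled, RELEVANT))  # (1.0, 1.0)
```

In this toy case the free-text search for "lift" misses the document that says "elevator" and picks up the homograph, while the authorized heading avoids both problems. Real collections will not separate this cleanly.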
101
Conclusion
Vocabulary control is used to improve the effectiveness of information storage and retrieval systems, web navigation systems and other environments that seek to identify and locate desired content via some sort of description using language. A controlled vocabulary is like a fishing net which helps the user to fetch accurate information from the mountain of knowledge. It has parallel advantages from both the staff and the user point of view. Electronic publishing is overtaking print publishing. Academia is moving towards electronic publishing, and service providers are deeply involved in gathering information in one place for their clients. Vocabulary control has increased the level of efficiency in retrieving information by establishing relations among the contents of documents. It gives the user community the power to reach their desired level of information. The absence of a controlled vocabulary system in e-content management as well as library management is like a hawk-eyed man walking in a dark night.
102
References (MLA7)
1. "What Is a Controlled Vocabulary?" What Is a Controlled Vocabulary? | Marine Metadata Interoperability. N.p., n.d. Web. 01 Dec. 2016.
2. Leise Fred. "Controlled Vocabularies: An Introduction." Latest TOC RSS. Society of Indexers, n.d. Web. 01 Dec. 2016.
3. Gilchrist Alan. "Thesauri, Taxonomies and Ontologies - an Etymological Note." Journal of Documentation 59.1 (2002): 7-18. Web. 01 Dec. 2016.
4. Steckel, By Fred Leise Karl Fast and Mike. "What Is A Controlled Vocabulary?" Boxes and Arrows. N.p., 14 Nov. 2013. Web. 10 Dec. 2016.
5. Noruzi Alireza. "Folksonomies: (Un)Controlled Vocabulary?" Knowledge Organization 33.4 (2006): 199-203. Web.
6. Noy Natalya F., and McGuinness Deborah L. "Ontology Development 101: A Guide to Creating Your First Ontology." (n.d.): 1-24. Web. 01 Dec. 2016.
7. Slimani, Thabet. "Ontology Development: A Comparing Study on Tools, Languages and Formalisms." Indian Journal of Science and Technology 8.24 (2015): 1-12. Web. 01 Dec. 2016.
8. Giri, Kaushal. "Role of Ontology in Semantic Web." DESIDOC Journal of Library & Information Technology 31.2 (2011): 116-20. Web. 01 Dec. 2016.
9. "Recent Review of Sears List of Subject Headings." Rev. of Sears List of Subject Headings.Technicalities Mar.-Apr. 2015: 0-1. H.W Wilson. Web. 01 Dec. 2016.
10. Hedden, Heather. "Controlled Vocabularies, Thesauri, and Taxonomies." The Indexer 26.1 (n.d.): 33-34. Web. 01 Dec. 2016.
103
References
Website
12. https://www.creighton.edu/fileadmin/user/HSL/docs/ref/Searching_Databases_-_Controlled_Vocab_-_MeSH.pdf
13. http://www.topquadrant.com/docs/whitepapers/cvtaxthes.pdf Access Date 01-12-2016
14. http://www.getty.edu/research/publications/electronic_publications/intro_controlled_vocab/what.pdf Access Date 01/12/2016
15. http://books.infotoday.com/books/The-Accidental-Taxonomist/At-SampleChapter.pdf 01/12/2016
17. https://www.youtube.com/watch?v=i1tSsWdygS8 Access Date 01/12/2016
104