The growing pains of a controlled vocabulary
![Page 1: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/1.jpg)
Karen Loasby, 7 March 2005
The growing pains of a controlled vocabulary
![Page 2: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/2.jpg)
Introduction
• Karen Loasby
• Information architect
• Worked for BBC for 4 years on search, navigation, metadata and content management projects
• 2 years previously for the Guardian newspaper archiving the paper and arranging content on the website
• MSc in Information Science from City University, London
![Page 3: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/3.jpg)
Agenda
• Background
• The problem
• Formal classification vs. Folk tags
• Our middle ground
• What happened
• Learning points
• Questions
![Page 4: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/4.jpg)
Background
• Content management project
• Regional websites
• Need for metadata
• Authors around the UK
![Page 5: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/5.jpg)
![Page 6: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/6.jpg)
Problem
• Faceted classification system
• Authors to tag
• Central control
• But …
• Journalists are the specialists – know the domain and the vocabulary.
![Page 7: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/7.jpg)
Formal classification
• Pre-determined terms
• Centralised control
• Rich relationships
![Page 8: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/8.jpg)
Folk tags
• What is it then?
• Folksonomy, ethnoclassification, social classification, social categorisation and so on
![Page 9: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/9.jpg)
Comparing approaches
Formal
• High maintenance
• Consistent/predictable
• Rich relationships
• Can be artificial
Folk
• Low maintenance
• Quirky/surprising
• Less added value
• Real user language
![Page 10: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/10.jpg)
A role for both
• Where we are using folk tagging
• And where we won’t
– Trust & Authority
– High value to business
– Missing motivation from users
– Broad domain/user base
– To avoid tyranny of the minority
![Page 11: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/11.jpg)
An experimental middle ground
• Centralised control of terms
• But encouraging absorption of user language
• Higher maintenance than folk tags
• Cheaper than professional cataloguing
![Page 12: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/12.jpg)
BBC Experience
Semi-automatic classification
• Terms suggested from the CVs
– Terms are OK
– The suggested terms do not describe the content
• Search or browse for terms
– Terms are OK
– Send suggestion to the CV team
• CV team evaluates suggestion
– Say no to the term – change the classification on the content object
– Add to CV as a variant term or preferred term
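The flow above can be sketched in a few lines of Python. This is only an illustration, not the BBC system: the class `ControlledVocabulary`, the function `evaluate_suggestion`, the decision names and the lower-casing of terms are all assumptions made for the example, and the variant term "Expecting" is invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ControlledVocabulary:
    """Hypothetical CV: preferred terms plus variants that point at them."""

    preferred: set[str] = field(default_factory=set)
    variants: dict[str, str] = field(default_factory=dict)  # variant -> preferred

    @staticmethod
    def _key(term: str) -> str:
        # Assumption: terms are matched case-insensitively.
        return term.strip().lower()

    def resolve(self, term: str) -> str | None:
        """Return the preferred term for `term`, or None if it is unknown."""
        key = self._key(term)
        if key in self.preferred:
            return key
        return self.variants.get(key)

    def add_preferred(self, term: str) -> None:
        self.preferred.add(self._key(term))

    def add_variant(self, term: str, preferred: str) -> None:
        target = self.resolve(preferred)
        if target is None:
            raise ValueError(f"unknown preferred term: {preferred!r}")
        self.variants[self._key(term)] = target


def evaluate_suggestion(
    cv: ControlledVocabulary,
    suggestion: str,
    decision: str,
    maps_to: str | None = None,
) -> str | None:
    """Apply the CV team's decision to a journalist's suggested term.

    Returns the preferred term to store on the content object, or None
    when the term is rejected and the content must be reclassified.
    """
    if decision == "reject":
        return None
    if decision == "variant":
        if maps_to is None:
            raise ValueError("a variant needs the preferred term it maps to")
        cv.add_variant(suggestion, maps_to)
    elif decision == "preferred":
        cv.add_preferred(suggestion)
    else:
        raise ValueError(f"unknown decision: {decision!r}")
    return cv.resolve(suggestion)


cv = ControlledVocabulary()
cv.add_preferred("Pregnancy")
# A journalist finds no suitable term and sends a suggestion to the CV team,
# who accept it as a variant of an existing preferred term.
evaluate_suggestion(cv, "Expecting", "variant", maps_to="Pregnancy")
print(cv.resolve("Expecting"))  # pregnancy
```

Storing variants as pointers to a preferred term keeps the classification on the content object consistent while still absorbing the journalists' own language.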
![Page 13: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/13.jpg)
Operational system
• 8000 requests in 10 months
• From 160 journalists
– Average per user of 50 terms
– However this varied wildly. Our top user has suggested 476 terms
![Page 14: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/14.jpg)
Graph showing variation between teams
[Chart; vertical axis from 0 to 800. Teams shown: Cumbria, Tyne, Cambridgeshire, Guernsey, Leicester, South Yorkshire, Wiltshire, Suffolk, Liverpool, Manchester, Berkshire, Bristol, Kent, Coventry, Tees, Jersey, Stoke & Staffs, Nottingham, Derby, Humber, Somerset, Northamptonshire, Norfolk, Beds, Bucks & Herts, Leeds, Hereford & Worcs, Birmingham]
![Page 15: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/15.jpg)
Growth in the CVs
• Up 15,000 terms in 10 months
• Most growth in person/proper names
• People, venues and organisations
• Up by 50% to 35,000
![Page 16: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/16.jpg)
Growth of facets
CV Requests By Month
[Chart: quantity of requests (axis from 0 to 7000) by month, one series per facet: Name, Location, Subject, BBC Brand, Time Period]
![Page 17: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/17.jpg)
Types of terms
• Mostly good
– Only 200 terms actually rejected
• Synonyms vs. entirely new terms
– New for names (only 2% synonyms)
– Synonyms for subject (15% synonyms)
– Location – needed colloquial terms
![Page 18: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/18.jpg)
Resourcing
• Handling the requests from journalists
• First 3 months – one IA
• Subsequently 2 to 3 junior IAs
• Too much – how to reduce?
![Page 19: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/19.jpg)
Lessons learnt
• Success with the journalists
– They suggested terms!
– Got the faceted classification
– Began to suggest terms in “our” format
– Some did engage at a detailed level
![Page 20: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/20.jpg)
Lessons learnt
• Difficulties for journalists
– System looks as if it is totally automatic, as part of a content management system
– “Journalists are people too”
– Users struggled with a tagging system based on content objects rather than pages
![Page 21: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/21.jpg)
Example
Subject: Pregnancy
![Page 22: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/22.jpg)
Lessons learnt
• Difficulties for journalists, cont.
– They find it boring
– Makes it harder for the aim of “finding and re-use” to apply
– Needed to do more pre-emptive work for them
![Page 23: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/23.jpg)
Lessons learnt
• Number of terms suggested depends on
– Type of facet
– Dynamism of content
– Scope of the content
– Enthusiasm of users
![Page 24: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/24.jpg)
Next?
• High value facets still need control
– Make use of the metadata(!)
– Sell the message
– Federated management
– Earlier in production
• And for folk tagging?
![Page 25: The growing pains of a controlled vocabulary](https://reader036.vdocuments.us/reader036/viewer/2022062510/54594e3db1af9f33608b5406/html5/thumbnails/25.jpg)
Thanks to the IA team for their analysis work:
– Jon Carey
– Adil Hussein
– Christine Rimmer