retrieval effectiveness of tagging systems...social tagging is a widespread activity for indexing...
TRANSCRIPT
Retrieval Effectiveness of Tagging SystemsIsabella Peters, Laura Schumann, Jens Terliesner, Wolfgang G. Stock{isabella.peters | laura.schumann | jens.terliesner}@uni-duesseldorf.de
Heinrich-Heine-University, Department of Information ScienceUniversitatsstraße 1, 40225 Dusseldorf, Germany
Paper and fullreference list at:
http://tinyurl.com/retrieval-test
Abstract
Social tagging is a widespread activity for indexing user-generatedcontent on Web services. This poster summarizes research onfolksonomies and their retrieval effectiveness. A TREC-like retrievaltest was conducted with tags and resources from the socialbookmarking system delicious, which resulted in recall and precisionvalues for tag-only searches. Moreover, several experimental tag-based databases (i.e., Power tags, Luhn tags) have been testedregarding their retrieval effectiveness. Test results show thatfolksonomies work best with short queries although recall values arehigh and precision values are low. Here, a search function “Powertags only” greatly enhances precision values.
1. Extract documents and tags
I 1,989 resources consisting of the docsonomies of thedelicious-bookmarks
I collected from delicious.com in October 2010
I tags form the indexed databases for the retrieval test runs
I each resource is tagged from at least 30 users
I each resource was tagged with “folksonomy”,“seo” or“folksonomies”
2. Adjust tags
I original tags (Information Retrieval)
I tags unified, i.e. without special characters (informationretrieval)
I tags unified and stemmed (informationretriev)
Relevance
Queries Tags
All tags
Pow
er tags
Luh
n tags
4. Create information needs
55 information needs and search tasks have been created. They vary in theircomplexity as one can see in the following examples: simple lookup, complexlookup, exploratory search task.
I Find a thesaurus
I Find articles which present social bookmarking tools
I Find articles which advise the combination of folksonomies and controlledvocabularies for indexing
5. Judge relevance of documents
I 24 students and 9 people from staff of the department act as assessors
I assessments are binary and are conducted manually
I assessors had access to resources (i.e., websites). Tags were hidden forrelevance assessments
I if two assessors agreed in their decision the relevance judgment was savedimmediately
I this procedure results in 109,395 relevance judgments
6. Create queries
I 25 students and 2 people from staff created queries for each information need→ 730 queries
I Boolean operators and brackets are allowed (they can be used in delicious)
I minimum number of query terms: 1, maximum number of query terms: 13
I average query length: 3 terms
I experts from staff built queries for each search task with a maximum number of5 query terms (simulation of real-world-users: they use 2 terms or less)
3. Extract Power tags and Luhn tags
The algorithms developed for Power tag and Luhn tag extractionwork differently and depend on the tag distribution of thedocsonomy.
0
0,2
0,4
0,6
0,8
1
1,2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0,00
0,20
0,40
0,60
0,80
1,00
1,20
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
7. Retrieval and evaluation
The figures A to F visualize the results of the retrieval test and display average recall values, average precision values and values for F-measure for conducted search runs. Figures A to C visualize the results of the expert searches, figures D to F showthe results of simple lookup-search tasks requesting one query term. Figures G and H visualize the relative benefit or loss while using Power tags and Luhn tags with expert queries (G) and one word queries (H).
Test results show that retrieval in folksonomies works best with very short queries. Here, a search function “Power tags only” greatly enhances retrieval effectiveness.
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Original tags 55 information needs
Expert queries
all delicious tags restricted to Luhn tags restricted to Luhn tags and Power
tags restricted to Power tags
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Unified tags 55 information needs
Expert queries
all delicious tags, unified restricted to Luhn tags, unified restricted to Luhn tags and Power
tags, unified
restricted to Power tags, unified
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge r
eca
ll fo
r d
atab
ase
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
me
an a
vera
ge p
reci
sio
n f
or
dat
abas
e
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Unified and stemmed tags 55 information needs
Expert queries
all delicious tags, unified and stemmed
restricted to Luhn tags, unified and stemmed
restricted to Luhn tags and Power tags, unified and stemmed
restricted to Power tags, unified and stemmed
Expert queries
0%
20%
40%
60%
80%
100%
120%
140%
160%
original tags original tags
restricted to
Luhn tags
original tags
restricted to
Power tags
unified tags unified tags
restricted to
Luhn tags
unified tags
restricted to
Power tags
unified and
stemmed
tags
unified and
stemmed
tags
restricted to
Luhn tags
unified and
stemmed
tags
restricted to
Power tags
Baseline = F-measure of delicious-tags for expert queries
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Original tags 5 information needs
One word query
all delicious tags restricted to Luhn tags restricted to Luhn tags and Power
tags restricted to Power tags
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Unified tags 5 information needs
One word query
all delicious tags, unified restricted to Luhn tags, unified restricted to Luhn tags and Power
tags, unified
restricted to Power tags, unified
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
re
call
for
dat
abas
e
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
ave
rage
pre
cisi
on
fo
r d
atab
ase
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
F-m
eas
ure
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
Unified and stemmed tags 5 information needs
One word query
all delicious tags, unified and stemmed
restricted to Luhn tags, unified and stemmed
restricted to Luhn tags and Power tags, unified and stemmed
restricted to Power tags, unified and stemmed
0%
20%
40%
60%
80%
100%
120%
140%
160%
original tags original tags
restricted to
Luhn tags
original tags
restricted to
Power tags
unified tags unified tags
restricted to
Luhn tags
unified tags
restricted to
Power tags
unified and
stemmed
tags
unified and
stemmed
tags
restricted to
Luhn tags
unified and
stemmed
tags
restricted to
Power tags
Baseline = F-measure of delicious-tags for one word queries
74th Annual Meeting of the American Society for Information Science and Technology, October 9-13, 2011, New Orleans, Louisiana The research is funded by the DFG (STO 764/4-1). We thank students and staff for their valuable contribution.