+
User-induced Links in Collaborative Tagging Systems
Ching-man Au Yeung, Nicholas Gibbins, Nigel Shadbolt CIKM’09
Speaker: Nonhlanhla Shongwe
18 January 2009
+ 2
Preview
Introduction
Collaborative tagging
User-Induced hyperlinks
  Similarity of Assigned Tags
  Association Rule Mining
Analysis of User-induced links
Tag Prediction
Discussion
Conclusion
+ 3
Introduction
Hyperlinks
  Make navigation through the Web possible
  The author decides which documents to link to
The limited set of links that authors provide has led to user-contributed content on the Web.
In social bookmarking sites, e.g. Delicious
  Users can maintain a collection of documents
  URLs are identified by the tags that users choose
+ 4
Collaborative tagging (1/2)
Popular tagging systems, e.g. Delicious and LibraryThing
  Allow users to describe their favorite online resources using their own words
  E.g. http://www.cnn.com with the tags news, tv, sports, weather, travel
Advantages over traditional methods
  Flexibility and freedom offered by these systems
  Systems are quick to adapt to changes in the vocabulary among the users
+ 5
Collaborative tagging (2/2)
The collaborative tagging activities of participating users result in a scheme called a folksonomy
A folksonomy consists of three types of elements
  Users
    Assign tags to Web documents
  Tags
    Keywords chosen by users to describe and categorize a Web document
  Documents
    Objects tagged by the users
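The three element types above can be held together as a set of (user, tag, document) assignments. The following is only a minimal sketch under that assumption; the sample users, the second URL, and the helper names `tags_by_document` and `users_by_document` are hypothetical and do not come from the talk.

```python
from collections import defaultdict

# Hypothetical sample data: each triple is (user, tag, document).
assignments = {
    ("alice", "news", "http://www.cnn.com"),
    ("alice", "tv", "http://www.cnn.com"),
    ("bob", "news", "http://www.cnn.com"),
    ("bob", "travel", "http://example.org/trips"),
}


def tags_by_document(triples):
    """Count how many users assigned each tag to each document."""
    counts = defaultdict(lambda: defaultdict(int))
    for _user, tag, doc in triples:
        counts[doc][tag] += 1
    return {doc: dict(tags) for doc, tags in counts.items()}


def users_by_document(triples):
    """Collect the set of users who tagged each document."""
    index = defaultdict(set)
    for user, _tag, doc in triples:
        index[doc].add(user)
    return dict(index)
```

The per-document tag counts feed the tag-similarity approach, and the per-document user sets feed the approach based on collective user behavior.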
+ 6
User-Induced hyperlinks
Two types of hyperlinks
  For navigation
  For recommendation
    Direct users to other documents that contain related information
Two different approaches to discovering implicit relations in a folksonomy
  Calculating the similarity between the sets of tags assigned to the documents
  Analyzing the collective behavior of the users who have tagged the documents
User-induced links are implicit links in a folksonomy that result from the collaborative tagging activities of users
+ 7
Similarity of Assigned Tags (1/4)
First approach to discovering user-induced links
  Calculate the pairwise similarity between documents based on their tags
  Jaccard coefficient
  In IR, cosine similarity
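As an illustration of the Jaccard coefficient, the sketch below compares two documents by the sets of tags assigned to them (size of the intersection divided by size of the union). The function name and the choice to return 0.0 for two untagged documents are assumptions made for this example.

```python
def jaccard(tags_a, tags_b):
    """Jaccard coefficient of two tag collections: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        # Assumption: two untagged documents are treated as not similar.
        return 0.0
    return len(a & b) / len(a | b)


# Two of the four distinct tags are shared.
score = jaccard({"news", "tv", "sports"}, {"news", "sports", "travel"})
```

Note that the Jaccard coefficient ignores how often each tag was assigned; the cosine similarity can take those frequencies into account.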
+ 8
Similarity of Assigned Tags (2/4)
Cosine Similarity
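The formula on this slide did not survive in the transcript. A minimal sketch of the standard cosine similarity is given here, assuming each document is represented as a vector of tag weights (for example, the number of users who assigned each tag). The dictionary representation and the function name are assumptions, and the weighting used in the paper may differ.

```python
import math


def cosine_similarity(weights_a, weights_b):
    """Cosine of the angle between two sparse tag-weight vectors (dicts)."""
    dot = sum(w * weights_b.get(tag, 0) for tag, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        # Assumption: a document without tags is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Because the result is normalized by the vector lengths, two documents with the same tag proportions score 1 even if one of them has been tagged by far more users.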
+ 9
Similarity of Assigned Tags (3/4)
Second similarity function
  The normalized discounted cumulative gain (NDCG)
    Used to evaluate a ranking of documents according to their relevance scores
  Firstly, list the tags of the two documents
  Secondly, calculate the DCG at position p
+ 10
Similarity of Assigned Tags (4/4)
Thirdly, calculate the ideal DCG (iDCG)
Finally, calculate the NDCG
Use a function
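The formulas for these steps are missing from the transcript, so the following is only one plausible reading. It assumes that the tags of the first document are ranked by frequency, that the gain of each tag is its frequency in the second document, and that DCG uses the common rel_i / log2(i + 1) discount. The function names and the tie-breaking rule are hypothetical.

```python
import math


def dcg(gains, p=None):
    """DCG at position p: sum of gain_i / log2(i + 1) for ranks i = 1..p."""
    if p is None:
        p = len(gains)
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains[:p], start=1))


def ndcg_similarity(freq_a, freq_b, p=None):
    """Score how well the tag ranking of document A matches document B."""
    # Step 1: list the tags of A, most frequent first (ties alphabetical).
    ranked = sorted(freq_a, key=lambda tag: (-freq_a[tag], tag))
    # Step 2: DCG of that ranking, using B's frequencies as gains.
    gains = [freq_b.get(tag, 0) for tag in ranked]
    # Step 3: ideal DCG, obtained from B's own best possible ordering.
    ideal = sorted(freq_b.values(), reverse=True)
    idcg = dcg(ideal, p)
    if idcg == 0:
        return 0.0
    # Step 4: normalize.
    return dcg(gains, p) / idcg
```

Under these assumptions the measure is not symmetric: swapping the two documents can change the score, so a symmetric variant would need to combine both directions.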
+ 11
Association Rule Mining
Second approach to discovering user-induced links
  Find pairs of Web documents that have both been tagged by the same group of users
  Aims at identifying implicit patterns within a large database of transactions
  Two major concepts
    Support
    Confidence
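To make support and confidence concrete, the sketch below treats each user's collection of bookmarked documents as one transaction. Support is counted as the number of users who bookmarked every document in an item set, and the confidence of a rule A -> B is support({A, B}) / support({A}). The sample collections and function names are hypothetical.

```python
def support(transactions, items):
    """Number of transactions (user collections) containing all the items."""
    items = set(items)
    return sum(1 for collection in transactions if items <= collection)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(transactions, {antecedent})
    if base == 0:
        return 0.0
    return support(transactions, {antecedent, consequent}) / base


# Hypothetical data: one set of bookmarked documents per user.
collections = [
    {"d1", "d2"},
    {"d1", "d2", "d3"},
    {"d1"},
    {"d2", "d3"},
]
```

Confidence is directional: here two of the three users with d1 also have d2, while both users with d3 also have d2, so a rule d3 -> d2 is stronger than d1 -> d2.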
+ 12
Analysis of User-Induced Links (1/3)
With the two methods described
  Identify user-induced links in data collected from Delicious
  Compare them with existing hyperlinks in terms of several different aspects
Several aspects to compare
  Do they connect two documents from the same domain/website?
  Similarity between the documents on the two ends of a link
  Whether users are equally interested in the linked documents
+ 13
Analysis of User-Induced Links (2/3)
Data collection
  Data collected from Delicious
  Documents cover a wide range of topics
  Documents collected on a per-tag basis
    First, 130 popular tags were selected at random
    For each tag, Delicious was crawled to obtain a set of documents and the users who have tagged them
+ 14
Analysis of User-Induced Links (3/3)
Results
  Identify user-induced links between the documents using the two methods
  For similarity, vary the similarity threshold up to 0.5
  For association rules, set the minimum support to 100 and vary the minimum confidence level
Findings
  Very few user-induced links have a confidence of 0.5 and above
+ 15
Results (1/8)
+ 16
Results (2/8) on Same Domain
One important function of hyperlinks
  Allow users to navigate from one hypertext document to another
More beneficial if the links point to documents external to the current website
Check whether the documents at the two ends of a link are from the same domain
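A same-domain check can be sketched with Python's standard `urllib.parse`. This version compares full host names, which is an assumption; the talk does not say whether subdomains such as `www.` and `edition.` count as the same domain.

```python
from urllib.parse import urlparse


def same_domain(url_a, url_b):
    """True if both URLs have a host name and the host names are equal."""
    host_a = urlparse(url_a).hostname  # lowercased; None if missing
    host_b = urlparse(url_b).hostname
    return host_a is not None and host_a == host_b
```

A link for which this returns False points outside the current website, which is the case the slide describes as more beneficial.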
+ 17
Results (3/8) on Same Domain
+ 18
Results (4/8) on Coincidence between existing hyperlinks and user-induced links
See whether such links already exist between the documents
If user-induced links coincide with existing hyperlinks, it means that users are satisfied with the existing hyperlinks
If user-induced links are mostly new, it means that there are user interests and perspectives that existing hyperlinks have not captured
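The comparison above amounts to measuring the overlap between two sets of links. A minimal sketch, assuming each link is a directed (source, target) pair; the function name and the sample links are hypothetical.

```python
def coincidence_ratio(induced_links, existing_links):
    """Fraction of user-induced links that already exist as hyperlinks."""
    induced = set(induced_links)
    if not induced:
        return 0.0
    return len(induced & set(existing_links)) / len(induced)


# Hypothetical links between documents a, b, c, d.
induced = {("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")}
existing = {("a", "b"), ("x", "y")}
ratio = coincidence_ratio(induced, existing)
```

A ratio close to 1 would correspond to the first case on the slide (users are satisfied with existing hyperlinks), and a ratio close to 0 to the second (mostly new links).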
+ 19
Results (5/8) on Coincidence between existing hyperlinks and user-induced links
+ 20
Results (6/8) on similarity and user preferences
Look at documents that are connected by user-induced links
  Blog posts on highly related topics
  News articles on the same topics
  Websites offering applications with similar functionality
  Q&A pages of some portal site
Two different approaches for generating user-induced links
  Association rule mining: a link is generated if enough users are interested in two documents, regardless of the similarity between them
  Similarity-based: links are generated based on the tags assigned, regardless of whether there are many users interested in the documents
+ 21
Results (7/8) on similarity and user preferences
+ 22
Results (8/8) on similarity and user preferences
+ 23
Tag Prediction (1/3)
The analysis of user-induced links shows that links generated by association rule mining of user collections usually connect documents that are highly related to each other, as judged by the similarity between their tags
To predict the tags of a document
  Identify the other documents that have a link to this document
The set of documents that have a link (dx)
+ 24
Tag Prediction (2/3)
Firstly, consider a simple averaging method
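The formula for the averaging method is not preserved in the transcript. One plausible reading, sketched here as an assumption, is that the predicted weight of a tag is the mean of its weights across the documents that link to the target document. The function name and the dictionary representation are hypothetical.

```python
def predict_tags_by_averaging(linked_docs):
    """Average the tag weights of the documents linking to the target.

    linked_docs: list of dicts mapping tag -> weight, one per linked document.
    A tag missing from a document contributes a weight of 0 for that document.
    """
    if not linked_docs:
        return {}
    totals = {}
    for weights in linked_docs:
        for tag, weight in weights.items():
            totals[tag] = totals.get(tag, 0.0) + weight
    count = len(linked_docs)
    return {tag: total / count for tag, total in totals.items()}


predicted = predict_tags_by_averaging([
    {"news": 1.0, "tv": 0.5},
    {"news": 0.5},
])
```

Sorting the predicted tags by descending weight gives a ranking that can then be compared against the tags users actually assigned.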
+ 25
Tag Prediction (3/3)
Secondly, consider an aggregation method
+ 26
Experiments (1/2)
Measure the performance of the predictions by using
  NDCG
  Precision at the nth term
NDCG was used
  To investigate whether the predictions are accurate in terms of the ordering of the tags
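Precision at the nth term can be sketched as the fraction of the top n predicted tags that appear among the tags actually assigned to the document. The function name and the sample tags are hypothetical, and dividing by n even when fewer than n tags were predicted is an assumption.

```python
def precision_at_n(predicted_ranking, actual_tags, n):
    """Fraction of the top-n predicted tags that are actual tags."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    actual = set(actual_tags)
    hits = sum(1 for tag in predicted_ranking[:n] if tag in actual)
    return hits / n


ranking = ["news", "tv", "travel", "sports"]
actual = {"news", "sports"}
```

Unlike NDCG, this measure ignores the order of the tags within the top n, which is why the two measures are reported together.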
+ 27
Experiments (2/2)
+ 28
Discussion
Implicit relations between Web documents can be discovered by examining the user preferences and document similarity embedded in a folksonomy
User-induced links are different from hyperlinks
  The collaborative tagging environment shows the differences between the perspectives of Web authors and Web readers
It is worthwhile to consider an open hypermedia structure backed by a collaborative tagging system.
+ 29
Conclusion
User-induced links are a form of implicit relation between documents
We used
  Tag similarity
    to generate many user-induced links
  Association rule mining
    to generate very high user-induced links
+ 30
Thank you for your attention