1
Linking Topics of News and Blogs with Wikipedia for Complementary Navigation
Yuki Sato† Daisuke Yokomoto†
Hiroyuki Nakasaki† Mariko Kawaba‡
Takehito Utsuro† Tomohiro Fukuhara††
†University of Tsukuba ‡NTT Cyber Space Laboratories, NTT Corporation
††University of Tokyo
2
3 Information Sources on the Web
Events in the Real World
• News: report precise facts
• Blog: subjective information: personal opinions and experiences
• Wikipedia: fundamental background facts and knowledge
3
Purpose of the research: Linking Topics of News and Blogs with Wikipedia
4
In July 2009, Dragon Quest 9 (a console role-playing game for the Nintendo DS) was released in Japan.
6
3 Information Sources on the Web
Events in the Real World: Dragon Quest 9 on sale
• News: report precise facts
• Blog: subjective information: personal opinions and experiences
• Wikipedia: fundamental background facts and knowledge
11
Overview of the Framework of Complementary Navigation
Events in the Real World
• News: report precise facts
• Blog: subjective information: personal opinions and experiences
• Wikipedia: fundamental background facts and knowledge
[Diagram: three links labeled "Complementary Navigation" among the sources; remaining Japanese labels not recoverable]
12
Outline of the Talk
• Purpose of the Work: Linking Topics of News and Blogs with Wikipedia
• Mock-up: from News to closely related Blog Posts
• Details of the Proposed Method
– Ranking related Wikipedia Entries given a News Article
– Ranking related Bloggers/Blog Posts given an Index Wikipedia Entry
• Evaluation
• Conclusion and Future Work
14
News Article: [screenshot; Japanese text not recoverable]
Blog Search
Queue of people waiting for Dragon Quest 9 to go on sale.
15
Complementary Navigation from News to Blog: Our Approach
Wikipedia Entries as Conceptual Search Index
News Blog
Wikipedia
1. Search for Wikipedia Entries closely related to the given News Article, and use them as the Search Index
2. Search for Bloggers/Blog Posts closely related to the Index Wikipedia Entries
16
News Article: [screenshot; Japanese text not recoverable]
Blog Search
Queue of people waiting for Dragon Quest 9 to go on sale.
Related Topics: [list of Japanese topic titles; text not recoverable]
17
Complementary Navigation from News to Blog: Our Approach
Wikipedia Entries as Conceptual Search Index
News Blog
Wikipedia
1. Search for Wikipedia Entries closely related to the given News Article, and use them as the Search Index
18
News Article: [screenshot; Japanese text not recoverable]
Related Topics: [list of Japanese topic titles; text not recoverable]
Blog Search
19
News Article: [screenshot; Japanese text not recoverable]
Blog Search
Queue of people waiting for Dragon Quest 9 to go on sale.
Related Topics: [list of Japanese topic titles, with one topic marked; text not recoverable]
20
Selected Topic: [Japanese topic title; text not recoverable]
Blog Post Ranking
1st: [Japanese blog post excerpt; text not recoverable]
2nd: [Japanese blog post excerpt; text not recoverable]
22
Another Topic which damages Blog Post Ranking
Selected Topic: [Japanese topic title; text not recoverable]
Blog Post Ranking (damaged)
1st, 2nd: [Japanese blog post excerpts; text not recoverable]
English glosses of the retrieved posts:
• Parade of a famous festival in Kyoto
• Queue at a noodle restaurant in Shibuya
23
News Article: [screenshot; Japanese text not recoverable]
Related Topics: [list of Japanese topic titles; text not recoverable]
Blog Search
Precision of Top 10 ranked Blog Posts per Topic: 40%, 50%, 60%, 10%, 30%, 10%, 0%, 0%, 10%, 0%
21% on the Average
Improved to 50% after Manually Selecting Relevant Topics
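The averages on this slide can be checked with a few lines of Python. This is only a sketch: `precision_at_k` and `mean` are hypothetical helper names, the relevance flags are made up for illustration, and the guess that the three manually selected topics are the ones scoring 40%, 50%, and 60% is an assumption that merely happens to reproduce the reported 50%.

```python
def precision_at_k(relevance_flags, k=10):
    """Percentage of the top-k ranked posts judged relevant."""
    top = relevance_flags[:k]
    return 100 * sum(top) / k


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Illustrative (made-up) relevance judgments for one topic's top 10 posts.
flags = [True, False, True, True, False, False, True, False, False, False]
example_precision = precision_at_k(flags)

# Per-topic precision values (percent) as listed on the slide.
precisions = [40, 50, 60, 10, 30, 10, 0, 0, 10, 0]
average_all = mean(precisions)

# Assumption: the manually selected topics are those with 40%, 50%, 60%.
selected = [40, 50, 60]
average_selected = mean(selected)

print(example_precision, average_all, average_selected)
```

Averaging all ten topics gives the 21% figure, which shows how strongly the low-precision general topics pull the result down.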
25
Complementary Navigation from News to Blog: Our Approach
Wikipedia Entries as Conceptual Search Index
News Blog
Wikipedia
1. Search for Wikipedia Entries closely related to the given News Article, and use them as the Search Index
26
Ranking Wikipedia Entries with Topic Related Terms
Nintendo
Dragon Quest series
Game
...
Related Wikipedia Entries
Top ranked 10 Entries
News Article
Dragon Quest 9 on sale
27
Extracting topic-related terms from Wikipedia
Wikipedia: describes background knowledge of the Topic
Types of topic-related terms
1. Bold Text
2. Redirect (paraphrase of the entry title)
3. Title of each paragraph
4. Noun phrase in body text
Extracted Related Terms: Super Mario Bros, Pokémon, Nintendo DS, Wii, Game, Sony, Role-playing game, Fighting game
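As a rough illustration of the term types listed above, the following Python sketch pulls bold text and section headings out of MediaWiki-style markup and merges in a list of redirects. It is not the authors' extractor: the function name `extract_terms`, the type labels, and the sample text are assumptions, and noun phrases in body text are omitted because they would need a language-specific tagger or morphological analyzer.

```python
import re


def extract_terms(wikitext, redirects):
    """Return {term: type} for three of the four term types on the slide."""
    terms = {}
    # Bold text is written as '''term''' in MediaWiki markup.
    for match in re.finditer(r"'''(.+?)'''", wikitext):
        terms.setdefault(match.group(1).strip(), "bold")
    # Section headings ("title of each paragraph") look like == Heading ==.
    heading = r"^==+\s*(.+?)\s*==+\s*$"
    for match in re.finditer(heading, wikitext, flags=re.MULTILINE):
        terms.setdefault(match.group(1), "heading")
    # Redirects are assumed to be supplied separately (e.g. from a dump).
    for title in redirects:
        terms.setdefault(title, "redirect")
    return terms


# Made-up sample markup, for illustration only.
sample = """'''Nintendo''' is a video game company.
== History ==
It released the '''Nintendo DS'''.
"""
terms = extract_terms(sample, redirects=["Nintendo Co., Ltd."])
print(terms)
```

Keeping the type alongside each term matters because the scoring step assigns a weight per term type.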
28
Ranking Wikipedia Entries with Topic Related Terms
type(t): type of related term t
weight(type(t)): weight of the type type(t)
weight(type(t) = Redirect) = 1
weight(type(t) = Bold text) = 1
weight(type(t) = Title of each paragraph) = 1
weight(type(t) = Noun phrase in body text) = 1
$$\mathrm{WikiNewsScore}(e, n) = \sum_{t} \bigl( \mathrm{weight}(\mathrm{type}(t)) \times \mathrm{freq}(t) \bigr)$$
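The score on this slide is a weighted sum of term frequencies, which can be sketched in Python as follows. Several points are assumptions rather than facts from the slides: that e is a Wikipedia entry and n a news article, that freq(t) counts occurrences of t in the news text, and that plain substring counting is an acceptable stand-in for whatever matching the authors used. The entries, term types, and news text are invented for illustration.

```python
# All four term types have weight 1 for news-to-Wikipedia ranking (per the slide).
WEIGHTS = {"redirect": 1, "bold": 1, "heading": 1, "noun_phrase": 1}


def wiki_news_score(entry_terms, news_text):
    """Sum over terms t of weight(type(t)) * freq(t), with freq counted in news_text."""
    return sum(
        WEIGHTS[term_type] * news_text.count(term)
        for term, term_type in entry_terms.items()
    )


# Hypothetical related terms for two Wikipedia entries.
entries = {
    "Nintendo": {"Nintendo DS": "bold", "Wii": "noun_phrase"},
    "Game": {"Role-playing game": "noun_phrase"},
}
news = "Dragon Quest 9 for the Nintendo DS went on sale. Nintendo DS owners queued."

# Rank entries by descending score; the slides keep the top 10.
ranked = sorted(
    entries,
    key=lambda title: wiki_news_score(entries[title], news),
    reverse=True,
)
print(ranked)
```

With equal weights the ranking depends only on how often an entry's related terms occur in the article.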
29
Complementary Navigation from News to Blog: Our Approach
Wikipedia Entries as Conceptual Search Index
News Blog
Wikipedia
2. Search for Bloggers/Blog Posts closely related to the Index Wikipedia Entries
30
Search for Bloggers/Blog Posts closely related to the Index Wikipedia Entries
News Article: Dragon Quest 9 on sale
Related Wikipedia Entries (Top ranked 10 Entries): Nintendo, Dragon Quest series, Game, ...
Related Blog Posts: Blog Post, Blog Post, Blog Post, ...
Ranking related Blog Posts: 1st ranked, 2nd ranked, 3rd ranked, ...
31
Topic-related blog feed (blogger) retrieval [Kawaba et al., ICWSM 2008; Nakasaki et al., ICWSM 2009]
Blog feed: a group of blog posts written by the same blogger.
Yahoo! Japan Search API: usual search
Re-ranked list: re-ranked by the hits of the topic keyword that appear in each feed
Result: list of blog feeds which have high hits of the topic keyword.
32
Selecting blog posts by topic-related terms
Requirements
Blog Feed: Topic Keyword Frequency in the Blog Feed ≥ 10
Blog Post: At least one Topic Related Term included in the Blog Post
[Example blog post in Japanese; text not recoverable]
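The two requirements on this slide amount to a two-level filter, sketched below in Python. It assumes that the feed-level requirement means "at least 10" occurrences of the topic keyword across the feed, that substring matching is good enough, and that a feed is represented as a dictionary with a `posts` list; the function name `select_posts` and the sample data are hypothetical.

```python
MIN_FEED_KEYWORD_FREQ = 10  # assumed reading of the feed-level threshold


def select_posts(feeds, topic_keyword, related_terms):
    """Keep posts that contain a related term, from feeds with enough keyword hits."""
    selected = []
    for feed in feeds:
        # Feed-level requirement: topic keyword frequency over the whole feed.
        keyword_freq = sum(post.count(topic_keyword) for post in feed["posts"])
        if keyword_freq < MIN_FEED_KEYWORD_FREQ:
            continue
        # Post-level requirement: at least one topic-related term in the post.
        for post in feed["posts"]:
            if any(term in post for term in related_terms):
                selected.append(post)
    return selected


# Made-up feeds for illustration.
feeds = [
    {"blogger": "a", "posts": ["Nintendo " * 10 + "Wii", "Nintendo cooking"]},
    {"blogger": "b", "posts": ["Nintendo Wii"]},
]
result = select_posts(feeds, "Nintendo", ["Wii", "Pokémon"])
print(len(result))
```

Feed "b" is dropped despite containing a related term, because its blogger mentions the topic keyword too rarely.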
33
Ranking Blog Posts with Topic Related Terms
Related terms from Wikipedia entry "Nintendo": Super Mario Bros, Pokémon, Nintendo DS, Wii, Game, Sony, Role-playing game, Fighting game
collected blog posts
Ranking blog posts with WikiBlogScore
ranked blog posts
type(t): type of related term t
weight(type(t)): weight of the type type(t)
weight(type(t) = Redirect) = 3
weight(type(t) = Bold text) = 2
weight(type(t) = Anchor text) = 0.5
$$\mathrm{WikiBlogScore}(e, b) = \sum_{t} \bigl( \mathrm{weight}(\mathrm{type}(t)) \times \mathrm{freq}(t) \bigr)$$
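To see how the type weights on this slide affect a ranking, here is a small Python sketch. The weights 3, 2, and 0.5 come from the slide. Everything else is assumed for illustration: reading e as the index Wikipedia entry and b as a blog post, counting freq(t) by substring matches in the post, the type assigned to each example term, and the example posts themselves.

```python
# Weights per term type for blog post ranking, as given on the slide.
BLOG_WEIGHTS = {"redirect": 3, "bold": 2, "anchor": 0.5}


def wiki_blog_score(entry_terms, post_text):
    """Sum over terms t of weight(type(t)) * freq(t), with freq counted in post_text."""
    return sum(
        BLOG_WEIGHTS[term_type] * post_text.count(term)
        for term, term_type in entry_terms.items()
    )


# Hypothetical related terms for the index entry "Nintendo".
nintendo_terms = {
    "Nintendo Co., Ltd.": "redirect",
    "Nintendo DS": "bold",
    "Wii": "anchor",
    "Pokémon": "anchor",
}
posts = [
    "I bought a Wii and a Pokémon game.",
    "My Nintendo DS arrived; Nintendo DS games are fun.",
    "Nothing about games here.",
]
scores = [wiki_blog_score(nintendo_terms, post) for post in posts]
ranked = sorted(posts, key=lambda p: wiki_blog_score(nintendo_terms, p), reverse=True)
print(scores)
```

Because anchor-text terms carry only 0.5 each, a post that repeats one bold term outranks a post that mentions two different anchor terms.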
35
Evaluation: before/after Manual Selection of Relevant Wikipedia Entries
A News Article on "Kyoto Protocol"
Top 10 Ranked Wikipedia Entries (English glosses; Japanese titles not recoverable)
• Post-Kyoto Protocol Negotiation
• United Nations
• Protocol
• Carbon dioxide
• United States
• Debate
• Kyoto
• Greenhouse gas
• Minister
• Poland
36
Evaluation: before/after Manual Selection of Relevant Wikipedia Entries
A News Article on "Kyoto Protocol"
Manually Selected Relevant 3 Wikipedia Entries, chosen from the Top 10 Ranked Wikipedia Entries [the marking of the selected entries is not recoverable]
37
Evaluation Results: Precision of Top Ranked Blog Posts
[Bar chart: precision (%) of top ranked blog posts per news topic; topics include missile, Hillary Clinton, and Kyoto Protocol; remaining topic labels and bar values not recoverable]
Legend:
• Ranking Blog Posts with All of the Top 10 Wikipedia Entries, including General Terms which damage the Blog Post Ranking
• Ranking Blog Posts after Manually Selecting Relevant Specific Terms, excluding General Terms which damage the Blog Post Ranking
38
Conclusion and Future Work
• Purpose of the Work: Linking Topics of News and Blogs with Wikipedia
• Details of the Proposed Method
– Ranking related Wikipedia Entries given a News Article
– Ranking related Bloggers/Blog Posts given an Index Wikipedia Entry
• Future Work:
– Automatic Selection of Related Wikipedia Entries
– Implementing Complementary Navigation System
– Evaluation by Real Users