![Page 1: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/1.jpg)
Link Analysis on the Web
An Example: Broad-topic Queries
Xin Xin
![Page 2: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/2.jpg)
Problem
• Specific queries: “Does Netscape support the JDK 1.1 code-signing API?”
• Broad-topic queries: “Find information about the Java programming language.”
• Authority is important in broad-topic queries
Web query: “java”
1. http://java.sun.com
2. http://sunsite.unc.edu/javafaq/javafaq.html
3. …
![Page 3: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/3.jpg)
Why use link analysis instead of content information?
Query: Harvard
• Harvard homepage: “Harvard” occurs 4 times
• Other page introducing Harvard: “Harvard” occurs 8 times
Query: Search engines
• Yahoo! homepage: “Search engines” occurs 0 times
• Other page introducing search engines: “Search engines” occurs 4 times
![Page 4: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/4.jpg)
Graph Presentation
G=(V,E)
V: pages
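E: links (each edge is an out-link of its source page and an in-link of its target page)
Adjacency matrix: a 4×4 matrix over p1 to p4 with a 1 for each link
[Figure: directed graph on pages 1 to 4 and its adjacency matrix, which contains five 1 entries]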
Given a query, how can we find the most authoritative pages from this link information?
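The graph representation above can be sketched in Python. The edge list is an assumption: the figure did not survive extraction, so the five links are inferred from the hub and authority sums in Simple Example 2 (assuming no page links to itself). The names `EDGES`, `PAGES`, and `adjacency_matrix` are illustrative only.

```python
# Assumed edge list (source page, target page); inferred, not read off the slide figure.
EDGES = [(1, 2), (2, 1), (2, 4), (3, 1), (4, 1)]
PAGES = [1, 2, 3, 4]


def adjacency_matrix(pages, edges):
    """Return A with A[i][j] = 1 when pages[i] links to pages[j], else 0."""
    index = {page: i for i, page in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in pages]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


A = adjacency_matrix(PAGES, EDGES)

# Row i holds the out-links of page i; column j holds the in-links of page j.
out_links_of_2 = [PAGES[j] for j, bit in enumerate(A[1]) if bit]
in_links_of_1 = [PAGES[i] for i, row in enumerate(A) if row[0]]

for page, row in zip(PAGES, A):
    print(f"p{page}", row)
```

Under this row-to-column convention, reading a row answers "where does this page point?" and reading a column answers "who points at this page?".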
![Page 5: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/5.jpg)
Overview
Web
Query: “java”
1. http://java.sun.com
2. http://sunsite.unc.edu/javafaq/javafaq.html
3. …
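[Figure: the query is run against the Web; a four-page sub-graph (pages 1 to 4) is processed in two numbered steps]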
1. Sub-graph construction
2. Hubs and authorities computation
![Page 6: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/6.jpg)
Step 1: Sub-graph Construction
• Challenge: build a sub-graph with three properties
– Small in size
– Rich in relevant pages
– Contains most of the strongest authorities
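One way to aim for these three properties, following Kleinberg's hubs-and-authorities paper rather than anything stated on this slide, is to start from a small root set of text-search hits and grow it by one link in each direction. This is a minimal sketch under that assumption; `build_subgraph`, the cap `max_in_links`, and the toy link data are all hypothetical.

```python
def build_subgraph(root_set, out_links, in_links, max_in_links=50):
    """Expand a root set by one link step in both directions.

    root_set:     pages returned by a text search for the query
    out_links:    dict mapping a page to the pages it links to
    in_links:     dict mapping a page to the pages that link to it
    max_in_links: cap on in-linking pages added per root page,
                  which keeps the sub-graph small
    """
    base = set(root_set)
    for page in root_set:
        base.update(out_links.get(page, []))
        base.update(in_links.get(page, [])[:max_in_links])
    # Keep only the links whose endpoints both lie in the base set.
    edges = {
        (source, target)
        for source in base
        for target in out_links.get(source, [])
        if target in base
    }
    return base, edges


# Toy link data with hypothetical page names.
out_links = {"a": ["b"], "c": ["a", "b"], "b": ["d"], "e": ["f"]}
in_links = {"b": ["a", "c"], "a": ["c"], "d": ["b"], "f": ["e"]}

pages, edges = build_subgraph(["a"], out_links, in_links)
print(sorted(pages))
print(sorted(edges))
```

The root set keeps the graph relevant, the one-step expansion pulls in authorities that do not contain the query text, and the cap bounds the size.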
![Page 7: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/7.jpg)
Step 2: Hubs and Authorities
• Basic idea: rank pages by in-degree
• Problem:
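The in-degree idea can be sketched as a simple count of in-links inside the query sub-graph. The edge list is an assumption (inferred from the sums in Simple Example 2), and the tie-break by page number is an arbitrary choice for the illustration.

```python
from collections import Counter

# Assumed links (source, target) inside the query sub-graph.
EDGES = [(1, 2), (2, 1), (2, 4), (3, 1), (4, 1)]
PAGES = [1, 2, 3, 4]

# in_degree[p] = number of pages that link to p (0 for pages never linked to).
in_degree = Counter(target for _, target in EDGES)

# Highest in-degree first; ties broken by page number.
ranking = sorted(PAGES, key=lambda page: (-in_degree[page], page))
print([(page, in_degree[page]) for page in ranking])
```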
![Page 8: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/8.jpg)
Step 2: Hubs and Authorities
![Page 9: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/9.jpg)
Step 2: Hubs and Authorities
An Iterative Algorithm:
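The algorithm on this slide did not survive extraction. As a stand-in, here is a minimal sketch of the standard hubs-and-authorities iteration (Kleinberg's HITS): a page's authority score is the sum of the hub scores of the pages linking to it, its hub score is the sum of the authority scores of the pages it links to, and both vectors are rescaled after every round. The simultaneous update, the Euclidean normalization, the fixed iteration count, and the edge list are assumptions chosen to line up with the simple examples that follow.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    length = math.sqrt(sum(value * value for value in scores.values()))
    if length == 0:
        return scores
    return {page: value / length for page, value in scores.items()}


def hits(pages, edges, iterations=50):
    """Return (hub, authority) dicts after a fixed number of iterations."""
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_hub = {page: 0.0 for page in pages}
        new_auth = {page: 0.0 for page in pages}
        for source, target in edges:
            # Authority: add the hub score of each page linking to the target.
            new_auth[target] += hub[source]
            # Hub: add the authority score of each page the source links to.
            new_hub[source] += auth[target]
        hub, auth = normalize(new_hub), normalize(new_auth)
    return hub, auth


# Assumed example graph: links (source, target) among pages 1 to 4.
EDGES = [(1, 2), (2, 1), (2, 4), (3, 1), (4, 1)]
hub, auth = hits([1, 2, 3, 4], EDGES)
print("top authority:", max(auth, key=auth.get))
print("top hub:", max(hub, key=hub.get))
```

On this graph page 1, which three pages link to, ends up as the top authority, and page 2, which links to two pages, ends up as the top hub.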
![Page 10: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/10.jpg)
Simple Example 1
(x, y): x = hub score, y = authority score
Starting scores for pages 1 to 4:
1: (1/4, 1/4)
2: (1/4, 1/4)
3: (1/4, 1/4)
4: (1/4, 1/4)
![Page 11: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/11.jpg)
Simple Example 2
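Starting scores (hub, authority) for pages 1 to 4: (1/4, 1/4) each
Hub (sum of the authority scores of the pages linked to):
1: 1/4
2: 1/4+1/4
3: 1/4
4: 1/4
Authority (sum of the hub scores of the pages linking in):
1: 1/4+1/4+1/4
2: 1/4
3: 0
4: 1/4
Values shown on the graph after the update:
Hub: 1: 1/7, 2: 4/7, 3: 1/7, 4: 1/7
Authority: 1: 9/11, 2: 1/11, 3: 0, 4: 1/11
These fractions equal the squared scores after each vector is scaled to unit Euclidean length, e.g. hub (1, 2, 1, 1)/√7 and authority (3, 1, 0, 1)/√11.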
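The arithmetic in this example can be checked with exact fractions. This is a sketch of a single update round, assuming the five links implied by the sums above (1→2, 2→1, 2→4, 3→1, 4→1) and assuming the displayed fractions are squared unit-length scores; `squared_unit_scores` is a helper name made up for the illustration.

```python
from fractions import Fraction

# Assumed links (source, target), inferred from the hub/authority sums.
EDGES = [(1, 2), (2, 1), (2, 4), (3, 1), (4, 1)]
PAGES = [1, 2, 3, 4]

# Every page starts with hub = authority = 1/4.
hub = {page: Fraction(1, 4) for page in PAGES}
auth = {page: Fraction(1, 4) for page in PAGES}

# One simultaneous update using the starting scores.
new_hub = {page: Fraction(0) for page in PAGES}
new_auth = {page: Fraction(0) for page in PAGES}
for source, target in EDGES:
    new_hub[source] += auth[target]
    new_auth[target] += hub[source]


def squared_unit_scores(scores):
    """Square each score and divide by the sum of squares."""
    total = sum(value * value for value in scores.values())
    return {page: value * value / total for page, value in scores.items()}


hub_sq = squared_unit_scores(new_hub)
auth_sq = squared_unit_scores(new_auth)
print({page: str(value) for page, value in hub_sq.items()})
print({page: str(value) for page, value in auth_sq.items()})
```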
![Page 12: Link Analysis on the Web An Example: Broad-topic Queries Xin](https://reader036.vdocuments.us/reader036/viewer/2022081512/56649edc5503460f94becb6f/html5/thumbnails/12.jpg)
PageRank