
Page 1: Web Markov Skeleton Processes and their Applications Zhi-Ming Ma 18 April, 2011, BNU. Email: mazm@amt.ac.cn

Web Markov Skeleton Processes and their Applications

Zhi-Ming Ma 18 April, 2011, BNU.

Email: mazm@amt.ac.cn http://www.amt.ac.cn/member/mazhiming/index.html


• Y. Liu, Z. M. Ma, C. Zhou: Web Markov Skeleton Processes and Their Applications, to appear in Tohoku Math. J.

• Y. Liu, Z. M. Ma, C. Zhou: Further Study on Web Markov Skeleton Processes


Web Markov Skeleton Process

Markov Chain

conditionally independent given


Define by :

WMSP


Simple WMSP:

Many simple WMSPs are Non-Markov Processes


[LMZ2011a,b]


Mirror Semi-Markov Process

A Mirror Semi-Markov Process is not a Markov Skeleton Process in the sense of Hou, i.e. it does not satisfy


Time Homogeneous WMSP


right continuous, piecewise constant functions


Stability of Time homogeneous WMSP

Theorem [LMZ 2011a,b]

for all


WMSP

Multivariate Point Process associated with WMSP


Let


Consequently where

Define

We can prove that


where


Why is it called a Web Markov Skeleton Process?


How can Google rank 1,950,000 pages in 0.19 seconds?


Web page Ranking

Importance Ranking

Relevance Ranking


HITS: 1998, Jon Kleinberg, Cornell University

PageRank: 1998, Sergey Brin and Larry Page, Stanford University

The first major improvement in the history of Web search engines

科学时报.pdf


Ranking Web pages by the mean frequency of visiting pages

From a probabilistic point of view, PageRank is the stationary distribution of a Markov chain.

PageRank: a ranking algorithm used by the Google search engine.

1998, Sergey Brin and Larry Page, Stanford University


Markov chain describing surfing behavior



Web surfers usually have two basic ways to access web pages:

1. with probability α, they visit a web page by clicking a hyperlink.

2. with probability 1-α, they visit a web page by inputting its URL address.
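
The two access modes above can be combined into one transition matrix and ranked by power iteration. The following is a minimal sketch under stated assumptions, not the speaker's code: the function name `pagerank`, the 3-page link graph, the uniform URL jump, and the uniform treatment of pages without out-links are all illustrative, and α = 0.85 is a commonly quoted value rather than one given on the slide.

```python
import numpy as np

def pagerank(adj, alpha=0.85, tol=1e-12, max_iter=1000):
    """Stationary distribution of the random-surfer chain (illustrative sketch).

    adj[i, j] = 1 if page i links to page j (hypothetical input format).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Hyperlink step: follow one of the out-links uniformly at random.
    # Assumption: a page with no out-links jumps to a uniformly chosen page.
    link = np.where(out_deg[:, None] > 0,
                    adj / np.maximum(out_deg[:, None], 1.0),
                    1.0 / n)
    # With probability alpha click a hyperlink; with probability 1 - alpha
    # type a URL (assumed here to be a uniformly chosen page).
    P = alpha * link + (1.0 - alpha) / n
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 0, 2 -> 1
adj = [[0, 1, 1],
       [1, 0, 0],
       [0, 1, 0]]
ranks = pagerank(adj)
```

In this toy graph page 1 receives links from both other pages, so it ends up with the largest score.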


where


More generally, we may consider a personalized distribution:

PageRank is defined as the stationary distribution:

By the strong ergodic theorem: mean frequency of visiting pages
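
For reference, the two statements above can be written out in standard textbook form. The notation is an assumption, since the slide's own formulas did not survive in the transcript: $P$ is the transition matrix of the surfing chain $(X_n)$ on $N$ pages, assumed irreducible, and $\pi$ is a row vector.

```latex
% Stationary distribution (PageRank), assumed notation
\pi P = \pi, \qquad \sum_{i=1}^{N} \pi_i = 1 .

% Strong ergodic theorem: the long-run visiting frequency of page i equals \pi_i
\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{\{X_k = i\}} = \pi_i
\quad \text{almost surely.}
```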


Weak points of PageRank

• Uses only the static web graph structure.

• Reflects only the will of web managers and ignores the will of users, e.g. the staying time of users on a page.

• Cannot effectively combat spam and junk pages.

BrowseRankSIGIR.ppt


Data Mining


Browsing Process

• Markov property

• Time-homogeneity


Computation of the Stationary Distribution

– Stationary distribution:

– The mean staying time on page i: the more important a page is, the longer users stay on it.

– The mean first re-visit time at page i: the more important a page is, the shorter the re-visit time and the higher the visit frequency.
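
These two quantities combine into the standard formula for the stationary distribution of an ergodic continuous-time Markov chain. The symbols below are assumed notation, not recovered from the slide: $q_i$ is the rate of leaving page $i$ (so $1/q_i$ is the mean staying time), $J_1$ is the first jump time, and $m_i$ is the mean first re-visit time measured from arrival at $i$, staying time included.

```latex
\pi_i = \frac{1/q_i}{m_i},
\qquad
m_i = \mathbb{E}_i\left[ T_i \right],
\qquad
T_i = \inf\{\, t \ge J_1 : X_t = i \,\}.
```

A long mean staying time in the numerator and a short re-visit time in the denominator both increase $\pi_i$, which matches the two bullet points.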

P(t)


• Properties of Q process:

– Jumping probability is conditionally independent of jumping time

– Embedded Markov chain: the sequence of states visited at the jump times is a Markov chain with its own transition probability matrix
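
As an illustration of the embedded chain, the sketch below derives its transition matrix from a rate matrix using the standard relation p_ij = q_ij / q_i for i ≠ j. The function name `embedded_chain` and the 3-state matrix `Q` are hypothetical, and the sketch assumes every state has a positive exit rate.

```python
import numpy as np

def embedded_chain(Q):
    """Jump-chain transition matrix of a Q-process (illustrative sketch).

    Q is a conservative rate matrix: off-diagonal entries >= 0, rows sum to 0.
    Assumption: every state has a positive exit rate q_i = -Q[i, i].
    """
    Q = np.asarray(Q, dtype=float)
    rates = -np.diag(Q)           # q_i: rate of leaving state i
    P = Q / rates[:, None]        # off-diagonal entries become q_ij / q_i
    np.fill_diagonal(P, 0.0)      # the jump chain always moves to another state
    return P, 1.0 / rates         # transition matrix, mean staying times

# Hypothetical 3-state rate matrix
Q = [[-2.0, 1.0, 1.0],
     [3.0, -3.0, 0.0],
     [0.5, 0.5, -1.0]]
P_tilde, mean_stay = embedded_chain(Q)
```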


– is the stationary distribution of

– The stationary distribution of the discrete model is easy to compute:

• Power method for

• Log data for
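
One way to realize this two-part computation is sketched below. It assumes the power method is applied to the embedded (discrete) chain and that the log data supply the staying times, then uses the standard relation that the continuous-time stationary probability of page i is proportional to the embedded-chain probability times the mean staying time. The function name `browse_rank`, the session format, and the off-diagonal pseudo-count (a simple stand-in for whatever smoothing a production system would use) are hypothetical.

```python
import numpy as np

def browse_rank(n_pages, sessions, pseudo=0.1, tol=1e-12, max_iter=10_000):
    """Page scores from browsing logs (illustrative sketch, not the published system).

    sessions: list of sessions, each a list of (page, staying_time) pairs
    in visiting order (hypothetical log format).
    """
    counts = np.zeros((n_pages, n_pages))
    stay_sum = np.zeros(n_pages)
    visits = np.zeros(n_pages)
    for session in sessions:
        for k, (page, stay) in enumerate(session):
            stay_sum[page] += stay
            visits[page] += 1
            if k + 1 < len(session):
                counts[page, session[k + 1][0]] += 1
    # Assumption: a small off-diagonal pseudo-count keeps the chain irreducible.
    counts += pseudo * (1.0 - np.eye(n_pages))
    P = counts / counts.sum(axis=1, keepdims=True)
    # Power method for the stationary distribution of the embedded chain.
    pi = np.full(n_pages, 1.0 / n_pages)
    for _ in range(max_iter):
        nxt = pi @ P
        done = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if done:
            break
    # Log data give the mean staying time per page (0 for unvisited pages).
    mean_stay = stay_sum / np.maximum(visits, 1.0)
    score = pi * mean_stay
    return score / score.sum()

# Hypothetical logs: two sessions over three pages
sessions = [[(0, 10.0), (1, 2.0), (0, 8.0)],
            [(2, 1.0), (0, 12.0), (1, 3.0)]]
scores = browse_rank(3, sessions)
```

Page 0 is visited most often and held longest in these toy logs, so it receives the highest score.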


BrowseRank: Letting Web Users Vote for Page Importance

Yuting Liu, Bin Gao, Tie-Yan Liu, Ying Zhang, Zhiming Ma, Shuyuan He, and Hang Li

The 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Singapore, July 23, 2008.

Best student paper!


• BrowseRank, the next PageRank, says Microsoft

•jerbrowser.wmv


• Browsing processes will be a basic mathematical tool in Internet information retrieval

Beyond:

-- General framework of browsing processes?

-- How about inhomogeneous processes?

-- Marked point processes

-- Mobile Web: not really Markovian


ExtBrowseRank and semi-Markov processes


[10] B. Gao, T. Liu, Z. M. Ma, T. Wang, and H. Li: A General Markov Framework for Page Importance Computation, in Proceedings of CIKM 2009.

[11] B. Gao, T. Liu, Y. Liu, T. Wang, Z. M. Ma, and H. Li: Page Importance Computation Based on Markov Processes, to appear in Information Retrieval; online first: http://www.springerlink.com/content/7mr7526x21671131

Web Markov Skeleton Process


Thank you !


The statistical properties of a time-homogeneous mirror semi-Markov process are completely determined by:


Reconstruction of Mirror Semi-Markov Processes

We can construct

such that

Given: ,

,

Theorem [LMZ 2011b]


uniformly


Write

is expressed as

[LMZ2011b]
