
Gradual Adaption Model for Estimation of User Information Access Behavior

J. Chen, R.Y. Shtykh and Q. Jin
Graduate School of Human Sciences, Waseda University, Japan

April 19, 2023 Waseda University 2

Background

• Why do we need information?
  – In leisure: to find a route or a map for a trip
  – In work: to find business information
  – In learning: to find academic papers

• Where do we get information?
  – From traditional media such as books and magazines
  – From the Internet

• How do we get information?
  – By searching in a bookstore, library, etc.
  – By searching the Internet

• What do we face when searching for information?
  – Too many search results, many of them irrelevant


Information Recommendation

• Information recommendation
  – Web mining approaches
    • Usage mining
    • Structure mining
    • Content mining
    • Semantic mining
  – Web mining data
    • Content data: text and multimedia provided by web sites.
    • Structure data: the organization inside a web page, internal and external links, and the web site hierarchy.
    • Usage data: access log data of web sites.
    • User profile: information about users.
    • Semantic data: data describing the structure and definitions of semantic web sites.


Study Approach

• Proposing a gradual adaption model for estimation of user information access behavior

• Analyzing a variety of users' information access data over short, medium, and long periods, and by remarkable and exceptional categories, based on Full Bayesian Estimation

• Conducting an experimental simulation to show the operability and effectiveness of the proposed model


Related Works

• WUM (Web Usage Mining)
  – Based on users' implicit feedback
• A new document representation model (Poblete and Baeza-Yates, WWW2008)
  – Experimented on a web site with a small vocabulary and specific to certain topics.
• Identifying relevant web sites from user activities (Bilenko and White, WWW2008)
  – Needs more time to train the system
  – Personalizes information recommendation
• Dynamic Link Generation (Yan, et al., WWW1996)
  – Consists of off-line and on-line modules
• SUGGEST 3.0 (Baraglia and Silvestri, WT2004)
  – For large web sites; has only an on-line module. However, the logs used to evaluate the system were small and limited.
• LinkSelector (Fang and Sheng, ACM2004)
  – Uses the hyperlink structure and its access logs.


Definitions of Keyword, Link and Concept

• Keyword
  – Keywords in web pages
• Link
  – Links of web pages
• Concept
  – Consists of a number of keywords and links

[Figure: example concepts Philosophy, Literature, and Art, built from the keywords nature, painting, Aristotle, Andersen, cartoon, culture, and Leonardo and from the links Link a and Link b.]
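The definition of a concept as a set of keywords plus a set of links can be written down as a small data structure. The sketch below is illustrative only: the class name `Concept`, the `matches` helper, and the assignment of particular keywords and links to the concept "Art" are assumptions, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept groups a number of keywords and links (hypothetical layout)."""
    name: str
    keywords: set = field(default_factory=set)
    links: set = field(default_factory=set)

    def matches(self, page_keywords, page_links):
        """Return True if a page shares at least one keyword or link with the concept."""
        return bool(self.keywords & set(page_keywords)
                    or self.links & set(page_links))


# Assumed grouping for illustration; the slide does not spell out the mapping.
art = Concept("Art", keywords={"painting", "Leonardo"}, links={"Link a"})

print(art.matches(["painting", "culture"], []))   # shares the keyword "painting"
print(art.matches(["Aristotle"], ["Link b"]))     # shares nothing with "Art"
```

A page is treated here as matching a concept when it shares any keyword or link with it; a real concept analyser would likely weight the overlap rather than use a yes/no test.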


Full Bayesian Estimation

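• Full Bayesian Estimation
• Ð is the data collection of a concept
• d_t is the current number of times the concept was clicked
• d_f is the current number of times the concept was not clicked
• α_t is the historical number of times the concept was clicked
• α_f is the historical number of times the concept was not clicked

With θ denoting the probability that the concept is clicked, the probability that the next access (m+1) is a click is

$$
P(D_{m+1} = t \mid Ð) = \int_0^1 \theta \, p(\theta \mid Ð) \, d\theta
$$

$$
= \int_0^1 \theta \, \frac{\Gamma(\alpha_t + \alpha_f + d_t + d_f)}{\Gamma(\alpha_t + d_t)\,\Gamma(\alpha_f + d_f)} \, \theta^{\alpha_t + d_t - 1} (1 - \theta)^{\alpha_f + d_f - 1} \, d\theta
$$

$$
= \frac{\alpha_t + d_t}{\alpha_t + \alpha_f + d_t + d_f}
$$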
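Read as a Beta-Bernoulli model, the estimate on this slide reduces to a ratio of counts: historical plus current clicks over all historical plus current observations. The following is a minimal sketch under that reading; the function name and the example counts are hypothetical.

```python
def predictive_click_probability(alpha_t, alpha_f, d_t, d_f):
    """Probability that the next access to a concept is a click.

    alpha_t, alpha_f: historical click / non-click counts (must be positive).
    d_t, d_f: current click / non-click counts (must be non-negative).
    """
    if alpha_t <= 0 or alpha_f <= 0:
        raise ValueError("historical counts must be positive")
    if d_t < 0 or d_f < 0:
        raise ValueError("current counts must be non-negative")
    return (alpha_t + d_t) / (alpha_t + alpha_f + d_t + d_f)


# Hypothetical counts: 3 historical clicks and 7 historical non-clicks,
# followed by 4 clicks and 1 non-click in the current period.
p = predictive_click_probability(3, 7, 4, 1)
print(round(p, 3))  # 0.467, i.e. 7 / 15
```

With no current data (d_t = d_f = 0) the estimate falls back to the historical ratio α_t / (α_t + α_f), which is how the history acts as a prior.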


Gradual Adaption Model

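[Figure: architecture of the Gradual Adaption Model, split into off-line and on-line parts. Components: Web Documents, Concept KB, Access Logs, Concept Analyser, Probability Estimator, Estimation Base, and Gradual Adaption Recommender (Short, Medium, Long, Remarkable / Exceptional). Flow labels: Input, Matching, Search Query, Search, Click.]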


Gradual Adaption Model

• We divide users’ interests into three periods (short, medium, and long) and into remarkable and exceptional categories.

• The model is adaptive.
  – It can adapt to transitions in users’ information access behavior.

• The model needs no training, since it uses Full Bayesian Estimation, which has a learning function.
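One way to picture the three periods is to apply the same count-based estimate to access-log windows of different lengths. The sketch below assumes windows of 7, 30, and 90 days (the example lengths given on the simulation slides), one observation per active day, and a uniform prior of one pseudo-click and one pseudo-non-click. How the slides actually map counts onto periods is not stated, so all of this, including the function and parameter names, is an assumption.

```python
from datetime import date, timedelta

# Window lengths in days; the slides give these only as examples.
PERIODS = {"short": 7, "medium": 30, "long": 90}


def period_estimates(click_days, active_days, today, alpha_t=1, alpha_f=1):
    """Estimate the click probability of one concept for each period.

    click_days: set of dates on which the concept was clicked.
    active_days: set of dates on which the user accessed anything.
    alpha_t, alpha_f: assumed prior pseudo-counts (uniform prior by default).
    """
    estimates = {}
    for name, length in PERIODS.items():
        start = today - timedelta(days=length - 1)
        window = {d for d in active_days if start <= d <= today}
        d_t = len(window & click_days)   # days in the window with a click
        d_f = len(window) - d_t          # active days without a click
        estimates[name] = (alpha_t + d_t) / (alpha_t + alpha_f + d_t + d_f)
    return estimates


# Hypothetical user: active every day for 90 days, but interested in the
# concept only during the most recent week.
today = date(2024, 3, 30)
active = {today - timedelta(days=i) for i in range(90)}
clicks = {today - timedelta(days=i) for i in range(7)}
print(period_estimates(clicks, active, today))
```

For this user the short-period estimate is high while the medium and long estimates stay low, which is the kind of contrast the three periods are meant to expose. No training pass is involved: each estimate is recomputed directly from the counts.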


Simulation and Evaluation

• Environment
  – Java, Tomcat, MySQL, and NekoHTML
• Data
  – Wikipedia on DVD Version 0.5
    • more than 2000 web pages that belong to more than 180 concepts


Simulation and Evaluation

• Short period (such as 7 days / 1 week)
  – Test case
    • The user has two interests, and these interests are easily affected by some factors.
    • The expectation is that the probability of the related concept may change greatly in the short or medium period, but not in the long period.
    • Two concepts, “Art” and “Artists”, are assumed to be used, and the number of clicks varies dynamically.
  – Test result
    • The probability of the concepts changes frequently.
    • On some days, the probability of a concept in the short period is higher than in the long period.

[Figure: short period. Probability (axis from 0 to 1.2) by date for the concepts Art, Artists, Philosophers, and Philosophical thought movements.]


Simulation and Evaluation

• Medium period (such as 30 days / 1 month)
  – Test case
    • The user has a temporary interest.
    • The user accesses the concept of temporary interest only occasionally.
    • The expectation is that this concept keeps a low probability in all three periods.
    • One concept, “Philosophers”, is assumed to be used once every three days.
  – Test result
    • The change becomes smaller.
    • However, on some days the probability of a concept in the short period is higher than in the medium period.

[Figure: medium period. Probability (axis from 0 to 0.9) by date for the concepts Art, Artists, Philosophers, and Philosophical thought movements.]


Simulation and Evaluation

• Long period (such as 90 days / 3 months)
  – Test case
    • The user has a long-term interest.
    • The expectation is that the probability of the concept of interest stays high in the long period.
    • One concept, “Philosophical thought movements”, is assumed to be used every day.
  – Test result
    • The change becomes quite stable.
    • There is no big change in the long period.

[Figure: long period. Probability (axis from 0 to 0.8) by date for the concepts Art, Artists, Philosophers, and Philosophical thought movements.]
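The click schedules of the test cases can be replayed in a few lines to see why a longer period gives a steadier curve. This is a rough sketch, not the authors' simulation: it assumes one observation per day, a rolling window of 7 or 90 days, a uniform prior of one pseudo-click and one pseudo-non-click, and a 90-day run. Only the two schedules (“Philosophers” every three days, “Philosophical thought movements” every day) come from the slides.

```python
def rolling_estimates(clicks, window, alpha_t=1, alpha_f=1):
    """Day-by-day click probability using only the last `window` days.

    clicks: list with 1 for a day on which the concept was clicked, else 0.
    alpha_t, alpha_f: assumed prior pseudo-counts.
    """
    series = []
    for day in range(len(clicks)):
        recent = clicks[max(0, day - window + 1): day + 1]
        d_t = sum(recent)
        d_f = len(recent) - d_t
        series.append((alpha_t + d_t) / (alpha_t + alpha_f + d_t + d_f))
    return series


def spread(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


DAYS = 90  # assumed length of the run
philosophers = [1 if day % 3 == 0 else 0 for day in range(DAYS)]  # every three days
movements = [1] * DAYS                                            # every day

for name, clicks in [("Philosophers", philosophers),
                     ("Philosophical thought movements", movements)]:
    short = rolling_estimates(clicks, 7)
    long_ = rolling_estimates(clicks, 90)
    print(name,
          "short spread:", round(spread(short[-30:]), 3),
          "long spread:", round(spread(long_[-30:]), 3))
```

Under these assumptions the occasional interest stays near one third in both windows, but the 7-day estimate keeps oscillating while the 90-day estimate barely moves over the last month of the run. The daily interest ends high in both windows.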


Conclusion

• In this study, we have proposed a gradual adaption model (GAM) for estimation of user information access behavior.

• The three periods of GAM can correctly distinguish users' long-term and temporary interests, even without system training.


Future Works

• To try further settings for the short, medium, and long periods in order to find more reasonable ones.

• To evaluate the proposed model with users' involvement.

• To compare our proposed approach with other related recommendation models.


Thank you for your attention.