Location-Based Topic Evolution Haiqin Yang, Shouyuan Chen, Michael R. Lyu, Irwin King The Chinese University of Hong Kong 1


Post on 13-Jan-2016


Location-Based Topic Evolution

Haiqin Yang, Shouyuan Chen, Michael R. Lyu, Irwin King

The Chinese University of Hong Kong


Outline

Motivation
Location-Based Topic Evolution Model
Experiments
Conclusion


New Mobile Technologies

Location information is attainable: IP, GPS, 3G, Wi-Fi, NFC


Geo-information

Twitter
  Typhoon trajectory estimation
  Earthquake location [Sakaki et al., WWW’10]
Flickr
  Geo-tagged photos [Crandall et al., WWW’09]
  Geofolk [Sizov, WSDM’10]


New Applications-Timeliness

Identify users’ interests in a region


New Applications-Commercial Value

Determine appropriate marketing strategy


Solution-Topics Learning

Topics: distributions over words
Location-associated documents
  Geo-information attached to messages, posts, and tags
  Helps to learn the topics more accurately


Current Problems

Do not consider appearance and disappearance of topics
Do not model topic evolution
Have to determine the number of topics

Location-aware Topic Model [Wang et al., GIR’07]
Geofolk [Sizov, WSDM’10]
Geographical topic discovery [Yin et al., WWW’11]


Our Contributions

Propose a location-based topic evolution (LBTE) model
  Model topic changes of users’ interests in a region
  Allow for appearance and disappearance of topics
  Automatically determine the number of topics
Efficient inference


Problem Setup

Vocabulary:
Data:

Objective: model the topics of the data when both the number of topics and their parameters are unknown.


Assumptions

Documents are generated from unknown topics
Each topic comes from a hidden function and is determined by the function value
Functions are drawn from a probability measure


Evolution with Regions

Domains of functions include regions
Values of functions represent topics


Evolution with Regions and Time

The beginning (end) of a function's domain corresponds to the appearance (disappearance) of a topic
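
One way to picture this idea is a small data structure, sketched here in Python. It is only an illustration: the class name TopicFunction, the string region identifiers, the integer time stamps, and the choice of a single constant value per function are all assumptions made for the example, not details given on the slides.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

Region = str                 # hypothetical region identifier
Key = Tuple[Region, int]     # (region, time stamp)


@dataclass
class TopicFunction:
    """A hidden function whose domain is a set of (region, time) pairs.

    The value of the function on its domain identifies one topic.
    """
    topic_value: float
    domain: Set[Key]

    def covers(self, region: Region, t: int) -> bool:
        # The topic is active in `region` at time `t` only inside the domain.
        return (region, t) in self.domain

    def appearance(self, region: Region) -> Optional[int]:
        # Beginning of the domain in this region: the topic appears.
        times = [t for (r, t) in self.domain if r == region]
        return min(times) if times else None

    def disappearance(self, region: Region) -> Optional[int]:
        # End of the domain in this region: the topic disappears afterwards.
        times = [t for (r, t) in self.domain if r == region]
        return max(times) if times else None


f = TopicFunction(topic_value=1.5, domain={("A", 1), ("A", 2), ("B", 2)})
```

In this toy instance the topic appears in region "A" at time 1, is present in both regions at time 2, and never reaches any other region.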


Generative Process

The generative process of observations:

1. Generate functions (DP)
2. Characterize topics
3. Generate documents

[Equations for steps 1 to 3 not recoverable from the transcript. Surviving annotations: "~ DP" is a Dirichlet process at a location; documents are drawn from a parameterized probability distribution.]
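
For readers unfamiliar with Dirichlet processes, the sketch below shows the Chinese restaurant process, a standard way of sampling cluster assignments from a plain DP. It is not the LBTE generative process, which additionally ties functions to locations; the function name crp_assignments and the concentration parameter alpha are choices made for this illustration.

```python
import random
from typing import List


def crp_assignments(n: int, alpha: float, seed: int = 0) -> List[int]:
    """Sample cluster labels for n items from a Chinese restaurant process.

    alpha must be positive; larger values create new clusters more often.
    """
    rng = random.Random(seed)
    counts: List[int] = []   # counts[k] = number of items already in cluster k
    labels: List[int] = []
    for _ in range(n):
        # Existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[k] += 1        # join an existing cluster
        labels.append(k)
    return labels
```

The number of distinct labels is not fixed in advance, which is the property that lets DP-based models avoid choosing the number of topics by hand.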


Inference-Gibbs Sampler

1. Sample auxiliary variables: determine whether the domain of each function contains the region (a Bernoulli draw)

2. Sample assignment: calculate the probability of assigning a document to each existing function and the probability of assigning it to a new function

3. Draw topic parameters
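
The assignment step can be sketched for a generic Dirichlet process mixture as follows. This is a simplified stand-in, not the LBTE sampler: it assumes one-dimensional Gaussian topics with known variance sigma2 and a Gaussian prior N(mu0, tau2) on topic means, and it omits the auxiliary region variables of step 1. All names and default values are hypothetical.

```python
import math
from typing import List


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def assignment_probs(x: float, counts: List[int], means: List[float],
                     alpha: float, sigma2: float = 1.0,
                     mu0: float = 0.0, tau2: float = 10.0) -> List[float]:
    """Probabilities of assigning observation x to each existing component
    (first len(counts) entries) or to a new component (last entry)."""
    # Existing component k: weight = size of k * likelihood of x under k.
    weights = [n * normal_pdf(x, m, sigma2) for n, m in zip(counts, means)]
    # New component: weight = alpha * prior predictive density of x,
    # which is N(mu0, sigma2 + tau2) for this conjugate Gaussian setup.
    weights.append(alpha * normal_pdf(x, mu0, sigma2 + tau2))
    total = sum(weights)
    return [w / total for w in weights]
```

A Gibbs sweep would draw the new assignment from these probabilities for each document in turn, then redraw the component parameters as in step 3.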


Experiments

Datasets
  Synthetic data
  Flickr data

Comparison methods
  Dirichlet Process Mixture (DPM)
  Location-Based Topic Evolution (LBTE)


Synthetic Data

Topics Generation
  Topics initialization: two topics
    Center:
    Parameter:
  Topics evolution
    Die-off rate: 40%
    The number of new topics follows a Poisson distribution with parameter 0.8

Location-associated Documents Generation
  10 documents for each topic
  The location of each document follows the uniform distribution at the center of the topic with radius 5
  Values of topics follow
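
The stated settings (40% die-off, Poisson(0.8) births, 10 documents per topic, radius 5) can be turned into a rough simulation sketch. Several details are assumptions because the slide does not give them: "uniform at the center with radius 5" is read as uniform over a disc, new topic centers are placed uniformly in a square of side extent, and the topic values are left out entirely.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def poisson(rng: random.Random, lam: float) -> int:
    """Poisson draw using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def uniform_in_disc(rng: random.Random, center: Point, radius: float) -> Point:
    """Point drawn uniformly from the disc around `center`."""
    r = radius * math.sqrt(rng.random())   # sqrt gives uniform area density
    theta = 2 * math.pi * rng.random()
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))


def evolve(rng: random.Random, centers: List[Point], die_off: float = 0.4,
           birth_rate: float = 0.8, extent: float = 100.0) -> List[Point]:
    """One time step: each topic dies with probability die_off, then a
    Poisson(birth_rate) number of new topics appears (placement assumed)."""
    survivors = [c for c in centers if rng.random() >= die_off]
    births = [(rng.uniform(0, extent), rng.uniform(0, extent))
              for _ in range(poisson(rng, birth_rate))]
    return survivors + births


def documents(rng: random.Random, centers: List[Point],
              per_topic: int = 10, radius: float = 5.0) -> List[Tuple[int, Point]]:
    """(topic index, location) pairs: per_topic documents around each center."""
    return [(k, uniform_in_disc(rng, c, radius))
            for k, c in enumerate(centers) for _ in range(per_topic)]
```

Calling documents and then evolve repeatedly, starting from two hypothetical centers, yields one batch of location-tagged documents per time stamp.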


Results of Synthetic Data

LBTE outperforms the DPM at every time stamp


LBTE recovers true topics and achieves zero variation of information
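
Variation of information compares two clusterings A and B of the same items: VI(A, B) = H(A) + H(B) - 2 I(A; B), which is zero exactly when the two clusterings agree up to relabeling. The following is a minimal sketch of that standard definition using natural logarithms; the function name is a choice made here, and the slides do not say which log base the reported results use.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def variation_of_information(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Variation of information between two labelings of the same items."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("labelings must be non-empty and of equal length")
    size_a = Counter(a)            # cluster sizes in labeling a
    size_b = Counter(b)            # cluster sizes in labeling b
    joint = Counter(zip(a, b))     # sizes of pairwise cluster intersections
    vi = 0.0
    for (i, j), n_ij in joint.items():
        p_ij = n_ij / n
        # Equivalent form: VI = -sum p_ij * [log(p_ij / p_i) + log(p_ij / p_j)]
        vi -= p_ij * (math.log(p_ij / (size_a[i] / n))
                      + math.log(p_ij / (size_b[j] / n)))
    return vi
```

For example, comparing the labelings [0, 0, 1, 1] and [5, 5, 7, 7] gives zero because they describe the same partition under different names.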


Flickr Data

Geo-tagged photos crawled from 2009/01/01 to 2010/01/01

Only photos taken within US territory


An example: { "date": "2009-07-07 19:34:04", "lat": "36.058961", "lon": "-112.083442", "id": "5919764020", "tags": [ "grandcanyon", "nationalpark", "sunset", "limestone", "scenic"] }
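
A record in this shape can be turned into a location-associated document with the Python standard library. The sketch below treats the tags as the document's words, which matches the slide's description of topics over tags; the output field names (time, location, words) are hypothetical.

```python
import json
from datetime import datetime
from typing import Any, Dict

record_text = (
    '{ "date": "2009-07-07 19:34:04", "lat": "36.058961", '
    '"lon": "-112.083442", "id": "5919764020", '
    '"tags": [ "grandcanyon", "nationalpark", "sunset", "limestone", "scenic"] }'
)


def parse_record(text: str) -> Dict[str, Any]:
    """Convert one crawled photo record into a location-tagged document."""
    raw = json.loads(text)
    return {
        "id": raw["id"],
        # Time stamps in the example use the "YYYY-MM-DD HH:MM:SS" layout.
        "time": datetime.strptime(raw["date"], "%Y-%m-%d %H:%M:%S"),
        # Latitude and longitude arrive as strings, so convert them to floats.
        "location": (float(raw["lat"]), float(raw["lon"])),
        "words": list(raw["tags"]),
    }


doc = parse_record(record_text)
```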


Results of National Park

Topics learned from DPM are scattered


Results of National Park

LBTE utilizes location information and discovers topics based on the regions

[Figure: regions labeled Yellowstone, Grand Canyon, Big Bend, Joshua Tree]


Results of National Park


Conclusion

Advantages of the Location-Based Topic Evolution model
  Automatically models the total number of topics
  Automatically models the appearance and disappearance of topics
  Succinct sampling: Gibbs sampling


Thank you!


Sample Auxiliary Variables


Sample Assignment