dbtrends semantics 2016

DBtrends Exploring Query Logs for Ranking RDF Data AKSW Edgard Marx, Amrapali Zaveri, Diego Moussallem, Sandro Rautenberg 12th International Conference on Semantic Systems

Upload: edgard-marx

Post on 13-Apr-2017


TRANSCRIPT

Page 1: DBtrends Semantics 2016

DBtrends: Exploring Query Logs for Ranking RDF Data

AKSW

Edgard Marx, Amrapali Zaveri, Diego Moussallem, Sandro Rautenberg

12th International Conference on Semantic Systems

Page 2: DBtrends Semantics 2016

Outline

• Motivation
• Background
• Ranking using Query Logs
• Evaluation
• Results
• Discussion
• Conclusion
• Future Works

Page 3: DBtrends Semantics 2016

Motivation

• Personal Data
• Enterprise Data
• Open Data

Page 4: DBtrends Semantics 2016

Motivation

We Have Data

"The size of LOD by 2014 was 31 billion triples"
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/

"Facebook users generates 2.7 billion Like actions per day and 300 million new photos are uploaded daily"
Josh Constine, 2012, techcrunch.com

"Google Processing 20,000 Terabytes A Day, And Growing"
Erick Schonfeld, 2008, techcrunch.com

Page 5: DBtrends Semantics 2016

Not all of the data is relevant

We Have Data

Motivation

Page 6: DBtrends Semantics 2016

We Have Data

Motivation

Page 7: DBtrends Semantics 2016

We Have Data


Motivation

Page 8: DBtrends Semantics 2016

Ranking


Motivation

Page 9: DBtrends Semantics 2016

Scenarios

• Search
• Machine Learning
• Link Discovery

Motivation

Page 10: DBtrends Semantics 2016

Resource Description Framework (RDF)

Concrete

Abstract (E=MC²)

Background

Web of Data

Page 11: DBtrends Semantics 2016

Things

Background

Web of Data

• Semantic Search
• Entity Search
• Question Answering
• Named Entity Recognition
• Link Discovery
• Machine Learning

Use RDF Data

E=MC²

Page 12: DBtrends Semantics 2016

Ranking Functions (Types)

"Give me all persons"

Retrieve

Processing & Ranking

Background

...

Page 13: DBtrends Semantics 2016

Ranking Functions (Types)

"Give me all persons"

Retrieve

Persons

Sort

Processing & Ranking

Answer

Background

...

Page 14: DBtrends Semantics 2016

Ranking Functions (Types)

"Give me all persons"

Retrieve

Persons

Sort

Processing & Ranking

Answer

Background

• Query dependent
• Query independent

Page 15: DBtrends Semantics 2016

Ranking

Background

Page et al., 1999

Page 16: DBtrends Semantics 2016

Ranking

Background

Page et al., 1999

Lee et al., 2001

Web of Data

Page 17: DBtrends Semantics 2016

Ranking RDF Data

Background

Page et al., 1999

Lee et al., 2001

Cheng et al. (Property), 2011

Web of Data

Page 18: DBtrends Semantics 2016

Ranking RDF Data

Background

Page et al., 1999

Lee et al., 2001

Cheng et al. (Property), 2011

Thalhammer et al., 2014

Web of Data

Page 19: DBtrends Semantics 2016

Benchmarks

DBtrends Benchmark (Marx, 2016)

• 60 users from different countries (USA, India)
• 9 entity ranking functions applied to the DBpedia Knowledge Base
• Users sort relevant classes, properties and entities extracted from the top twenty entities belonging to the top four classes
• Tasks were executed using Amazon Mechanical Turk

Previous Benchmarks

• Not publicly available
• Evaluate performance of 30 profiles

Background

Page 20: DBtrends Semantics 2016

Why use query logs?


Ranking using Query Logs

Page 21: DBtrends Semantics 2016

Why use query logs?


Ranking using Query Logs

Page 22: DBtrends Semantics 2016

Why use query logs?


Ranking using Query Logs

Query Logs

search...

Page 23: DBtrends Semantics 2016

Why use query logs?


Ranking using Query Logs

Page 24: DBtrends Semantics 2016

Why use query logs?

• Query logs provide relevant information about users' preferences

• They refer to the real-world entities

E=MC²


Ranking using Query Logs

Page 25: DBtrends Semantics 2016

Questions

• How to map real-world entities to the Web of Data?
• How to measure their relevance?
• Where to find a good and trustworthy query log?

Ranking using Query Logs

Page 26: DBtrends Semantics 2016

How to map real world resources?

• Rocha et al. (2004)
• Ding et al. (2005)
• Hogan et al. (2006)
• Alsarem et al. (2015)

Ranking using Query Logs

Query Logs

search...

Web of Data

Page 27: DBtrends Semantics 2016

How to measure the resource's relevance?


Ranking using Query Logs

• Users search (more often) for things that are relevant

• Query logs register how often something is searched

• Query logs can be used to better estimate a resource's relevance by looking at how often it is searched
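The idea in the bullets above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the query log is a toy in-memory list, label matching is a plain case-insensitive substring test, and the function name `relevance_from_query_log` is hypothetical. It is not the DBtrends implementation.

```python
from collections import Counter


def relevance_from_query_log(queries, labels):
    """Estimate relevance as the share of logged queries mentioning each label.

    `queries` is a list of raw query strings (a toy stand-in for a query log).
    `labels` maps a resource identifier to its human-readable label.
    """
    counts = Counter()
    for query in queries:
        normalized = query.casefold()
        for resource, label in labels.items():
            # Assumption: a resource is "searched" when its label occurs in the query.
            if label.casefold() in normalized:
                counts[resource] += 1
    total = len(queries) or 1  # avoid division by zero on an empty log
    return {resource: counts[resource] / total for resource in labels}


# Toy data, invented for illustration only.
log = ["new york weather", "flights to new york", "leipzig zoo", "New York pizza"]
labels = {"dbr:New_York_City": "New York", "dbr:Leipzig": "Leipzig"}

scores = relevance_from_query_log(log, labels)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Resources that are searched more often end up earlier in `ranking`, which is the intuition the slide describes.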

Page 28: DBtrends Semantics 2016

Where to find a good and trustworthy query log?

Ranking using Query Logs

Page 29: DBtrends Semantics 2016

Where to find a good and trustworthy query log?

Ranking using Query Logs

Page 30: DBtrends Semantics 2016

Where to find a good and trustworthy query log?

• Public API
• Filters
  • Geographic: Country, State, City
  • Period: Day, Week, Month, Year

Ranking using Query Logs

Page 31: DBtrends Semantics 2016

DBtrends Ranking Function


Ranking using Query Logs

Page 32: DBtrends Semantics 2016

DBtrends Ranking Function

Ranking using Query Logs

Diagram: dbr:New_York_City (label "New York", types dbo:City and dbo:Place) is linked to Trends with the value 36; circled step numbers 1 and 2 mark the flow.

• First, the labels of the entities are extracted and used to acquire the search history in query logs, e.g. Google Trends (steps 1-2)

Page 33: DBtrends Semantics 2016

DBtrends Ranking Function

Diagram: dbr:New_York_City (label "New York", types dbo:City and dbo:Place) is linked to Trends; the values 36, 18 and 9 appear in the figure, and circled step numbers 1 to 4 mark the flow.

• First, the labels of the entities are extracted and used to acquire the search history in query logs, e.g. Google Trends (steps 1-2)

• Thereafter, the entity ranks are used as a base to propagate the rank to the classes (steps 3-4)

Ranking using Query Logs
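The two steps described on this slide (score entities from the query log, then propagate to their classes) might look roughly as follows. This is a hedged sketch: the slides do not state the propagation formula, so summing entity scores per class is an assumption, and the function name `propagate_to_classes`, the second entity and its score are hypothetical. Only the value 36 for dbr:New_York_City is taken from the slide.

```python
from collections import defaultdict


def propagate_to_classes(entity_scores, entity_classes):
    """Propagate entity scores to the classes each entity belongs to.

    Assumption: a class score is the sum of the scores of its instances.
    The slides do not specify the actual aggregation rule.
    """
    class_scores = defaultdict(float)
    for entity, score in entity_scores.items():
        for cls in entity_classes.get(entity, ()):
            class_scores[cls] += score
    return dict(class_scores)


# 36 is the Trends value shown on the slide; the second entry is invented.
entity_scores = {"dbr:New_York_City": 36.0, "dbr:Hudson_River": 4.0}
entity_classes = {
    "dbr:New_York_City": ["dbo:City", "dbo:Place"],
    "dbr:Hudson_River": ["dbo:Place"],
}

class_scores = propagate_to_classes(entity_scores, entity_classes)
print(class_scores)
```

Other aggregation rules (averaging, or damping the score at each level of the class hierarchy) would fit the same structure by changing the single accumulation line.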

Page 34: DBtrends Semantics 2016

Entity Ranking Functions

• DBtrends
• MIXED-RANK

• DB-IN
• DB-OUT
• DB-RANK

• PAGE-IN
• PAGE-OUT
• PAGE-RANK
• E-PAGE-IN
• SEO-PA
• SHARED-LINKS

+

Evaluation

Page 35: DBtrends Semantics 2016

Property/Class Ranking Functions

Property

• Relin
• RandomRank
• Instances
• Instances

Class

• Instances
• Instances

Evaluation

Page 36: DBtrends Semantics 2016

Results

• PAGE-RANK
• E-PAGE-IN
• SHARED-LINKS
• SEO-PA

• DB-OUT
• PAGE-IN
• PAGE-OUT
• DB-IN
• DB-RANK

Evaluation Entity

Page 37: DBtrends Semantics 2016

Results

• MIXED-RANK
• PAGE-RANK
• E-PAGE-IN
• SHARED-LINKS
• SEO-PA

• DB-OUT
• PAGE-IN
• DBtrends
• PAGE-OUT
• DB-IN
• DB-RANK

Evaluation Entity

Page 38: DBtrends Semantics 2016

Discussion

• Functions that take external information into consideration provide more insight into a resource's relevance

• RDF links reflect natural connections rather than a resource's relevance

• MIXED-RANK
• PAGE-RANK
• E-PAGE-IN
• SHARED-LINKS
• SEO-PA

• DB-OUT
• PAGE-IN
• DBtrends
• PAGE-OUT
• DB-IN
• DB-RANK

Entity

Evaluation

Page 39: DBtrends Semantics 2016

Discussion

• There is no pattern in the impact distribution of query logs

• Queries can, but do not necessarily, help to improve a ranking function

• Internal agreement ~63%

Evaluation Entity
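The slides report an internal agreement figure without saying how it was computed. As an illustration only, one common way to compare two rankings is the fraction of item pairs that both rankings order the same way. The sketch below assumes that measure; the function name `pairwise_agreement` and the sample rankings are hypothetical and are not taken from the benchmark.

```python
from itertools import combinations


def pairwise_agreement(ranking_a, ranking_b):
    """Fraction of shared item pairs that both rankings order identically."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    shared = [item for item in ranking_a if item in pos_b]
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0  # nothing to compare
    same = sum(
        (pos_a[x] < pos_a[y]) == (pos_b[x] < pos_b[y]) for x, y in pairs
    )
    return same / len(pairs)


# Invented rankings from two hypothetical annotators.
user_1 = ["A", "B", "C", "D"]
user_2 = ["A", "C", "B", "D"]

agreement = pairwise_agreement(user_1, user_2)
print(f"{agreement:.2f}")
```

Here the two annotators disagree only on the relative order of B and C, so five of the six pairs agree.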

Page 40: DBtrends Semantics 2016

Results

Property

• RandomRank
• Relin
• Instances
• Instances

Class

• Instances
• Instances

Evaluation

Page 41: DBtrends Semantics 2016

Discussion

• RandomRank
• Relin
• Instances
• Instances

• Internal agreement ~37%
• Ranks are very sparse
• Not conclusive

Evaluation Property

Page 42: DBtrends Semantics 2016

Discussion

• Internal agreement ~67%

• Instances
• Instances

Evaluation Class

Page 43: DBtrends Semantics 2016

Discussion

dbo:PopulatedPlace, dbo:Settlement, dbo:Place, owl:Thing

A simple sort can be very effective

Evaluation

dbo:PopulatedPlace

dbo:Settlement

dbo:Place

owl:Thing

• Instances
• Instances

Class
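The "simple sort" for classes can be sketched as ordering classes by their number of instances. The counts below are toy values chosen for illustration, not DBpedia statistics, and the function name `rank_classes_by_instances` is hypothetical; whether ascending or descending order matches the users' rankings is not stated on the slide, so descending order is an assumption here.

```python
def rank_classes_by_instances(instance_counts):
    """Order classes by instance count (descending), breaking ties by name."""
    return sorted(instance_counts, key=lambda cls: (-instance_counts[cls], cls))


# Toy instance counts, invented for illustration only.
counts = {
    "dbo:Settlement": 3,
    "dbo:PopulatedPlace": 4,
    "owl:Thing": 10,
    "dbo:Place": 6,
}

ranked = rank_classes_by_instances(counts)
print(ranked)
```

Reversing the sign in the sort key gives the ascending variant, which puts the most specific classes first.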

Page 44: DBtrends Semantics 2016

Discussion

• Confidence in executing the tasks: Indians 90%, Americans 60%

• Ranks produced by Indians were sparser

• Abstract entities appear before entities

Evaluation Caveats

Page 45: DBtrends Semantics 2016

Summary

• Entity ranking functions produce better results when considering external information

• A simple sort by the number of instances can be very effective for ranking classes

• Query logs can, but do not necessarily, improve entity ranking functions

Evaluation

Page 46: DBtrends Semantics 2016

Benchmark

• Benchmark
• Ranking functions
• Library (Java)

Evaluation

dbtrends.aksw.org

Page 47: DBtrends Semantics 2016

Future Works

• Extend the evaluation to other countries and ranking functions

• Evaluate the impact of context-aware ranking functions

• Use other similarity ranking functions

Page 48: DBtrends Semantics 2016

Acknowledgements

Contact: http://emarx.org