ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and...

Post on 05-Jul-2015

DESCRIPTION

Best Paper Award at ECDL 2010: the 14th European Conference on Research and Advanced Technology for Digital Libraries

TRANSCRIPT

Damien Palacio - damien.palacio@univ-pau.fr

ECDL 2010, 6-10 September 2010

Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert

Outline

1. Motivation: topical IR, geographic IR → Hypothesis: GIRS > IRS

2. Context: IRS evaluation. Issue: current evaluation frameworks = partial

3. Contribution: GIRS evaluation framework

4. Experiments: case study with PIV GIRS, hypothesis validated

5. Conclusion and Future Works


1. Motivation – Why Geographic IR?

Geographic Information Retrieval
➔ Query = ''trip around Glasgow in summer 2010''
➔ Search engines
  ➔ Topical: term ∈ {trip, Glasgow, summer, 2010}
  ➔ Geographic:
    ➔ spatial ∈ {citiesNearGlasgow ...}
    ➔ temporal ∈ {21 June .. 22 Sept 2010}
    ➔ term ∈ {trip, Glasgow, summer, 2010}
➔ ≈ 1/6 of queries are geographic queries
  ➔ Excite (Sanderson et al., 2004)
  ➔ AOL (Gan et al., 2008)
  ➔ Yahoo! (Jones et al., 2008)
➔ A current and realistic issue
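As a rough illustration of the decomposition above, the example query can be held as three separate parts. This is only a sketch: the `GeoQuery` class and its field names are hypothetical, and the spatial set lists Glasgow alone because the slide elides the nearby cities.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GeoQuery:
    """Hypothetical container for the three dimensions of a geographic query."""
    topical_terms: frozenset   # bag of query words
    spatial_places: frozenset  # place names accepted by the spatial dimension
    temporal_start: date       # first day of the accepted period
    temporal_end: date         # last day of the accepted period


# ''trip around Glasgow in summer 2010'' as seen by a geographic IRS.
# The slide elides the places near Glasgow, so only Glasgow itself is listed.
query = GeoQuery(
    topical_terms=frozenset({"trip", "glasgow", "summer", "2010"}),
    spatial_places=frozenset({"Glasgow"}),
    temporal_start=date(2010, 6, 21),
    temporal_end=date(2010, 9, 22),
)

# A topical IRS only sees the bag of words.
topical_view = sorted(query.topical_terms)
interval_days = (query.temporal_end - query.temporal_start).days
print(topical_view)
print(interval_days, "days in the temporal interval")
```

A topical engine treats "summer" and "2010" as plain words, whereas the geographic view turns them into an interval that documents can be matched against.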

1. Motivation – Why Geographic IR?

A Geographic IRS: How Does It Work?
➔ 3 dimensions to process: spatial, temporal and topical
➔ 1 index per dimension
  ➔ Topical: bag of words, vector space model, ...
  ➔ Spatial: named entity recognition, ...
  ➔ Temporal: named entity recognition, ...
➔ Spatial Processing [figure]
➔ Retrieval
  ➔ Usually by filtering (STEWARD, SPIRIT, CITER, …)
➔ Issue: performance of GIRS vs. topical IRS
  ➔ Hypothesis: geographic IRS better than topical IRS
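Retrieval by filtering can be pictured as follows: the topical ranking is kept, and documents that fail the spatial or temporal constraint are dropped. The sketch below is a generic illustration under that assumption, not the pipeline of any of the cited systems; the function name, document identifiers and scores are all made up.

```python
def filter_retrieval(topical_ranking, spatial_matches, temporal_matches):
    """Keep the topical ranking order, dropping documents that fail a geographic filter.

    topical_ranking: list of (doc_id, score) pairs, best first.
    spatial_matches / temporal_matches: sets of doc_ids returned by the
    spatial and temporal indexes for the same query.
    """
    allowed = spatial_matches & temporal_matches
    return [(doc, score) for doc, score in topical_ranking if doc in allowed]


# Toy data (hypothetical document identifiers and scores).
topical = [("d1", 0.9), ("d2", 0.7), ("d3", 0.4), ("d4", 0.2)]
spatial = {"d1", "d3", "d4"}
temporal = {"d3", "d4", "d5"}

filtered = filter_retrieval(topical, spatial, temporal)
print(filtered)  # [('d3', 0.4), ('d4', 0.2)]
```

Filtering is all-or-nothing: a document that matches two dimensions out of three disappears from the result list, which is one reason to compare it with score combination.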


2. Context and Issue: IRS Partial Evaluation

Evaluating an IR System
➔ System = efficiency + effectiveness
  ➔ Efficiency: computation time, storage needed
  ➔ Effectiveness: quality
➔ Effectiveness Evaluation (Geo IR literature, topical IR literature)
  ➔ Topical: TREC, CLEF, ...
  ➔ Temporal: TempEval
  ➔ Spatial: Bucher et al. (2005), GeoCLEF
  ➔ Topical + temporal + spatial: evaluation framework proposed


3. Proposition – GIRS Evaluation Framework

Evaluation Framework for the 3 Dimensions (1/2)
➔ Goal: measuring GIRS quality
➔ Means: building on TREC framework (1992-)
➔ ''Cranfield'' methodology [Voorhees, 2007]
  ➔ Test collection
    ➔ Corpus
    ➔ ≥ 25 topics
    ➔ Qrels
  ➔ Measures: P@X, MAP, NDCG, ...
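For readers less familiar with the measures named on this slide, here is a minimal sketch of P@X and MAP over binary relevance. The function names and the toy ranking are illustrative assumptions; evaluation campaigns normally compute these with trec_eval rather than ad hoc code.

```python
def precision_at(ranking, relevant, x):
    """P@X: fraction of the top-x retrieved documents that are relevant."""
    top = ranking[:x]
    return sum(1 for doc in top if doc in relevant) / x


def average_precision(ranking, relevant):
    """AP: sum of P@k at each rank k holding a relevant document,
    divided by the total number of relevant documents."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP: mean of AP over topics; runs maps topic -> (ranking, relevant set)."""
    return sum(average_precision(r, rel) for r, rel in runs.values()) / len(runs)


# Toy example: 'd6' is relevant but never retrieved.
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}
print(precision_at(ranking, relevant, 5))
print(average_precision(ranking, relevant))
```

Both measures treat relevance as yes/no, which is why a graded measure such as NDCG is needed once judgments take more than two values.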


3. Proposition – GIRS Evaluation Framework

Evaluation Framework for the 3 Dimensions (2/2)
➔ TREC Framework Extension
  ➔ Test collection covering the 3 dimensions
    ➔ ≥ 25 topics
    ➔ Corpus
    ➔ Gradual qrels
    ➔ + geographic resources
➔ About qrels …
  ➔ Relevance (doc, topic) ∈ {0;1;2;3;4}
  ➔ Principle: ''the more satisfied dimensions there are, the better it is''
    ➔ 0: no dimension
    ➔ 3: 3 dimensions. Topic: ''trip around Glasgow''; Doc: trip + Bob born in Dumbarton
    ➔ 4: 3 dimensions + global = satisfied topic
➔ Gradual qrels aware measure: Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002]
  ➔ By topic: NDCG for each topic
  ➔ Global: mean NDCG for the system
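The grading principle and the measure can be sketched together. Two assumptions are made here and should be read as such: grades 1 to 3 are taken to be the count of satisfied dimensions, and the discount is the common log2(rank + 1) variant, which may differ in detail from the formulation in the cited paper. All function names are hypothetical.

```python
import math


def relevance_grade(topical_ok, spatial_ok, temporal_ok, topic_satisfied):
    """Grade in {0..4}: number of satisfied dimensions, or 4 when all three
    are satisfied and the assessor also judges the topic globally satisfied."""
    satisfied = sum([topical_ok, spatial_ok, temporal_ok])
    if satisfied == 3 and topic_satisfied:
        return 4
    return satisfied


def dcg(grades):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(ranked_grades, all_grades):
    """NDCG: DCG of the system ranking divided by the DCG of the ideal ordering
    of every judged document for the topic."""
    ideal = dcg(sorted(all_grades, reverse=True))
    return dcg(ranked_grades) / ideal if ideal > 0 else 0.0


def mean_ndcg(per_topic):
    """Global score: mean of the per-topic NDCG values."""
    return sum(per_topic) / len(per_topic)


# The slide's example document matches all three dimensions without
# satisfying the topic as a whole, hence grade 3 rather than 4.
bob_doc = relevance_grade(True, True, True, topic_satisfied=False)

# Toy topic: judged grades for four documents, system returns three of them.
judged = [4, 3, 1, 0]
system = [3, 4, 0]
print(bob_doc, round(ndcg(system, judged), 4))
```

Because the gain is the grade itself, ranking a grade-4 document above a grade-3 one raises the score, which a binary measure could not reward.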


4. Experiments – Case Study with PIV GIRS

Case Study: PIV System
➔ Indexing: 1 index per dimension
  ➔ Topical = Terrier IRS [Ounis et al., 2005]
  ➔ Spatial = map segmentation into tiles
  ➔ Temporal = timeline segmentation into tiles
➔ Retrieval
  ➔ Result document list for each index
  ➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
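The idea of segmenting a map into tiles can be illustrated with a regular grid used as an inverted index. This is a generic sketch, not PIV's actual tiling scheme, which the slides do not detail: the one-degree tile size, the `TileIndex` class and the document identifiers are assumptions, and the coordinates are approximate.

```python
from collections import defaultdict


def tile_of(lat, lon, tile_deg=1.0):
    """Map a point to the (row, col) of a regular grid tile (tile size is an assumption)."""
    return (int((lat + 90) // tile_deg), int((lon + 180) // tile_deg))


class TileIndex:
    """Toy inverted index: tile -> set of document ids mentioning a place in it."""

    def __init__(self, tile_deg=1.0):
        self.tile_deg = tile_deg
        self.postings = defaultdict(set)

    def add(self, doc_id, lat, lon):
        """Register a document under the tile containing the given point."""
        self.postings[tile_of(lat, lon, self.tile_deg)].add(doc_id)

    def lookup(self, lat, lon):
        """Return the documents indexed in the tile containing the given point."""
        return self.postings.get(tile_of(lat, lon, self.tile_deg), set())


index = TileIndex(tile_deg=1.0)
index.add("d1", 55.86, -4.25)   # Glasgow (approximate coordinates)
index.add("d2", 55.94, -4.57)   # Dumbarton (approximate coordinates)
index.add("d3", 43.30, -0.37)   # Pau (approximate coordinates)

near_glasgow = sorted(index.lookup(55.9, -4.3))
print(near_glasgow)
```

The same structure works for a timeline by replacing the two-dimensional tile with a one-dimensional period identifier.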


4. Experiments – Case Study with PIV GIRS

CombMNZ Principle [Fox & Shaw, 1993; Lee 1997]
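CombMNZ fuses several result lists by summing each document's normalized scores and multiplying that sum by the number of lists in which the document appears. The sketch below assumes min-max normalization per list, since the slides do not state which normalization PIV uses; the per-dimension scores are invented for the example.

```python
def min_max_normalize(scores):
    """Rescale one result list's scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists):
    """CombMNZ: sum of a document's normalized scores, multiplied by the
    number of result lists that retrieved it. Returns (doc, score) pairs,
    best first, with ties broken by document id."""
    summed, hits = {}, {}
    for scores in result_lists:
        for doc, s in min_max_normalize(scores).items():
            summed[doc] = summed.get(doc, 0.0) + s
            hits[doc] = hits.get(doc, 0) + 1
    fused = {doc: summed[doc] * hits[doc] for doc in summed}
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from the topical, spatial and temporal indexes.
topical = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
spatial = {"d2": 1.0, "d3": 0.25, "d4": 0.0}
temporal = {"d2": 3.0, "d4": 1.0}

fused = comb_mnz([topical, spatial, temporal])
for doc, score in fused:
    print(doc, score)
```

Here `d2` is retrieved by all three indexes and overtakes `d1`, the best purely topical match, which is the behaviour that distinguishes combination from filtering.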


4. Experiments – Case Study with PIV GIRS

Case Study: MIDR_2010 collection
➔ 31 topics
➔ 5645 documents = paragraphs
➔ Map for tracking spatial information
➔ Building qrels: 12 volunteers (thanks!!!)
➔ Qrels: relevance judgments ∈ {0;1;2;3;4}

4. Experiments – Hypothesis Validated

Analysis of Collected Data
➔ IRS Evaluation
  ➔ ResultsList × Qrels → trec_eval → NDCG
➔ Results: geographic IRS most effective, supporting the hypothesis
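trec_eval reads its two inputs as whitespace-separated text files: a qrels file (topic, iteration, document, relevance) and a results file (topic, `Q0`, document, rank, score, run tag). The sketch below only formats such lines; the helper names, the run tag `girs` and the toy judgments are assumptions for illustration.

```python
def qrels_lines(judgments):
    """Format graded judgments as TREC qrels lines: topic, iteration, docno, relevance."""
    return [f"{topic} 0 {doc} {grade}"
            for (topic, doc), grade in sorted(judgments.items())]


def run_lines(topic, ranking, tag="girs"):
    """Format a ranked list as TREC run lines: topic, Q0, docno, rank, score, run tag."""
    return [f"{topic} Q0 {doc} {rank} {score:.4f} {tag}"
            for rank, (doc, score) in enumerate(ranking, start=1)]


# Hypothetical graded judgments and fused ranking for topic "1".
judgments = {("1", "d1"): 4, ("1", "d2"): 3, ("1", "d3"): 0}
ranking = [("d2", 7.5), ("d1", 1.0), ("d3", 0.5)]

qrels_text = qrels_lines(judgments)
run_text = run_lines("1", ranking)
print("\n".join(qrels_text))
print("\n".join(run_text))
```

Writing each list to its own file gives the pair of inputs that trec_eval compares to produce per-topic and mean scores.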


Evaluation Framework for Geographic IR Systems

Conclusions and Future Works (1/2)
➔ Evaluation Framework for Geographic IR Systems
  ➔ Reusable
  ➔ Generalizable for more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009]
  ➔ No gradual relevance per dimension
➔ Case Study with PIV System
  ➔ Creation of a specific test collection (≥ 25 topics)
  ➔ French test collection
  ➔ Limited collection (number of documents)

Evaluation Framework for Geographic IR Systems

Conclusions and Future Works (2/2)
➔ Hypothesis Validated
  ➔ The 3 dimensions improve IR (+66.5%)
➔ Future Works
  ➔ More precise analysis: by query
  ➔ Quantify PIV improvements: various index combinations
  ➔ Organize a GIRS evaluation campaign: anyone interested?


Thank you!

Spatial Interface [figure]

Temporal Interface [figure]

Spatial Tiling [figure]