ECDL 2010, 6-10 September 2010. Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study. Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert

DESCRIPTION

Best Paper Award at ECDL 2010: the 14th European Conference on Research and Advanced Technology for Digital Libraries

TRANSCRIPT

Page 1: ECDL 2010 - Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

ECDL 2010, 6-10 September 2010

Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study

Damien Palacio, Guillaume Cabanac, Christian Sallaberry, Gilles Hubert

Page 2

Outline

1. Motivation: Topical IR, Geographic IR → Hypothesis: GIRS > IRS
2. Context and Issue: IRS evaluation; current evaluation frameworks = partial
3. Contribution: GIRS evaluation framework
4. Experiments: Case study with PIV GIRS; hypothesis validated
5. Conclusion and Future Works

Page 4

1. Motivation – Why Geographic IR?

Geographic Information Retrieval
➔ Query = "trip around Glasgow in summer 2010"
➔ Search engines
    ➔ Topical: term ∈ {trip, Glasgow, summer, 2010}
    ➔ Geographic: spatial ∈ {citiesNearGlasgow ...}, temporal ∈ {21 June .. 22 Sept 2010}, term ∈ {trip, Glasgow, summer, 2010}
➔ ≈ 1/6 of queries are geographic queries
    ➔ Excite (Sanderson et al., 2004)
    ➔ AOL (Gan et al., 2008)
    ➔ Yahoo! (Jones et al., 2008)
➔ A current and realistic issue

Page 5

1. Motivation – Why Geographic IR?

A Geographic IRS: How Does It Work?
➔ 3 dimensions to process: spatial, temporal and topical
➔ 1 index per dimension
    ➔ Topical: bag of words, vector space model, ...
    ➔ Spatial: named entity recognition, ...
    ➔ Temporal: named entity recognition, ...

Page 6

1. Motivation – Why Geographic IR?

A Geographic IRS: How Does It Work?
➔ Spatial processing

Page 7

1. Motivation – Why Geographic IR?

A Geographic IRS: How Does It Work?
➔ Retrieval
    ➔ Usually by filtering (STEWARD, SPIRIT, CITER, ...)
➔ Issue: performance of GIRS vs. topical IRS
    ➔ Hypothesis: a geographic IRS is better than a topical IRS

Page 16

2. Context and Issue: IRS Partial Evaluation

Evaluating an IR System
➔ System = efficiency + effectiveness
    ➔ Efficiency: computation time, storage needed
    ➔ Effectiveness: quality
➔ Effectiveness evaluation
    [Diagram: the topical, temporal and spatial dimensions, set against the Geo IR literature and the Topical IR literature, annotated with existing evaluation efforts (TREC, CLEF, ...; TempEval; Bucher et al. (2005); GeoCLEF) and with the evaluation framework proposed.]

Page 18

3. Proposition – GIRS Evaluation Framework

Evaluation Framework for the 3 Dimensions (1/2)
➔ Goal: measuring GIRS quality
➔ Means: building on the TREC framework (1992-)
    ➔ "Cranfield" methodology
    ➔ Test collection
        ➔ Corpus
        ➔ ≥ 25 topics
        ➔ Qrels
    ➔ Measures: P@X, MAP, NDCG, ...

[Voorhees, 2007]

Page 21

3. Proposition – GIRS Evaluation Framework

Evaluation Framework for the 3 Dimensions (2/2)
➔ TREC framework extension
    ➔ Test collection (covering the 3 dimensions)
        ➔ ≥ 25 topics
        ➔ Corpus
        ➔ Gradual qrels
        ➔ + geographic resources
➔ About qrels ...
    ➔ Relevance (doc, topic) ∈ {0; 1; 2; 3; 4}
    ➔ Principle: "the more satisfied dimensions there are, the better it is"
    ➔ Points shown on the relevance scale:
        ➔ No dimension
        ➔ 3 dimensions (topic: "trip around Glasgow"; doc: trip + Bob born in Dumbarton)
        ➔ 3 dimensions + global = satisfied topic
➔ Gradual-qrels-aware measure: Normalized Discounted Cumulative Gain (NDCG) [Järvelin & Kekäläinen, 2002]
    ➔ By topic: NDCG for each topic
    ➔ Global: mean NDCG for the system
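
The per-topic NDCG and the system-level mean NDCG named on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the commonly used discount 1/log2(rank + 1), treats unjudged documents as grade 0, and uses hypothetical document and topic identifiers.

```python
import math
from statistics import mean


def dcg(gains):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Assumption: discount of 1 / log2(rank + 1), a common formulation.
    """
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_doc_ids, topic_qrels):
    """NDCG of one ranked list against graded judgments {doc_id: grade in 0..4}."""
    # Unjudged documents are assumed to have grade 0.
    gains = [topic_qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    # The ideal ranking lists all judged documents by decreasing grade.
    ideal_dcg = dcg(sorted(topic_qrels.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(runs, qrels):
    """Mean of the per-topic NDCG values over all judged topics."""
    if not qrels:
        return 0.0
    return mean(ndcg(runs.get(topic, []), qrels[topic]) for topic in qrels)


# Hypothetical judgments and results for two topics.
qrels = {"t1": {"d1": 4, "d2": 3, "d3": 0}, "t2": {"d7": 2}}
runs = {"t1": ["d3", "d2", "d1"], "t2": ["d7", "d9"]}
score = mean_ndcg(runs, qrels)
```

Because the gain is the relevance grade itself, a document satisfying the whole topic (grade 4) placed near the top of the list contributes more than one satisfying a single dimension, which is what makes the measure suited to gradual qrels.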

Page 23

4. Experiments – Case Study with PIV GIRS

Case Study: PIV System
➔ Indexing: 1 index per dimension
    ➔ Topical = Terrier IRS [Ounis et al., 2005]
    ➔ Spatial = map segmentation into tiles
    ➔ Temporal = timeline segmentation into tiles
➔ Retrieval
    ➔ Result document list for each index
    ➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]

Page 24

4. Experiments – Case Study with PIV GIRS

CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997]
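
The CombMNZ principle can be sketched as follows: for each document, sum its scores over the result lists and multiply by the number of lists that retrieved it. This is a minimal illustration under stated assumptions, not the PIV implementation: scores are min-max normalized per list, a document counts as retrieved by a list when it appears in it, and the three score dictionaries are hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale the scores of one result list to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists):
    """Fuse several {doc_id: score} lists; return [(doc_id, fused_score)] ranked."""
    total = defaultdict(float)  # sum of normalized scores per document
    hits = defaultdict(int)     # number of lists containing the document
    for scores in result_lists:
        for doc, s in min_max_normalize(scores).items():
            total[doc] += s
            hits[doc] += 1
    fused = {doc: total[doc] * hits[doc] for doc in total}
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the topical, spatial and temporal indexes.
topical = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
spatial = {"d2": 0.9, "d4": 0.3}
temporal = {"d2": 0.5, "d1": 0.2}
ranking = comb_mnz([topical, spatial, temporal])
```

The multiplier rewards documents found by several indexes, so a document retrieved on all three dimensions tends to outrank one that scores highly on a single dimension.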

Page 27

4. Experiments – Case Study with PIV GIRS

Case Study: MIDR_2010 collection
➔ 31 topics
➔ 5645 documents = paragraphs
➔ Map for tracking spatial information
➔ Building qrels: 12 volunteers (thanks!!!)
    ➔ Relevance judgments ∈ {0; 1; 2; 3; 4}

Page 28

4. Experiments – Hypothesis Validated

Analysis of Collected Data
➔ IRS evaluation
    ➔ ResultsList × Qrels → NDCG, via trec_eval
➔ Results: geographic IRS most effective (hypothesis validated)
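
To feed a result list and qrels to trec_eval, both are usually written as whitespace-separated text files: qrels lines of the form `topic iteration docno relevance`, and run lines of the form `topic Q0 docno rank score run_tag`. The sketch below only produces those lines. The topic number, paragraph identifiers and the run tag `piv_sketch` are hypothetical, and nothing here reflects the actual MIDR_2010 files.

```python
def format_qrels(qrels):
    """Turn {topic: {doc_id: grade}} into TREC qrels lines 'topic 0 doc grade'."""
    lines = []
    for topic in sorted(qrels):
        for doc, grade in sorted(qrels[topic].items()):
            # The second column (iteration) is conventionally 0.
            lines.append(f"{topic} 0 {doc} {grade}")
    return lines


def format_run(runs, run_tag="piv_sketch"):
    """Turn {topic: [(doc_id, score), ...]} into TREC run lines."""
    lines = []
    for topic in sorted(runs):
        # Rank by decreasing score; ties broken by document id.
        ranked = sorted(runs[topic], key=lambda pair: (-pair[1], pair[0]))
        for rank, (doc, score) in enumerate(ranked, start=1):
            lines.append(f"{topic} Q0 {doc} {rank} {score:.4f} {run_tag}")
    return lines


# Hypothetical graded judgments and fused scores for one topic.
qrels_lines = format_qrels({1: {"p12": 4, "p07": 0}})
run_lines = format_run({1: [("p07", 0.2), ("p12", 7.5)]})
```

Once both files are written, a call along the lines of `trec_eval -m ndcg qrels.txt run.txt` reports NDCG in recent trec_eval releases; the exact options depend on the installed version.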

Page 31

Evaluation Framework for Geographic IR Systems

Conclusions and Future Works (1/2)
➔ Evaluation framework for geographic IR systems
    ➔ Reusable
    ➔ Generalizable for more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009]
    ➔ Not gradual relevance per dimension
➔ Case study with PIV system
    ➔ Creation of a specific test collection (≥ 25 topics)
    ➔ French test collection
    ➔ Limited collection (number of documents)

Page 32

Evaluation Framework for Geographic IR Systems

Conclusions and Future Works (2/2)
➔ Hypothesis validated
    ➔ The 3 dimensions improve IR (+66.5%)
➔ Future works
    ➔ More precise analysis: by query
    ➔ Quantify PIV improvements: various index combinations
    ➔ Organize a GIRS evaluation campaign: anyone interested?

Page 33

Thank you!

Page 34

Spatial Interface

Page 36

Temporal Interface

Page 38

Spatial Tiling
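
The case study describes its spatial and temporal indexes as segmentations of the map and of the timeline into tiles. The slides do not give the actual tiling scheme, so the sketch below is purely illustrative: it assumes a regular latitude/longitude grid and fixed-length timeline tiles, and the tile sizes, the origin date and the function names are hypothetical. City coordinates are approximate.

```python
import math
from datetime import date


def spatial_tile(lat, lon, tile_deg=0.5):
    """Map a point to the (row, col) of a regular lat/lon grid (assumed scheme)."""
    row = math.floor((lat + 90.0) / tile_deg)
    col = math.floor((lon + 180.0) / tile_deg)
    return row, col


def temporal_tile(day, origin=date(2000, 1, 1), tile_days=30):
    """Map a date to the index of a fixed-length timeline tile (assumed scheme)."""
    return (day - origin).days // tile_days


def tiles_for_interval(start, end):
    """Indexes of all timeline tiles overlapped by the interval [start, end]."""
    return list(range(temporal_tile(start), temporal_tile(end) + 1))


# Approximate coordinates of two places mentioned in the slides.
glasgow_tile = spatial_tile(55.86, -4.25)
dumbarton_tile = spatial_tile(55.94, -4.57)

# "Summer 2010" as delimited on the motivation slide.
summer_tiles = tiles_for_interval(date(2010, 6, 21), date(2010, 9, 22))
```

With such a scheme, a spatial or temporal query reduces to looking up the tiles it overlaps, and documents can be scored by the tiles they share with the query.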