Damien Palacio - damien.palacio@univ-pau.fr
ECDL 2010, 6-10 September 2010
Measuring Effectiveness of Geographic IR Systems in Digital Libraries: Evaluation and Case Study
Damien Palacio, Guillaume Cabanac,
Christian Sallaberry, Gilles Hubert
Outline
1. Motivation: Topical IR → Geographic IR. Hypothesis: GIRS > IRS
2. Context: IRS evaluation. Issue: current evaluation frameworks = partial
3. Contribution: GIRS evaluation framework
4. Experiments: case study with PIV GIRS. Hypothesis validated
5. Conclusion and Future Works
1. Motivation – Why Geographic IR?
Geographic Information Retrieval
➔ Query = ''trip around Glasgow in summer 2010''
➔ Search Engines
  ➔ Topical
    ➔ term ∈ {trip, Glasgow, summer, 2010}
  ➔ Geographic
    ➔ spatial ∈ {citiesNearGlasgow ...}
    ➔ temporal ∈ {21 June .. 22 Sept 2010}
    ➔ term ∈ {trip, Glasgow, summer, 2010}
➔ ≈ 1/6 Queries = Geographic Queries
  ➔ Excite (Sanderson et al., 2004)
  ➔ AOL (Gan et al., 2008)
  ➔ Yahoo! (Jones et al., 2008)
➔ A Current and Realistic Issue
1. Motivation – Why Geographic IR?
A Geographic IRS: How Does It Work?
➔ 3 Dimensions to Process
  ➔ Spatial, temporal and topical
➔ 1 Index per Dimension
  ➔ Topical: bag of words, vector space model, ...
  ➔ Spatial: named entity recognition, ...
  ➔ Temporal: named entity recognition, ...
1. Motivation – Why Geographic IR?
A Geographic IRS: How Does It Work?
➔ Spatial Processing (figure)
1. Motivation – Why Geographic IR?
A Geographic IRS: How Does It Work?
➔ Retrieval
  ➔ Usually by filtering (STEWARD, SPIRIT, CITER, …)
➔ Issue: performance of GIRS vs. topical IRS
  ➔ Hypothesis: geographic IRS better than topical IRS
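Retrieval "by filtering" can be read in several ways. The sketch below shows one plausible reading: a topical ranking is kept in order, and documents that do not match the spatial and temporal constraints are dropped. The function name, the document ids and the scores are hypothetical; the slides do not describe how STEWARD, SPIRIT or CITER actually implement this step.

```python
from typing import List, Set, Tuple


def retrieve_by_filtering(
    topical_ranking: List[Tuple[str, float]],
    spatial_matches: Set[str],
    temporal_matches: Set[str],
) -> List[Tuple[str, float]]:
    """Keep the topical order, drop documents failing a geographic constraint.

    Assumption: each dimension has already produced its own set of matching
    document ids; only the topical dimension provides scores.
    """
    return [
        (doc, score)
        for doc, score in topical_ranking
        if doc in spatial_matches and doc in temporal_matches
    ]


# Hypothetical data: d2 is topically relevant but not in the spatial area.
topical_ranking = [("d1", 0.9), ("d2", 0.7), ("d3", 0.4)]
filtered = retrieve_by_filtering(
    topical_ranking,
    spatial_matches={"d1", "d3"},
    temporal_matches={"d1", "d2", "d3"},
)
```

With filtering, a document is either kept or discarded; the spatial and temporal dimensions do not change the order of the surviving documents.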
2. Context and Issue: IRS Partial Evaluation
Evaluating an IR System
➔ System = efficiency + effectiveness
  ➔ Efficiency: computation time, storage needed
  ➔ Effectiveness: quality
➔ Effectiveness Evaluation (Geo IR literature, topical IR literature)
  ➔ Topical: TREC, CLEF, ...
  ➔ Temporal: TempEval
  ➔ Spatial: Bucher et al. (2005), GeoCLEF
  ➔ Topical + temporal + spatial: evaluation framework proposed
3. Proposition – GIRS Evaluation Framework
Evaluation Framework for the 3 Dimensions (1/2)
➔ Goal: measuring GIRS quality
➔ Means: building on TREC framework (1992-)
  ➔ ''Cranfield'' methodology
  ➔ Test collection
    ➔ Corpus
    ➔ ≥ 25 Topics
    ➔ Qrels
  ➔ Measures: P@X, MAP, NDCG, ...
[Voorhees, 2007]
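As an illustration of two of the measures named on this slide, here is a minimal sketch of P@X and MAP under binary relevance. The function names and the sample ranking are assumptions made for the example; official evaluations normally rely on a tool such as trec_eval rather than hand-written code.

```python
from typing import Dict, List, Set


def precision_at(ranking: List[str], relevant: Set[str], x: int) -> float:
    """P@X: fraction of the top-X retrieved documents that are relevant."""
    if x <= 0:
        raise ValueError("x must be positive")
    return sum(1 for doc in ranking[:x] if doc in relevant) / x


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """AP: mean of the precision values at the rank of each relevant document."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """MAP: average of AP over all topics of the run."""
    return sum(
        average_precision(ranking, qrels.get(topic, set()))
        for topic, ranking in runs.items()
    ) / len(runs)


# Hypothetical topic: relevant documents are d1 and d3.
ranking = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p_at_2 = precision_at(ranking, relevant, 2)   # 1 relevant in the top 2
ap = average_precision(ranking, relevant)     # (1/1 + 2/3) / 2
```

Both measures treat relevance as binary, which is why a graded measure is needed once qrels take more than two values.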
3. Proposition – GIRS Evaluation Framework
Evaluation Framework for the 3 Dimensions (2/2)
➔ TREC Framework Extension
  ➔ Test collection (covering the 3 dimensions)
    ➔ ≥ 25 Topics
    ➔ Corpus
    ➔ Gradual qrels
    ➔ + geographic resources
➔ About qrels …
  ➔ Relevance (doc, topic) ∈ {0;1;2;3;4}
  ➔ Principle: ''the more satisfied dimensions there are, the better it is''
  ➔ Relevance scale (figure), from lowest to highest
    ➔ No dimension
    ➔ 3 dimensions, e.g. Topic: ''trip around Glasgow'', Doc: trip + Bob born in Dumbarton
    ➔ 3 dimensions + global = satisfied topic
➔ Gradual qrels aware measure: Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002]
  ➔ By topic: NDCG for each topic
  ➔ Global: mean NDCG for the system
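A minimal sketch of NDCG per topic and mean NDCG per system, using graded judgments in {0;1;2;3;4} as gains. It assumes the common log2(rank + 1) discount; other variants of the discount exist, and the slides do not say which one was used. Function names and the sample judgments are hypothetical.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed variant)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranking: List[str], qrels: Dict[str, int]) -> float:
    """NDCG for one topic: DCG of the run divided by DCG of the ideal ranking."""
    gains = [qrels.get(doc, 0) for doc in ranking]   # unjudged documents count as 0
    ideal_dcg = dcg(sorted(qrels.values(), reverse=True))
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(
    runs: Dict[str, List[str]], all_qrels: Dict[str, Dict[str, int]]
) -> float:
    """Global score for a system: mean of the per-topic NDCG values."""
    return sum(
        ndcg(ranking, all_qrels.get(topic, {})) for topic, ranking in runs.items()
    ) / len(runs)


# Hypothetical judgments for one topic on the 0..4 scale.
qrels = {"d1": 4, "d2": 0, "d3": 3}
perfect = ndcg(["d1", "d3", "d2"], qrels)   # documents in ideal order
worse = ndcg(["d2", "d1", "d3"], qrels)     # non-relevant document ranked first
```

Because the gain is the relevance grade itself, a document satisfying more dimensions contributes more to the score, and ranking it lower is penalized by the discount.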
4. Experiments – Case Study with PIV GIRS
Case Study: PIV System
➔ Indexing: 1 index per dimension
  ➔ Topical = Terrier IRS [Ounis et al., 2005]
  ➔ Spatial = map segmentation into tiles
  ➔ Temporal = timeline segmentation into tiles
➔ Retrieval
  ➔ Result document list for each index
  ➔ Results combination with CombMNZ [Fox & Shaw, 1993; Lee, 1997]
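To make the idea of segmentation into tiles concrete, here is a small sketch that maps a coordinate to a cell of a uniform grid and a date to a month-sized slice of the timeline. The tile sizes (1 degree, 1 month), the function names and the approximate coordinates are assumptions for illustration only; the slides do not specify the tiling scheme used by PIV.

```python
import math
from collections import defaultdict
from datetime import date
from typing import Dict, Set, Tuple


def spatial_tile(lat: float, lon: float, tile_deg: float = 1.0) -> Tuple[int, int]:
    """Map a coordinate to the (row, column) of a uniform grid over the map."""
    row = math.floor((lat + 90.0) / tile_deg)
    col = math.floor((lon + 180.0) / tile_deg)
    return (row, col)


def temporal_tile(d: date) -> Tuple[int, int]:
    """Map a date to a (year, month) slice of the timeline."""
    return (d.year, d.month)


# Inverted index from tile to the documents mentioning a place in that tile.
spatial_index: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
spatial_index[spatial_tile(55.94, -4.57)].add("doc_dumbarton")   # approx. Dumbarton

# A query about Glasgow (approx. 55.86 N, 4.25 W) falls into the same tile.
glasgow_tile = spatial_tile(55.86, -4.25)
candidates = spatial_index[glasgow_tile]
```

With tiles, matching a query against a document reduces to comparing tile identifiers, in the same way that topical matching compares terms.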
4. Experiments – Case Study with PIV GIRS
CombMNZ Principle [Fox & Shaw, 1993; Lee, 1997] (figures)
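The CombMNZ principle can be sketched as follows: the fused score of a document is the sum of its normalized scores across the result lists, multiplied by the number of lists that retrieved it. The sketch assumes min-max normalization per list and counts a document as retrieved whenever it appears in a list; function names and scores are hypothetical, and PIV's exact normalization is not stated in the slides.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale the scores of one result list to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_mnz(result_lists: List[Dict[str, float]]) -> Dict[str, float]:
    """CombMNZ: (sum of normalized scores) x (number of lists containing the doc)."""
    normalized = [min_max_normalize(r) for r in result_lists]
    fused: Dict[str, float] = {}
    for doc in set().union(*normalized):
        present = [r[doc] for r in normalized if doc in r]
        fused[doc] = sum(present) * len(present)
    return fused


# Hypothetical result lists, one per index.
topical = {"d1": 2.0, "d2": 1.0, "d3": 0.5}
spatial = {"d1": 0.9, "d3": 0.3}
temporal = {"d1": 0.4, "d2": 0.8}
fused = comb_mnz([topical, spatial, temporal])
ranked = sorted(fused, key=fused.get, reverse=True)
```

Unlike filtering, a document missing from one list is not discarded; it simply receives a smaller multiplier, so documents retrieved along several dimensions are pushed toward the top.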
4. Experiments – Case Study with PIV GIRS
Case Study: MIDR_2010 Collection
➔ 31 topics
➔ 5645 documents (= paragraphs)
➔ Map for tracking spatial information
➔ Building qrels: 12 volunteers (thanks!!!)
  ➔ Relevance judgments ∈ {0;1;2;3;4}
4. Experiments – Hypothesis Validated
Analysis of Collected Data
➔ IRS Evaluation
  ➔ ResultsList × Qrels → trec_eval → NDCG
➔ Results: geographic IRS most effective (figure)
  ➔ Hypothesis validated
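trec_eval reads the results list and the qrels from whitespace-separated text files. The sketch below writes both in the standard TREC layouts (`topic Q0 docno rank score tag` for a run, `topic 0 docno relevance` for qrels). The helper names, topic id, document ids and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def format_run(topic_id: str, ranked: List[Tuple[str, float]], tag: str = "run") -> List[str]:
    """One run line per retrieved document: topic Q0 docno rank score tag.

    `ranked` is expected to be sorted by decreasing score.
    """
    return [
        f"{topic_id} Q0 {docno} {rank} {score:.4f} {tag}"
        for rank, (docno, score) in enumerate(ranked, start=1)
    ]


def format_qrels(topic_id: str, judgments: Dict[str, int]) -> List[str]:
    """One qrels line per judged document: topic 0 docno relevance."""
    return [f"{topic_id} 0 {docno} {rel}" for docno, rel in judgments.items()]


run_lines = format_run("1", [("d1", 6.0), ("d2", 2.5)], tag="girs")
qrels_lines = format_qrels("1", {"d1": 4, "d2": 0})
```

Graded judgments fit directly in the last qrels column, which is what allows NDCG to be computed from the {0;1;2;3;4} scale.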
Evaluation Framework for Geographic IR Systems
Conclusions and Future Works (1/2)
➔ Evaluation Framework for Geographic IR Systems
  ➔ Reusable
  ➔ Generalizable for more dimensions: confidence, freshness, ... [Costa Pereira et al., 2009]
  ➔ No gradual relevance per dimension
➔ Case Study with PIV System
  ➔ Creation of a specific test collection (≥ 25 topics)
  ➔ French test collection
  ➔ Limited collection (number of documents)
Evaluation Framework for Geographic IR Systems
Conclusions and Future Works (2/2)
➔ Hypothesis Validated
  ➔ The 3 dimensions improve IR (+66.5%)
➔ Future Works
  ➔ More precise analysis: by query
  ➔ Quantify PIV improvements: various index combinations
  ➔ Organize a GIRS evaluation campaign: anyone interested?
Thank you!
Spatial Interface (figures)
Temporal Interface (figures)
Spatial Tiling (figure)