Michael Bendersky, W. Bruce Croft Dept. of Computer Science Univ. of Massachusetts Amherst Amherst,...

Michael Bendersky, W. Bruce Croft
Dept. of Computer Science
Univ. of Massachusetts Amherst
Amherst, MA
SIGIR 2012

Outline
• Motivation
• Query Hypergraphs
• Ranking Documents
• Parameter Estimation
• Evaluation
• Conclusion

Motivation
• Goal: retrieve more relevant documents for users
• Query representations:
  • bag-of-words
  • term dependencies
  • concept dependencies (this paper)

Example
• Query: "Provide information on the use of dogs worldwide for law enforcement purposes."
• bag-of-words: {provide, information, dog, …}
• term dependencies: {(provide, information), (law, enforcement)}
• concept dependencies: {(dog, law enforcement), …}

Example (cont.)
• "Provide information on the use of dogs worldwide for law enforcement purposes."
{provide, information, (law, enforcement)} {(dog, law enforcement)}

Modeling concept dependencies
• Use query hypergraphs
  1. Build linguistic structures over the query, e.g. "members of the rock group nirvana"
  2. Each element of a structure can be represented as a concept

Query Hypergraphs
• Query hypergraph for the query (international art crime)
D: a document
i, a, c: the terms "international", "art", "crime"; ac: the phrase "art crime"
V = {D, i, a, c, ac}
E = {({i},D), ({a},D), ({c},D), ({ac},D), ({i,a,c,ac},D)}
Each element of E is a hyperedge.
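
The vertex and edge sets of this example can be rebuilt mechanically from a list of concepts. The Python sketch below assumes that concepts are plain strings and that the hypergraph holds one single-concept hyperedge per concept plus one hyperedge over all concepts. `build_query_hypergraph` and the `Hyperedge` alias are illustrative names that do not come from the slides.

```python
from typing import FrozenSet, List, Set, Tuple

# A hyperedge joins a set of concept vertices to the document vertex.
Hyperedge = Tuple[FrozenSet[str], str]


def build_query_hypergraph(concepts: List[str], doc: str = "D") -> Tuple[Set[str], List[Hyperedge]]:
    """Return (V, E) for a query hypergraph over the given concepts."""
    vertices: Set[str] = {doc, *concepts}
    # One hyperedge per single concept: ({k}, D).
    edges: List[Hyperedge] = [(frozenset([k]), doc) for k in concepts]
    # One hyperedge over the entire concept set: ({k1, ..., kn}, D).
    edges.append((frozenset(concepts), doc))
    return vertices, edges


# Concepts of the example query: i, a, c (terms) and ac (phrase).
V, E = build_query_hypergraph(["i", "a", "c", "ac"])
```

With four concepts this yields five hyperedges, matching the set E listed on the slide.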

Query Hypergraph Induction
• Three types of structures
  • query term structure: individual query words
  • phrase structure: bigrams (word order matters)
  • proximity structure: arbitrary subsets of query terms
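
To make the three structure types concrete, here is a small Python sketch that derives them from a tokenized query. It restricts the proximity structure to unordered pairs, which is a simplifying assumption (the slide allows arbitrary subsets), and `induce_structures` is a hypothetical helper name.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def induce_structures(query_terms: List[str]) -> Dict[str, List[Tuple[str, ...]]]:
    """Derive term, phrase, and proximity concepts from a tokenized query."""
    # Query term structure: every individual query word.
    terms = [(t,) for t in query_terms]
    # Phrase structure: adjacent word pairs, with word order preserved.
    phrases = list(zip(query_terms, query_terms[1:]))
    # Proximity structure: unordered subsets of terms.
    # Assumption: only pairs are generated here to keep the sketch small.
    proximity = [tuple(sorted(pair)) for pair in combinations(query_terms, 2)]
    return {"terms": terms, "phrases": phrases, "proximity": proximity}


structures = induce_structures(["international", "art", "crime"])
```

Sorting each proximity pair is one way to express that order is ignored there, while the phrase tuples keep the original word order.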

Hyperedges
• Local hyperedges: $(\{k\}, D)$, one for each $k \in K^Q$
• Global hyperedge: $(K^Q, D)$
$k$: a concept
$K^Q$: the set of query concepts

Ranking Documents
• relevance score
Q: a query
D: a document
e: a hyperedge
E: the set of hyperedges
Factor: $\phi_e(k_e, D)$

Local Factors
$\lambda(k)$: the importance weight of the concept $k$
$f(k, D)$: a matching function between the concept $k$ and the document $D$

Matching Function
$f(k, D) = \log \frac{tf(k, D) + \mu \frac{tf(k, C)}{|C|}}{\mu + |D|}$
$C$: the collection
$|D|$: the number of terms in the document
$|C|$: the number of terms in the collection
$\mu$: the Dirichlet smoothing parameter
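
The matching function is a Dirichlet-smoothed log estimate, and it translates into a few lines of Python. In this sketch the function name, the argument names, and the default `mu=2500.0` are illustrative assumptions. The guard for a zero estimate is also an addition for safety and is not something the slide specifies.

```python
import math


def matching_score(tf_doc: float, doc_len: int, tf_coll: float, coll_len: int, mu: float = 2500.0) -> float:
    """Dirichlet-smoothed log matching score of one concept against one document.

    tf_doc   -- occurrences of the concept in the document
    doc_len  -- number of terms in the document
    tf_coll  -- occurrences of the concept in the collection
    coll_len -- number of terms in the collection
    mu       -- Dirichlet smoothing parameter (the default is only illustrative)
    """
    smoothed = (tf_doc + mu * tf_coll / coll_len) / (mu + doc_len)
    # Assumption: a concept never seen anywhere gets -inf instead of raising.
    return math.log(smoothed) if smoothed > 0 else float("-inf")
```

Because of the smoothing term, a concept that is absent from a document still receives a finite score as long as it occurs somewhere in the collection.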

Global Factor
• considers the dependencies among the entire set of query concepts
• $\pi_D$: the highest-scoring passage in the document
• The dependency range is much longer for concept dependencies.
• $\lambda(k, K^Q)$: the importance weight of the concept $k$ in the context of the entire set of query concepts $K^Q$ (with the concept matched in the passage $\pi_D$)
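
The interplay of local and global factors can be sketched in Python. The sketch rests on several assumptions that are not stated on the slides: each factor contributes its weight times a Dirichlet-smoothed log matching score, passages are fixed-width half-overlapping windows, phrases are matched exactly and in order, and unseen concepts get a small pseudo-count. All function and variable names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Concept = Tuple[str, ...]  # a single term or an ordered phrase


def tf(concept: Concept, tokens: Sequence[str]) -> int:
    """Count exact, in-order occurrences of the concept in a token sequence."""
    n = len(concept)
    return sum(1 for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == concept)


def match(concept: Concept, tokens: Sequence[str], coll_tf: Dict[Concept, int],
          coll_len: int, mu: float) -> float:
    """Dirichlet-smoothed log matching score of a concept against a text."""
    # Assumption: unseen concepts get a pseudo-count of 0.5 to keep the log finite.
    background = max(coll_tf.get(concept, 0), 0.5) / coll_len
    return math.log((tf(concept, tokens) + mu * background) / (mu + len(tokens)))


def score(doc_tokens: List[str], local_w: Dict[Concept, float], global_w: Dict[Concept, float],
          coll_tf: Dict[Concept, int], coll_len: int, mu: float = 10.0, passage_len: int = 10) -> float:
    # Local factors: each concept is matched against the whole document.
    local = sum(w * match(k, doc_tokens, coll_tf, coll_len, mu) for k, w in local_w.items())
    # Global factor: all concepts are matched against the same passage,
    # and only the highest-scoring passage counts.
    step = max(1, passage_len // 2)
    passages = [doc_tokens[s:s + passage_len] for s in range(0, max(1, len(doc_tokens)), step)]
    best_passage = max(
        sum(w * match(k, p, coll_tf, coll_len, mu) for k, w in global_w.items())
        for p in passages
    )
    return local + best_passage


# Toy data: the same two concepts, once close together and once far apart.
dog, police = ("dog",), ("law", "enforcement")
coll_tf = {dog: 10, police: 10}
weights = {dog: 1.0, police: 1.0}
filler = ["w"] * 40
near = ["dog", "helps", "law", "enforcement"] + filler
far = ["dog", "helps"] + filler + ["law", "enforcement"]
near_score = score(near, weights, weights, coll_tf, 100_000)
far_score = score(far, weights, weights, coll_tf, 100_000)
```

Both toy documents have identical term counts and lengths, so their local parts are equal. Only the passage-level part separates them, and it favors the document in which the two concepts fall inside one passage.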

Example
{(dog, law enforcement)}
The two concepts may not appear in the same sentence, but they co-occur in a larger text passage.

Query Hypergraph Parameterization
• Goal: parameterize the concept weights, local $\lambda(k)$ and global $\lambda(k, K^Q)$
• Parameterization by structure
• Parameterization by concept

Parameterization By Structure
$\sigma$: a structure

Parameterization By Concept
• parameterize the concept weights based on the concepts themselves
[Formula annotations: concept importance feature; estimation]
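
One way to picture concept-based parameterization is a weight computed from importance features of the concept itself. The Python sketch below assumes a linear combination of features. The linear form, the two example features, the document frequencies, and the feature weights are all illustrative assumptions and are not values from the slides.

```python
import math
from typing import Callable, Dict, List, Tuple

Concept = Tuple[str, ...]
Feature = Callable[[Concept], float]


def concept_weight(concept: Concept, features: List[Feature], feature_weights: List[float]) -> float:
    """Weight of a concept as a weighted sum of its importance features."""
    return sum(w * phi(concept) for phi, w in zip(features, feature_weights))


# Hypothetical document frequencies in a 1,000-document collection.
doc_freq: Dict[Concept, int] = {("dog",): 100, ("law", "enforcement"): 10}
features: List[Feature] = [
    lambda k: 1.0,                           # constant bias feature
    lambda k: math.log(1000 / doc_freq[k]),  # rarer concepts count as more important
]
feature_weights = [0.5, 0.1]  # the quantities that estimation would tune

w_dog = concept_weight(("dog",), features, feature_weights)
w_police = concept_weight(("law", "enforcement"), features, feature_weights)
```

Under this scheme the number of free parameters depends on the number of features, not on the number of distinct concepts, so weights carry over to concepts that never appeared in training queries.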

Parameter Estimation
• optimize a target metric (mean average precision)
• rely on a large collection
• use the coordinate ascent algorithm, a coordinate-level hill-climbing search
• repeatedly cycle through each of the parameters $\lambda(\cdot)$ in turn, holding all other parameters fixed
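
The search procedure described above can be sketched generically in Python. This is a minimal fixed-step variant under stated assumptions: the step size, the sweep limit, and the toy objective are illustrative, and in the setting of the slides the objective would be mean average precision computed over training queries.

```python
from typing import Callable, List, Sequence, Tuple


def coordinate_ascent(objective: Callable[[List[float]], float], params: Sequence[float],
                      step: float = 0.1, max_sweeps: int = 50) -> Tuple[List[float], float]:
    """Maximize `objective` by changing one parameter at a time."""
    params = list(params)
    best = objective(params)
    for _ in range(max_sweeps):
        improved = False
        # Cycle through the coordinates, holding all other parameters fixed.
        for i in range(len(params)):
            for delta in (step, -step):
                candidate = params.copy()
                candidate[i] += delta
                value = objective(candidate)
                if value > best:  # keep a move only if the metric improves
                    params, best, improved = candidate, value, True
        if not improved:
            break  # no single-coordinate move helps any more
    return params, best


# Toy stand-in for a retrieval metric, with its maximum at (1.0, -0.5).
toy_objective = lambda p: -((p[0] - 1.0) ** 2) - ((p[1] + 0.5) ** 2)
best_params, best_value = coordinate_ascent(toy_objective, [0.0, 0.0])
```

The search only needs to evaluate the objective, never its gradient, which is why it suits rank-based metrics that are not differentiable.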

Parameter Estimation (cont.)
1. Optimize the local component (the weights $\lambda(k)$)
2. Retrieve the top thousand documents
3. Optimize the global component (the weights $\lambda(k, K^Q)$)

Parameter Estimation (cont.)
(Robust04 collection)

Evaluation (testing)
• search engine: Indri
• test collections
• queries

Evaluation (evaluation metrics)
• MAP (mean average precision)
  e.g. Topic 1: 3 relevant documents, retrieved at ranks 1, 3 and 5: AP = (1/1 + 2/3 + 3/5)/3 ≈ 0.756
• ERR@k (expected reciprocal rank, k = 20)
  $\mathrm{ERR@}k = \sum_{i=1}^{k} \frac{1}{i} R(g_i) \prod_{j=1}^{i-1} (1 - R(g_j))$
  $g \in \{0, 1, 2, 3, 4\}$: the relevance grade
  $R(g) = (2^g - 1)/16$
  $R(g_i)$: the user is satisfied by the document at rank $i$
  $\prod_{j=1}^{i-1} (1 - R(g_j))$: the user was not satisfied by the previous documents (ranks 1 to $i-1$)
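
Both metrics are short enough to check by hand against code. The Python sketch below follows the definitions on this slide, and the function names and the example grade list are illustrative.

```python
from typing import Iterable, Sequence


def average_precision(relevant_ranks: Iterable[int], num_relevant: int) -> float:
    """Average precision from the 1-based ranks of the retrieved relevant documents."""
    ranks = sorted(relevant_ranks)
    # Precision at the i-th relevant document is i divided by its rank.
    return sum((i + 1) / r for i, r in enumerate(ranks)) / num_relevant


def err_at_k(grades: Sequence[int], k: int = 20, max_grade: int = 4) -> float:
    """Expected reciprocal rank over the top k graded results."""
    err, p_continue = 0.0, 1.0
    for rank, g in enumerate(grades[:k], start=1):
        r = (2 ** g - 1) / 2 ** max_grade  # R(g): chance this document satisfies the user
        err += p_continue * r / rank       # user reached this rank and stops here
        p_continue *= 1 - r                # user was not satisfied and keeps reading
    return err


ap = average_precision([1, 3, 5], num_relevant=3)  # the Topic 1 example
err = err_at_k([4, 0, 2])                          # hypothetical grades for ranks 1 to 3
```

MAP is the mean of `average_precision` over all topics. Dividing by the total number of relevant documents, not just the retrieved ones, penalizes relevant documents that never appear in the ranking.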

Evaluation (retrieval performance)

Conclusion
• models arbitrary term dependencies as concepts
• uses passage-level evidence to model the dependencies between the concepts
• assigns weights to both concepts and concept dependencies
• The proposed retrieval framework improves retrieval effectiveness for verbose natural language queries.