domain dependent query reformulation for web search date : 2013/06/17 author : van dang, giridhar...
![Page 1: DOMAIN DEPENDENT QUERY REFORMULATION FOR WEB SEARCH Date : 2013/06/17 Author : Van Dang, Giridhar Kumaran, Adam Troy Source : CIKM’12 Advisor : Dr. Jia-Ling](https://reader030.vdocuments.us/reader030/viewer/2022032721/56649cd85503460f949a1399/html5/thumbnails/1.jpg)
DOMAIN DEPENDENT QUERY REFORMULATION FOR WEB SEARCH
Date : 2013/06/17
Author : Van Dang, Giridhar Kumaran, Adam Troy
Source : CIKM’12
Advisor : Dr. Jia-Ling Koh
Speaker : Yi-Hsuan Yeh
Outline
Introduction Method Experiment Conclusions
Introduction (1/5)
Query reformulation techniques aim to modify user queries to make them more effective for retrieval.
Identify candidate terms that are similar to the words in the query, then expand the query with them or substitute them for the original query terms.
Introduction (2/5)
Existing work :
− Spelling correction
− Stemming expands the query with morphological variants
− Look for candidates that are semantically related to the query terms
However, none of these techniques treats the task as domain dependent.
Introduction (3/5)- Motivation
In the domain of commerce queries, however, expanding the query from ‘flat screen tv’ to ‘flat screen tv television’ drastically hurts NDCG.
Introduction (4/5)- Goal
Provide different candidate terms for queries in different domains.
Building a separate reformulation system for each domain provides substantially better retrieval effectiveness than having a single system handle all queries.
Introduction (5/5)- Framework
1. Query logs
2. Pseudo parallel corpus
3. Translation model
4. Filter out bad candidates (classifier)
5. Reformulation model (domain-specific)
6. New queries
Outline
Introduction
Method
− Pseudo parallel corpus
− Translation model
− Reformulation model
− Candidate query generation
Experiment
Conclusions
Pseudo parallel corpus
Parallel pair : <Query, Translation>
1. Consecutive query pairs
2. Query and clicked document title
3. Query suggestion via Random Walk (two)
[Figure: queries q1 and q2 and document D1]
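The first two pair sources can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `LogEntry` record, its field names, and the sample log are hypothetical, and the random-walk source is left out because the slide gives no detail on it.

```python
from collections import namedtuple

# Hypothetical log record: one row per query issued in a session.
LogEntry = namedtuple("LogEntry", ["session_id", "query", "clicked_title"])


def build_parallel_pairs(log):
    """Return <query, translation> pairs from consecutive queries and clicked titles."""
    pairs = []
    previous = None
    for entry in log:
        # Source 1: consecutive, distinct queries within the same session.
        if (previous is not None
                and previous.session_id == entry.session_id
                and previous.query != entry.query):
            pairs.append((previous.query, entry.query))
        # Source 2: the query paired with the title of a clicked document.
        if entry.clicked_title:
            pairs.append((entry.query, entry.clicked_title))
        previous = entry
    return pairs


# Made-up sample log, for illustration only.
log = [
    LogEntry("s1", "flat screen tv", "Flat Screen Televisions on Sale"),
    LogEntry("s1", "lcd tv", None),
    LogEntry("s2", "flu symptoms", None),
]
pairs = build_parallel_pairs(log)
for source, translation in pairs:
    print(source, "->", translation)
```

Queries from different sessions are never paired, which keeps unrelated searches out of the corpus.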
Translation model (1/3)
Aim to provide multiple semantically related candidates as “translations” for any given query term.
The translation probability indicates similarity between the term and candidate.
Translation model (2/3)
Expectation Maximization algorithm
Identify the most probable alignments for all parallel pairs.
S : source sentence
T : correct translation
a = {a1, a2, …, am} : an alignment between S and T
Tr(tj | s(aj)) : translation probability
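The slide names EM, but its formula did not survive extraction. A common EM-trained word translation model is IBM Model 1, so the sketch below assumes that model. The function name `train_ibm_model1`, the whitespace tokenisation, the toy pairs, and the omission of a NULL source word are all simplifications made for illustration.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate Tr(t | s) with EM in the style of IBM Model 1 (no NULL word)."""
    tokenized = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {t for _, tgt in tokenized for t in tgt}
    uniform = 1.0 / len(target_vocab)
    tr = defaultdict(lambda: uniform)  # tr[(t, s)] = Tr(t | s)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of s translating to t
        total = defaultdict(float)  # expected count of s translating to anything
        for src, tgt in tokenized:
            for t in tgt:
                # E-step: split one unit of mass for t across the source words.
                norm = sum(tr[(t, s)] for s in src)
                for s in src:
                    frac = tr[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalise expected counts into probabilities.
        tr = defaultdict(float, {(t, s): c / total[s] for (t, s), c in count.items()})
    return tr


# Toy parallel pairs, made up for illustration.
pairs = [("tv", "television"), ("tv stand", "television stand"), ("stand", "stand")]
tr = train_ibm_model1(pairs)
print(tr[("television", "tv")], tr[("stand", "tv")])
```

On this toy corpus the probability mass for `tv` shifts toward `television` with each iteration, because the single-word pair `("tv", "television")` disambiguates the two-word pair.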
Translation model (3/3)- Enhanced by learning
Train a boosted decision tree classifier that decides whether a <term, candidate> pair is desirable.
Reformulation model (1/4)
Given a query q = w1, ..., wi−1, wi, wi+1, ..., wn and a candidate s for the query term wi, the reformulation model determines whether or not to accept this candidate.
Consider:
1. How similar s is to wi
2. How well s fits the context of the query
Reformulation model (2/4)
Reformulation model (3/4)
G : L1, L2, R1, R2
C(s) : the set of words that occur in the context of s
Collection : all original and “translation” queries
The probability of seeing w (query term) in the context of s (candidate)
The probability of seeing w in the entire collection
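The two probabilities above can be estimated from counts. The following sketch rests on assumptions: L1, L2, R1, R2 are read as the first and second words to the left and right of s, and the two estimates are combined by linear interpolation with weight `beta`. The slide's own formula is not preserved, so the combination rule, the function names, and the sample queries are illustrative only.

```python
from collections import Counter, defaultdict

# Assumed reading of the four context positions: offsets relative to s.
POSITIONS = {"L2": -2, "L1": -1, "R1": 1, "R2": 2}


def build_context_model(queries):
    """Count words seen at each position around every word, plus collection counts."""
    context = defaultdict(Counter)  # context[(s, position)][w] = co-occurrence count
    collection = Counter()
    for query in queries:
        words = query.split()
        collection.update(words)
        for i, s in enumerate(words):
            for name, offset in POSITIONS.items():
                j = i + offset
                if 0 <= j < len(words):
                    context[(s, name)][words[j]] += 1
    return context, collection


def context_probability(w, s, position, context, collection, beta=0.9):
    """Probability of seeing w at `position` relative to s, smoothed by the collection."""
    seen = context[(s, position)]
    in_context = seen[w] / sum(seen.values()) if seen else 0.0
    in_collection = collection[w] / sum(collection.values())
    # Assumed form: weight beta on the context estimate, the rest on the collection.
    return beta * in_context + (1 - beta) * in_collection


# Made-up queries standing in for the original and "translation" queries.
queries = ["flat screen tv", "flat screen television", "cheap tv stand"]
context, collection = build_context_model(queries)
p = context_probability("screen", "tv", "L1", context, collection)
print(round(p, 4))
```

The collection term keeps the probability above zero for a word that never appeared next to s, which matters when the log is sparse.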
Reformulation model (4/4)- Domain-specific model
Estimated from the domain specific log
Estimated from the generic log
smooth
Candidate query generation
Each candidate sij for the query term wi is accepted if and only if: [acceptance condition not preserved in the transcript; threshold θ = 0.9]
Expand the original queries with these accepted candidates.
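The acceptance condition on this slide was an image and cannot be recovered here. The sketch below assumes the simplest reading, that a candidate is accepted when some reformulation score reaches θ. The `score` callback, both scoring functions, and their values are hypothetical stand-ins for the reformulation model.

```python
def expand_query(query, candidates, score, theta=0.9):
    """Append each accepted candidate right after the query term it reformulates."""
    expanded = []
    for i, term in enumerate(query.split()):
        expanded.append(term)
        for s in candidates.get(term, []):
            # Assumed acceptance test: model score at or above the threshold theta.
            if score(query, i, s) >= theta:
                expanded.append(s)
    return " ".join(expanded)


def generic_score(query, i, s):
    """Hypothetical generic model that likes 'television' for 'tv'."""
    return 0.95


def commerce_score(query, i, s):
    """Hypothetical commerce model that rejects the same candidate."""
    return 0.20


candidates = {"tv": ["television"]}
generic_result = expand_query("flat screen tv", candidates, generic_score)
commerce_result = expand_query("flat screen tv", candidates, commerce_score)
print(generic_result)
print(commerce_result)
```

With these made-up scores the generic model produces 'flat screen tv television', the expansion the motivation slide reports as harmful for commerce queries, while the commerce model leaves the query unchanged.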
Outline
Introduction Method Experiment Conclusions
Experiment (1/6)
Query set
− Web query log
− Health
− Commerce
Reformulation approach
− Baseline: noalter (without reformulation)
− Generic (domain independent)
− Health
− Commerce
Model parameters: λ = 0.9, β = 0.9 and θ = 0.9.
Experiment (2/6)- Translation model filtering: effectiveness
Experiment (3/6)- Generic vs. domain specific
Experiment (4/6)- Reason for improvement
1. Eliminate ineffective general candidates provided by the generic model. (Reject)
2. The domain system provides additional domain-specific candidates. (Add)
Experiment (5/6)- Domain-specific training data
Experiment (6/6)- Failure analysis
1. The domain model misses several good candidates.
2. The domain system introduces bad candidates that are not provided by the generic system.
Outline
Introduction Method Experiment Conclusions
Conclusions (1/2)
Demonstrate the advantage of domain dependent query reformulation over the domain independent approach.
Using the same reformulation technique, both reformulation systems for the health and commerce domains outperform the generic system that learns from the same log.
Conclusions (2/2)- Future work
1. Higher-order contextual n-gram models
2. Consider the relationship among candidates
3. Train the boosted decision tree with various features