Divided Pretreatment to Targets and Intentions for Query Recommendation
Reporter: Yangyang Kang
2012-11-04



Outline
- Introduction
- Related Work
- Approach
- Experiment
- Conclusion and Future Work

Introduction
- Recommend identical or related queries by understanding the user's query intention
- Main forms of current research
  - Relevant or pseudo-relevant text contents
  - Click or browsing retrieval behaviors
- Current work ignores analyzing intention from the query itself
  - Sparse information
  - Fuzzy structural relation
  - Optional component form

Definition
- Query Target: the target entity, behavior or status
- Query Intention: the operations of intention, or motivations of retrieval, imposed on the target
- Example: "Where is Soochow University?"
  - Target: Soochow University
  - Intention: Where is

Related Work
- Document-based approach
  - Three categories: global document, local document, manually edited corpus (Nick et al., 2007; Yanan Li et al., 2010)
- Log-based approach
  - Two categories: session-based approach, click-based approach (Mei et al., 2008; Cao et al., 2008)

Approach
- Offline system
  - Target & Intention Recognition
  - Intention Cluster
- Online system
  - Query Recommendation

Target & Intention Recognition: Query Preprocessing
- Baeza-Yates divides queries into informational, not informational and ambiguous
- We focus on informational queries, using the following rules
  - Short queries of less than two words
  - Titles of news or notices
- Classifier
  - Naive Bayes
  - 10-fold cross-validation

Target & Intention Recognition: Feature Selection
- Lexical-based perspective
- Context-based perspective

Intention Cluster: Intention Vector
- A collection of intention words
- Used to build the associated network of intentions

Intention Cluster: Intention Similarity Calculation
- Vector Space Model + cosine similarity
  - Weight values: TF-IDF
- Language Model + KL divergence
  - Normalize and smooth

Query Recommendation
- Given a query:
  - Recognize the target and intention of the query
  - Mine the targets that co-occur with the intention in the global samples
  - Build the description of the intention vector
  - Measure the similarity with the prior intention clusters
  - Choose the most similar cluster as the candidate intention set, according to the similarity ranking
  - Combine the candidate intention words with the target to form new queries

Corpus
- Sogou2008 query logs
  - 1,902,402 informational queries; queries with the same clicked URLs are grouped
- Classification experiment
  - 968 group queries with an average of 5
  - Human labeling (three volunteers + cross-validation)
- Cluster experiment
  - 1,981 intention words selected randomly
  - Human labeling (three volunteers + cross-validation)
- Recommendation experiment
  - 2,000 queries selected randomly
  - Divided equally into six sample spaces
  - Six volunteers

Evaluation Method
- Classification experiment
  - Precision, recall, F-value and global accuracy
  - Statistically, the ratio of target words to intention words is close to 2:1
- Cluster experiment
  - Precision, macro-average
- Recommendation experiment
  - Global precision (G-P)
  - Consistent precision (C-P)
  - Relevant precision (R-P)

Results & Analysis: Classification experiment
- Sys-1: lexical-based approach
- Sys-2: context-based approach
- Sys-3: combination of the two approaches
- Baseline1: assume all words in the query are targets
- Baseline2: assume all words in the query are intentions

Results & Analysis: Cluster experiment
- Sys-VSM: Vector Space Model
- Sys-KL: Language Model

Results & Analysis: Recommendation experiment

Conclusion & Future Work
- Conclusion
  - Propose a new QR method which concentrates on the query itself
  - Recognize targets and intentions by classification
  - Obtain synonymous or related intention words for recommendation
- Future work
  - Focus on the feature selection method to enhance the intention description
  - Use a semantic intention matching algorithm
  - Divide entity roles and employ them in the similarity matching process
  - Analyze the combination of target and intention words to form a fluent and logical query

Thank you! Q & A
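
The intention similarity calculation and the last recommendation step described in the Approach part can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each intention is described by the bag of targets it co-occurs with, uses a smoothed IDF and add-alpha smoothing as stand-ins for the unspecified weighting and smoothing, and ranks individual intentions rather than clusters. The toy data and all function names (tfidf_vectors, cosine, smoothed_distribution, kl_divergence, recommend) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: intention -> targets it co-occurs with in the logs.
intentions = {
    "where is": ["soochow university", "railway station", "library"],
    "location of": ["soochow university", "library", "airport"],
    "price of": ["laptop", "ticket", "phone"],
}


def tfidf_vectors(docs):
    """Map each intention to a sparse {term: TF-IDF weight} vector."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
        vectors[name] = {
            t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def smoothed_distribution(tokens, vocab, alpha=1.0):
    """Normalized unigram distribution over vocab with add-alpha smoothing."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {t: (counts[t] + alpha) / total for t in vocab}


def kl_divergence(p, q):
    """KL(p || q); smaller means the two intentions are more alike."""
    return sum(pv * math.log(pv / q[t]) for t, pv in p.items())


def recommend(target, intention, vectors, top_k=1):
    """Rank the other intentions by cosine similarity and join them with the target."""
    ranked = sorted(
        (name for name in vectors if name != intention),
        key=lambda name: cosine(vectors[intention], vectors[name]),
        reverse=True,
    )
    return [f"{name} {target}" for name in ranked[:top_k]]


vectors = tfidf_vectors(intentions)
vocab = sorted({t for tokens in intentions.values() for t in tokens})
dists = {k: smoothed_distribution(v, vocab) for k, v in intentions.items()}

print(recommend("soochow university", "where is", vectors))
```

With this toy data, "where is" and "location of" share two targets, so they are close under both measures, while "price of" shares none. Smoothing matters for the KL variant: without it, a target unseen for one intention would give a zero probability and an undefined logarithm.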