
TRANSCRIPT

Entity-Centric Document Filtering: Boosting Feature Mapping through Meta-Features

Mianwei Zhou, Kevin Chen-Chuan Chang
University of Illinois at Urbana-Champaign
Unifying Learning to Rank and Domain Adaptation -- Enabling Cross-Task Document Scoring

Document Scorer Learning with Queries/Domains

Spam Detection, Information Retrieval, Sentiment Analysis

With the development of Web 2.0, document scorers must handle various user needs (queries) and tackle diverse sources of data (domains).

Document Scorer Learning for Different Queries: Learning to Rank

Example application: Entity-Centric Filtering (TREC-KBA 2012).
Difficulty: long Wikipedia pages serve as queries, and the queries contain noisy keywords.

Training phase: a Wiki page with documents labeled relevant or irrelevant.
Testing phase: a new Wiki page whose documents are unlabeled.

Document Scorer Learning for Different Domains: Domain Adaptation

Example application: Cross-Domain Sentiment Analysis (Blitzer 2006).
Difficulty: different domains use different sentiment keywords.
Training phase (book reviews): "It is a very interesting book." "I do not like this book, it is very boring ..."
Testing phase (kitchen appliance reviews): "Do not buy this juice extractor, it is leaking ..." "This coffee maker has high quality!"

Learning to Rank vs. Domain Adaptation

In both settings, the important keywords differ between the training and testing phases:
- Learning to rank: training on one entity (e.g., Bill Gates: "Microsoft", "talk") but testing on another (e.g., Chicago Bulls: "basketball", "training").
- Domain adaptation: training on book reviews ("boring", "interesting") but testing on kitchen appliance reviews ("ugly", "leaking").

Common challenge: keyword importance changes across the training and testing phases. The two problems are connected.

Problem: Cross-Task Document Scoring

Cross-task document scoring unifies learning to rank and domain adaptation:
1. Task: a query (learning to rank) or a domain (domain adaptation).
2. Cross-task: the training and testing phases tackle different tasks.

Challenge: Document Scoring across Different Tasks

Document Scoring Principle

Given a query and a document, decide: relevant or not?

Q1: Which keywords are important for the query?
Q2: How are the keywords contained in the document?
The relevance of a document depends on how it contains keywords that are important for the query.

Example: for the query "Bill Gates", keywords include Microsoft, talk, Tuesday.

Requirement of Traditional Learning to Rank: Manually Fulfill the Principle in Feature Design

Learning to rank models: RankSVM, RankBoost, LambdaMART.
Abstraction: hand-designed features such as BM25, language model, and vector space scores are combined into an output score that estimates document relevance.

A natural idea is to rely on learning to rank frameworks to learn automatically how to fulfill the principle. Surprisingly, traditional learning to rank frameworks fail to do so; instead, they require feature designers to fulfill the principle manually in the design of each feature.
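The shared abstraction can be sketched as a pairwise RankSVM-style learner: each document is a small vector of hand-designed feature scores, and the model only learns how to weight them. The feature vectors (here, a BM25 score and a language-model score) and the training pairs are illustrative, not from the paper:

```python
# Minimal pairwise learning-to-rank sketch (RankSVM-style hinge loss, SGD).
# The model weights pre-designed features; it cannot redesign them.

def train_pairwise(pairs, dim, lr=0.1, epochs=100):
    """pairs: list of (features_of_relevant_doc, features_of_irrelevant_doc)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for pos, neg in pairs:
            margin = sum(wi * (p - n) for wi, p, n in zip(w, pos, neg))
            if margin < 1.0:  # hinge loss active: push the pair further apart
                w = [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return w

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data: feature vector = [BM25 score, language-model score] (illustrative)
pairs = [([2.0, 0.9], [0.5, 0.3]), ([1.5, 0.8], [0.7, 0.4])]
w = train_pairwise(pairs, dim=2)
```

The learner only adjusts the weights on BM25-like features; whether those features fulfill the scoring principle remains the designer's burden.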

Difficult to Manually Fulfill the Principle for Noisy Queries and Complex Documents

Traditional features such as BM25 and the language model answer the two questions simplistically:
Q1: Which keywords are important? Answer: those with high IDF.
Q2: How are the keywords contained? Answer: measured by TF.

For a noisy query such as the Wikipedia page of Bill Gates (keywords: Microsoft, talk, Tuesday, ...), IDF and TF alone are insufficient.
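To make the simplistic answers concrete, here is a minimal sketch of one common BM25 variant, where the only notion of keyword importance is IDF and the only notion of containment is TF; the toy document and corpus statistics are illustrative:

```python
import math

def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_len, k1=1.2, b=0.75):
    """BM25: Q1 (importance) answered only by IDF, Q2 (containment) only by TF."""
    score = 0.0
    doc_len = len(doc_terms)
    for term in query_terms:
        df = doc_freq.get(term, 0)
        if df == 0:
            continue
        idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1)  # Q1: high IDF = important
        tf = doc_terms.count(term)                               # Q2: term frequency
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return score

# Illustrative toy document and corpus statistics (not from the paper)
doc = "bill gates gave a talk at microsoft on tuesday".split()
dfs = {"microsoft": 20, "talk": 300, "tuesday": 500}
s = bm25_score(["microsoft", "talk", "tuesday"], doc, dfs, num_docs=1000, avg_len=10)
```

Notice that a noisy but rare query keyword would still be rewarded by IDF, and any richer signal (position, structure, part of speech) has no place in the formula.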

Once we want to incorporate more information to address these questions, manually designing scorers becomes impractical.

Limitation of Traditional Learning to Rank: Leaving a Heavy Burden to Feature Designers

Learning to rank models: RankSVM, RankBoost, LambdaMART.

These models leave the burden of fulfilling the principle to feature designers.

Proposal: Feature Decoupling -- Towards Facilitating the Feature Design

The difficulty of traditional feature design lies in the fact that designers must manually address both questions in each feature. If we decouple the feature design, so that each question is addressed by one type of feature at a time, the features are largely simplified.

Feature Decoupling for Entity-Centric Document Filtering

Example keywords: Microsoft, talk, Tuesday.

Meta-features (Q1: which keywords are important?):
- General: IDF, IsNoun, InEntity, ...
- Structural: PositionInPage, InInfobox, InOpenPara, ...

Intra-features (Q2: how are the keywords contained?):
- Different positions: TFInURL, TFInTitle, ...
- Different representations: LogTF, NormalizedTF, ...

Feature Decoupling for Cross-Domain Sentiment Analysis

Intra-features:
- Different positions: TFInURL, TFInTitle, ...
- Different representations: LogTF, NormalizedTF, ...

Meta-features: co-occurrence with pivot keywords (Good, Bad, Boring, Interesting, Tedious, High Quality, Leaking, Broken).

To learn a ranker given decoupled features, the model should: 1. Recouple features.

By the document scoring principle, the relevance of a document depends on how it contains keywords that are important for the query. Document relevance is therefore the sum of keyword contributions, where each keyword's contribution recouples its intra-features with its meta-features.

To learn a ranker given decoupled features, the model should: 1. Recouple features; 2. Be noise-aware.
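A minimal sketch of what the decoupled features might look like for entity-centric filtering. The feature names (IDF, InEntity, InOpenPara, TFInTitle, LogTF) follow the slides; the implementations and the toy entity page and document are illustrative assumptions, not the paper's extractors:

```python
import math

def meta_features(keyword, entity_page, idf_table):
    """Q1: which keywords are important? Computed from the query side only."""
    return {
        "IDF": idf_table.get(keyword, 0.0),
        "InEntity": float(keyword in entity_page["title"].lower()),
        "InOpenPara": float(keyword in entity_page["opening"].lower()),
    }

def intra_features(keyword, doc):
    """Q2: how is the keyword contained? Computed from the document side only."""
    tf = doc["body"].lower().split().count(keyword)
    return {
        "TFInTitle": float(doc["title"].lower().split().count(keyword)),
        "LogTF": math.log(1 + tf),
    }

# Toy entity page and candidate document (illustrative)
page = {"title": "Bill Gates", "opening": "Bill Gates co-founded Microsoft."}
doc = {"title": "Microsoft event", "body": "Bill Gates gave a talk at Microsoft on Tuesday"}
mf = meta_features("microsoft", page, {"microsoft": 3.2, "talk": 1.1})
itf = intra_features("microsoft", doc)
```

Each function answers exactly one of the two questions, which is the simplification decoupling buys: a designer never has to combine importance and containment inside a single feature.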

Queries contain noisy keywords: for example, "list", "Mexico", and "Jeff" in an entity's Wikipedia page.

Requirement for Noise-Aware Recoupling: Inferred Sparsity

A query and a document may overlap on many keywords, most of them noisy; the model should infer that the contributions of noisy keywords are exactly zero.

To Achieve Inferred Sparsity: A Two-Layer Scoring Model

A first-layer keyword classifier decides, from the meta-features, whether each keyword is important: it enforces inferred sparsity for noisy keywords and computes contributions for the important keywords.

Realizing such a two-layer scoring model is non-trivial.
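One way to picture the two-layer idea is a logistic keyword classifier over meta-features whose output gates a linear contribution computed from intra-features. This is a simplified stand-in for the paper's T-RBM, with illustrative weights and a hypothetical hard gating threshold:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def document_score(keywords, meta, intra, theta, beta, threshold=0.7):
    """Two-layer scoring: layer 1 classifies keywords from meta-features,
    layer 2 sums gated contributions from intra-features."""
    score = 0.0
    for w in keywords:
        importance = sigmoid(sum(t * m for t, m in zip(theta, meta[w])))
        if importance < threshold:          # inferred sparsity: a noisy keyword
            continue                        # contributes exactly zero
        contribution = sum(b * x for b, x in zip(beta, intra[w]))
        score += importance * contribution  # recouple meta- and intra-features
    return score

# Illustrative features: meta = [IDF, InEntity], intra = [TFInTitle, LogTF]
meta = {"microsoft": [3.2, 1.0], "list": [0.2, 0.0]}
intra = {"microsoft": [1.0, 0.7], "list": [1.0, 0.9]}
theta, beta = [1.0, 1.0], [0.5, 0.5]
s = document_score(["microsoft", "list"], meta, intra, theta, beta)
```

A keyword whose classifier output falls below the threshold contributes exactly zero even if it occurs often in the document, which is the inferred sparsity the slides require; the hard threshold here is an illustrative simplification of the model's probabilistic gating.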

Solution: Tree-Structured Restricted Boltzmann Machine (T-RBM)

Learning Feature Weighting by Likelihood Maximization

Maximize the likelihood of the training labels; compute the gradient by belief propagation.

Experiment: Datasets for Two Different Applications

Entity-Centric Document Filtering
Dataset: TREC-KBA; 29 person entities, 52,238 documents; Wikipedia pages serve as ID pages.

Cross-Domain Sentiment Analysis
Dataset released by Blitzer et al.; 4 domains, 8,000 documents.

T-RBM Outperforms Other Baselines on Both Applications

Baselines:
- Traditional learning-to-rank/classification frameworks without feature decoupling.
- A simple linear weighting model that combines meta-features and intra-features.
- A boosting framework that combines meta-features and intra-features.
- Structural Correspondence Learning (SCL), the domain adaptation model proposed by Blitzer et al.

Summary

- We propose to solve learning to rank and domain adaptation as a unified cross-task document scoring problem.
- We propose the idea of feature decoupling to facilitate feature design.
- We propose a noise-aware T-RBM model to recouple the features.

Thanks! Q&A