a language modeling framework for expert finding

Intelligent Database Systems Lab, N.Y.U.S.T. I.M.

Presenter: Lin, Shu-Han

Authors: Krisztian Balog, Leif Azzopardi, Maarten de Rijke

Information Processing and Management (IPM) 45 (2009) 1–19

Page 1: A language modeling framework  for  expert finding

Intelligent Database Systems Lab

N.Y.U.S.T. I.M.

A language modeling framework for expert finding

Presenter : Lin, Shu-Han

Authors : Krisztian Balog, Leif Azzopardi, Maarten de Rijke

Information Processing and Management (IPM) 45 (2009) 1–19

Page 2: A language modeling framework  for  expert finding

Outline

Motivation
Objective
Methodology
Experiments
Conclusion
Comments

Page 3: A language modeling framework  for  expert finding

Motivation

Expert finding: finding experts given a topic.

Yellow Pages approach:
Profiles: employees self-assess their skills.
Keywords, e.g., marketing.

Problems:
Information: outdated.
Keywords: restricted.

Page 4: A language modeling framework  for  expert finding

Objectives

Within the organization:
Mine published intranet documents.
Search for all kinds of expertise.

Example query: ‘Who are the experts on the topic “Internet marketing and internet advertising” in my organization?’

Page 5: A language modeling framework  for  expert finding

Methodology – Overview

Goal: capture the association between a candidate expert and an area of expertise.

“What is the probability of a candidate ca being an expert given the query topic q?”

By Bayes’ Theorem: p(ca | q) = p(q | ca) · p(ca) / p(q), where p(q) is constant for a given query and p(ca) is assumed uniform, so candidates can be ranked by p(q | ca).

Model 1: candidate-based (query-independent) approach.
Idea: build a profile of each candidate expert, then rank the candidates against the query.

Model 2: document-based (query-dependent) approach.
Idea: find the documents relevant to the query, then associate them with experts.

Page 6: A language modeling framework  for  expert finding

Methodology – Model 1

Build a textual representation (a model θca) of a person’s knowledge from that person’s documents.

Then estimate the probability of the query given the candidate’s model.

e.g., p(“Internet marketing and internet advertising” | θca) = p(“Internet” | θca)^2 · p(“Marketing” | θca) · p(“and” | θca) · p(“Advertising” | θca)

Formula annotations: (smoothed), (weighted)
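The query-likelihood example above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a linear interpolation with a background model as the smoothing step (parameter `lam`), and it assumes the "weighted" annotation means that each document contributes to the candidate model in proportion to an association weight. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def candidate_term_probs(docs, assoc):
    """Candidate profile: weighted mix of the term distributions of the
    candidate's documents (assoc maps doc_id -> association weight)."""
    total_weight = sum(assoc.values())
    probs = Counter()
    for doc_id, weight in assoc.items():
        tokens = docs[doc_id]
        for term, n in Counter(tokens).items():
            probs[term] += (n / len(tokens)) * (weight / total_weight)
    return probs


def log_query_likelihood(query, cand_probs, background, lam=0.5):
    """log p(q | theta_ca). A repeated query term is multiplied in once per
    occurrence, which yields the squared factor for "internet" above."""
    score = 0.0
    for term in query:
        # Assumed smoothing: interpolate the profile with a background model.
        p = (1 - lam) * cand_probs.get(term, 0.0) + lam * background.get(term, 0.0)
        if p == 0.0:
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (hypothetical data).
docs = {
    "d1": "internet marketing plan".split(),
    "d2": "budget report".split(),
}
all_tokens = [t for tokens in docs.values() for t in tokens]
background = {t: n / len(all_tokens) for t, n in Counter(all_tokens).items()}

profile_a = candidate_term_probs(docs, {"d1": 1.0})
profile_b = candidate_term_probs(docs, {"d2": 1.0})
query = ["internet", "marketing"]
score_a = log_query_likelihood(query, profile_a, background)
score_b = log_query_likelihood(query, profile_b, background)
```

With this toy data the candidate associated with the marketing document receives the higher score, since the other candidate's probability mass for the query terms comes from the background model alone.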

Page 7: A language modeling framework  for  expert finding

Methodology – Model 1B

Estimate p(t | d, ca) from the text surrounding the candidate identifier, within a window of size w.

e.g., p(“Internet” | “Mail.No.43”, “John”):
… John ([email protected]) is a major in marketing. …
… <731842> (<731842>) is a major in marketing. …

Note: the closer a term is to the candidate identifier, the more weight it carries.

Formula annotation: (weighted)
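A window-based estimate of p(t | d, ca) can be sketched as follows. This is an illustration under stated assumptions: the document is already tokenized, every mention of the candidate has been replaced by a single identifier token, and the estimate is the relative frequency of terms within w positions of any mention. The function names are hypothetical, and the sketch does not combine several window sizes.

```python
from collections import Counter


def window_term_counts(tokens, candidate_id, w):
    """Count the terms within w positions of each candidate mention."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != candidate_id:
            continue
        lo, hi = max(0, i - w), min(len(tokens), i + w + 1)
        for j in range(lo, hi):
            if j != i:  # skip the identifier itself
                counts[tokens[j]] += 1
    return counts


def window_term_probs(tokens, candidate_id, w):
    """Relative-frequency estimate of p(t | d, ca) inside the windows."""
    counts = window_term_counts(tokens, candidate_id, w)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}


tokens = ["<731842>", "is", "a", "major", "in", "marketing"]
narrow = window_term_probs(tokens, "<731842>", w=3)
wide = window_term_probs(tokens, "<731842>", w=5)
```

With w=3 the term "marketing" falls outside the window and receives no probability; with w=5 it is included. A small window keeps only the terms closest to the candidate, which is one simple way to reflect the note that closer terms matter more.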

Page 8: A language modeling framework  for  expert finding

Methodology – Model 2

(Smoothed)
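The document-based idea (score the documents against the query, then pass each document's score on to the candidates associated with it) can be sketched as below. This is a minimal illustration, not the authors' implementation: the smoothing is assumed to be a linear interpolation with a background model, and the names `doc_query_likelihood`, `model2_scores`, and the association table `assoc` are hypothetical.

```python
from collections import Counter, defaultdict


def doc_query_likelihood(query, tokens, background, lam=0.5):
    """p(q | theta_d) under an assumed interpolation-smoothed document model."""
    counts = Counter(tokens)
    p = 1.0
    for term in query:
        p *= (1 - lam) * counts[term] / len(tokens) + lam * background.get(term, 0.0)
    return p


def model2_scores(query, docs, assoc, background, lam=0.5):
    """Sum each document's query likelihood over the candidates linked to it.
    assoc maps doc_id -> {candidate: association weight}."""
    scores = defaultdict(float)
    for doc_id, tokens in docs.items():
        if not tokens:
            continue  # skip empty documents
        p_q_d = doc_query_likelihood(query, tokens, background, lam)
        for cand, weight in assoc.get(doc_id, {}).items():
            scores[cand] += p_q_d * weight
    return dict(scores)


# Toy collection (hypothetical data).
docs = {
    "d1": "internet marketing plan".split(),
    "d2": "budget report".split(),
}
all_tokens = [t for tokens in docs.values() for t in tokens]
background = {t: n / len(all_tokens) for t, n in Counter(all_tokens).items()}
assoc = {"d1": {"alice": 1.0}, "d2": {"bob": 1.0}}

scores = model2_scores(["internet", "marketing"], docs, assoc, background)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Only per-document statistics are needed here, so the sketch runs on top of an ordinary document index without building per-candidate profiles.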

Page 9: A language modeling framework  for  expert finding

Methodology – Model 2B

Model 2

Model 2B

Page 10: A language modeling framework  for  expert finding

Methodology – document-candidate associations

Boolean model

TF-IDF

(document importance) (senior member of organization)
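The two association schemes can be contrasted with a short sketch. It assumes that candidate mentions are counted per document and that the TF-IDF variant treats a candidate like a term: frequency within the document times the log inverse of the number of documents mentioning the candidate. The exact weighting on the slide is not preserved, so this formula and the names `boolean_association` and `tfidf_association` are assumptions.

```python
import math


def boolean_association(mentions, doc_id, cand):
    """1 if the candidate is mentioned in the document at all, else 0."""
    return 1.0 if mentions.get(doc_id, {}).get(cand, 0) > 0 else 0.0


def tfidf_association(mentions, doc_id, cand):
    """Assumed TF-IDF weight of a candidate in a document."""
    in_doc = mentions.get(doc_id, {})
    n = in_doc.get(cand, 0)
    if n == 0:
        return 0.0
    tf = n / sum(in_doc.values())  # share of this document's mentions
    df = sum(1 for counts in mentions.values() if counts.get(cand, 0) > 0)
    idf = math.log(len(mentions) / df)  # rarer candidates weigh more
    return tf * idf


# mentions[doc_id][candidate] = number of mentions (hypothetical data).
mentions = {
    "d1": {"alice": 3, "bob": 1},
    "d2": {"bob": 2},
    "d3": {"carol": 1},
}

bool_alice = boolean_association(mentions, "d1", "alice")
bool_bob = boolean_association(mentions, "d1", "bob")
tfidf_alice = tfidf_association(mentions, "d1", "alice")
tfidf_bob = tfidf_association(mentions, "d1", "bob")
```

The Boolean scheme cannot tell the two candidates in d1 apart, while the frequency-based scheme favors the one who is mentioned more often there and less often elsewhere. Under this formula a candidate mentioned in every document gets a weight of zero.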

Page 11: A language modeling framework  for  expert finding

Experiments

Evaluation measures:
MAP (mean average precision)
MRR (mean reciprocal rank)

MRR example:

Query   Results                 Correct response   Rank   Reciprocal rank
cat     catten, cati, cats      cats               3      1/3
torus   torii, tori, toruses    tori               2      1/2
virus   viruses, virii, viri    viruses            1      1

MRR = (1/3 + 1/2 + 1)/3 = 11/18
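The MRR example above can be reproduced with a short script. It uses exact fractions so the result matches the hand calculation. The average-precision function uses the standard definition (mean of the precision values at each relevant rank); the function names are illustrative.

```python
from fractions import Fraction


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return Fraction(1, rank)
    return Fraction(0)


def average_precision(ranked, relevant):
    """Mean of precision@k taken at every rank k that holds a relevant item."""
    hits, total = 0, Fraction(0)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += Fraction(hits, rank)
    return total / len(relevant) if relevant else Fraction(0)


def mean(values):
    return sum(values, Fraction(0)) / len(values)


runs = [
    (["catten", "cati", "cats"], {"cats"}),
    (["torii", "tori", "toruses"], {"tori"}),
    (["viruses", "virii", "viri"], {"viruses"}),
]

mrr = mean([reciprocal_rank(ranked, rel) for ranked, rel in runs])
map_score = mean([average_precision(ranked, rel) for ranked, rel in runs])
print(mrr)  # 11/18
```

Each query in this example has exactly one correct response, so MAP and MRR coincide; they differ once a query has several relevant results.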

Page 12: A language modeling framework  for  expert finding

Experiments

Model 1 vs. Model 2

Window-based models

Page 13: A language modeling framework  for  expert finding

Experiments

Association methods

Parameter sensitivity

Page 14: A language modeling framework  for  expert finding

Conclusions

Model 1: build a profile of each candidate expert, then rank the candidates against the query.

Model 2: find the documents relevant to the query, then associate them with experts.

Model 2 is to be preferred over Model 1:
Effectiveness: better in terms of average precision and reciprocal rank.
Implementation: requires only a regular document index.

Window-based extensions improve effectiveness, especially on top of Model 1.

Frequency-based (TF-IDF) document-candidate associations are helpful.

Page 15: A language modeling framework  for  expert finding

Comments

Advantage: integrates ideas

Drawback: …

Application: …