Intelligent Database Systems Lab
Presenter: WU, MIN-CONG
Authors: Abdelghani Bellaachia
and Mohammed Al-Dhelaan
2012, WI-IAT
NE-Rank: A Novel Graph-based Keyphrase Extraction in Twitter
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• When text is represented as a lexical graph, each word can be given a weight so that the ranking is measured more accurately, instead of relying only on co-occurrence in Twitter.
Objectives
• For the task of extracting topical keyphrases, we start by proposing a novel unsupervised graph-based keyword ranking method, called NE-Rank, that considers word weights in addition to edge weights when calculating the ranking.
Methodology - System Overview
Diagram labels: Twitter set; document (k1, k2, k3, k4); θ; topical subdatasets; NE-Rank; Hashtags Titles; candidate keyphrase
Methodology - Topic Extraction
Diagram labels: document (k1, k2, k3, k4); topic (w1, w2, w3, w4, w5, w6)
For this paper: topic (w1, w2, w3, w4, w5, w6); document (k1, k2, k3, k4)
Methodology - Topic Extraction
Problem
Diagram labels: Top 5 Search TF-IDF; Top 10 terms; Insert
Methodology - Graph-based Keyword Ranking: existing approaches
PageRank
TextRank
Summary

| Method | Target | Edge weight | Node weight |
|---|---|---|---|
| PageRank | web pages | not considered | not considered |
| TextRank | words | considered | not considered |
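
For reference, one common way to write the two existing methods is sketched below. The notation is an assumption and may differ from the slide: $d$ is the damping factor, $In(v_i)$ and $Out(v_j)$ are the in- and out-neighbours of a node, and $w_{ji}$ is the weight of the edge from $v_j$ to $v_i$.

```latex
% PageRank: edges are unweighted, nodes carry no weight
R(v_i) = (1 - d) + d \sum_{v_j \in In(v_i)} \frac{R(v_j)}{|Out(v_j)|}

% TextRank: same recursion, but each edge contributes in proportion to its weight
R(v_i) = (1 - d) + d \sum_{v_j \in In(v_i)} \frac{w_{ji}}{\sum_{v_k \in Out(v_j)} w_{jk}} \, R(v_j)
```

Neither formula contains a per-node term, which matches the "Node weight: not considered" column of the summary.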
Methodology - Graph-based Keyword Ranking: proposed approach
NE-Rank
Summary

| Method | Target | Edge weight | Node weight |
|---|---|---|---|
| PageRank | web pages | not considered | not considered |
| TextRank | words | considered | not considered |
| NE-Rank | words | considered | considered |
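
A minimal sketch of a ranking that uses node weights as well as edge weights, as the summary describes for NE-Rank. The update rule in the docstring is an assumption about how the two weights are combined, not a formula taken from these slides; the function name `ne_rank`, the graph representation, the damping factor 0.85, and the fixed iteration count are all illustrative choices.

```python
from collections import defaultdict


def ne_rank(edges, node_weight, d=0.85, iterations=50):
    """Rank words on a weighted directed graph using node and edge weights.

    edges: dict mapping (src, dst) -> edge weight (e.g. a co-occurrence count).
    node_weight: dict mapping word -> node weight in [0, 1] (e.g. a scaled TF-IDF).
    Assumed update (hypothetical form):
        R(i) = (1 - d) * W(i) + d * W(i) * sum_j (w_ji / sum_k w_jk) * R(j)
    """
    nodes = set(node_weight)
    for src, dst in edges:
        nodes.update((src, dst))

    out_sum = defaultdict(float)   # total outgoing edge weight per node
    incoming = defaultdict(list)   # dst -> list of (src, weight)
    for (src, dst), w in edges.items():
        out_sum[src] += w
        incoming[dst].append((src, w))

    rank = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            w_n = node_weight.get(n, 0.0)
            flow = sum(w / out_sum[src] * rank[src] for src, w in incoming[n])
            new_rank[n] = (1 - d) * w_n + d * w_n * flow
        rank = new_rank
    return rank


# Tiny co-occurrence chain a <-> b <-> c, written as directed edges both ways.
example_edges = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 1.0, ("c", "b"): 1.0}
uniform = ne_rank(example_edges, {"a": 1.0, "b": 1.0, "c": 1.0})
weighted = ne_rank(example_edges, {"a": 1.0, "b": 1.0, "c": 0.5})
```

With all node weights set to 1 the sketch reduces to the edge-weighted (TextRank-style) ranking; lowering the weight of `c` lowers its score even though its edges are unchanged.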
Methodology - Hashtags Titles
Diagram labels: topical dataset; extract; hashtags titles; split; word (using an English dictionary with frequencies); record
Strengthening strategy: in-degree boosted by 5%
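
The split and boost steps named on this slide could look roughly like the sketch below. It is only one possible reading: `split_hashtag` segments a hashtag with a frequency dictionary by dynamic programming, and `boost_in_edges` interprets "in-degree boosted 5%" as multiplying the weights of edges pointing into hashtag words by 1.05. Both function names, the `freq` dictionary, and that interpretation are assumptions.

```python
import math


def split_hashtag(tag, freq):
    """Split a hashtag into dictionary words, preferring frequent words.

    tag: hashtag text, with or without the leading '#'.
    freq: dict mapping lowercase word -> frequency count (hypothetical dictionary).
    Returns a list of words, or [] if the tag cannot be fully segmented.
    """
    text = tag.lstrip("#").lower()
    total = sum(freq.values())
    # best[i] holds (log-probability score, words) for the best split of text[:i]
    best = [(0.0, [])] + [None] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            word = text[start:end]
            if best[start] is None or word not in freq:
                continue
            score = best[start][0] + math.log(freq[word] / total)
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[len(text)][1] if best[len(text)] else []


def boost_in_edges(edges, hashtag_words, boost=0.05):
    """Return a copy of edges with weights into hashtag words raised by `boost`."""
    return {
        (src, dst): w * (1 + boost) if dst in hashtag_words else w
        for (src, dst), w in edges.items()
    }


# Hypothetical frequency dictionary and graph, for illustration only.
freq = {"social": 50, "media": 40, "so": 30, "cial": 1, "me": 20, "dia": 1}
words = split_hashtag("#SocialMedia", freq)
boosted = boost_in_edges({("a", "b"): 2.0, ("b", "a"): 2.0}, {"b"})
```

Scoring splits by summed log-frequency makes the segmentation prefer a few common words over many rare fragments.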
Methodology - Candidate Keyphrase Generation
Ranked keywords (descending order):
1. management
2. business
3. customer
4. staff
5. finance
Diagram labels: Twitter set; find; positions; keyphrase
Example keyphrase: Information management
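
One way to picture this step: look up where the top-ranked keywords occur in the tweets and join keywords that sit next to each other into a candidate keyphrase. The sketch below assumes that reading; the function name `candidate_keyphrases`, the two-word minimum, and the `max_len` cap are hypothetical.

```python
def candidate_keyphrases(tweets, keywords, max_len=3):
    """Collect runs of adjacent keywords in tokenized tweets as candidate phrases.

    tweets: list of token lists.
    keywords: set of top-ranked words.
    Returns a dict mapping phrase (tuple of words) -> number of occurrences.
    Runs shorter than 2 or longer than max_len words are ignored (assumption).
    """
    counts = {}
    for tokens in tweets:
        run = []
        for token in tokens + [None]:  # the None sentinel flushes the final run
            if token in keywords:
                run.append(token)
                continue
            if 2 <= len(run) <= max_len:
                phrase = tuple(run)
                counts[phrase] = counts.get(phrase, 0) + 1
            run = []
    return counts


# Illustrative tweets; "information" is assumed to be among the ranked keywords.
sample_tweets = [
    ["information", "management", "is", "hard"],
    ["good", "information", "management"],
    ["management"],
]
phrases = candidate_keyphrases(sample_tweets, {"information", "management"})
```

Keeping occurrence counts alongside each phrase lets a later step filter out rare candidates.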
Methodology - Keyphrase Ranking
Diagram labels: keyphrases; score; summarize; phrases list; filtering; less than 5 times
Hashtag usage
Another study measures sentiment in hashtags. The use of hashtags as keyword annotations makes them of great interest to our work.
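
A rough sketch of the scoring and filtering labels above. Two assumptions are made here and are not confirmed by the slide: a phrase score is the sum of its words' ranking scores, and "less than 5 times" means phrases occurring fewer than five times are filtered out. The function name `rank_keyphrases` is hypothetical.

```python
def rank_keyphrases(phrase_counts, word_scores, min_count=5):
    """Score candidate phrases and return them in descending score order.

    phrase_counts: dict mapping phrase (tuple of words) -> occurrence count.
    word_scores: dict mapping word -> ranking score (e.g. from the graph ranking).
    Phrases seen fewer than min_count times are dropped (assumed filter).
    """
    scored = []
    for phrase, count in phrase_counts.items():
        if count < min_count:
            continue
        # Assumed scoring: sum of the member words' scores.
        score = sum(word_scores.get(word, 0.0) for word in phrase)
        scored.append((phrase, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative counts and scores, not data from the paper.
counts = {
    ("information", "management"): 6,
    ("business", "staff"): 2,
    ("customer", "finance"): 5,
}
scores = {
    "information": 0.5, "management": 0.9, "customer": 0.3,
    "finance": 0.2, "business": 1.0, "staff": 1.0,
}
ranked = rank_keyphrases(counts, scores)
```

In this toy input the highest-scoring pair is dropped because it appears only twice, which is the effect a frequency filter is meant to have.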
Experiment - Dataset and Preprocessing
Dataset

| | Tweets | Tokens | Hashtags | Hashtag frequency |
|---|---|---|---|---|
| Twitter set | 31,227 | 244,139 | 4,079 | 40,674 |

Preprocessing
• Remove non-English text
• Remove flagged tokens, e.g. URLs, emoticons, smileys
• Transform slang and abbreviations
• English dictionary (vocabulary / OOV)
• POS tagger
• Remove stopwords
LDA
• 500 iterations
• 30 topics
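
The cleaning steps in the preprocessing list can be sketched with regular expressions. This covers only URL and emoticon removal, slang replacement, and stopword removal; language filtering, POS tagging, and LDA are left out. The patterns, the `SLANG` mapping, and the `STOPWORDS` sample are hypothetical stand-ins, not the resources used in the paper.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Simple Western-style emoticons such as :) ;-( =D, not embedded in a word.
EMOTICON_RE = re.compile(r"(?<!\w)[:;=8][\-o\*']?[\)\(\]\[dDpP/\\|](?!\w)")
SLANG = {"u": "you", "gr8": "great", "thx": "thanks"}            # hypothetical mapping
STOPWORDS = {"a", "an", "the", "is", "to", "for", "of", "and"}   # tiny sample list


def preprocess(tweet):
    """Return cleaned lowercase tokens for one tweet."""
    text = URL_RE.sub(" ", tweet)          # drop URLs first so ':/' is not read as an emoticon
    text = EMOTICON_RE.sub(" ", text)      # drop emoticons and smileys
    tokens = re.findall(r"#?\w+", text.lower())   # keep hashtags as single tokens
    tokens = [SLANG.get(token, token) for token in tokens]
    return [token for token in tokens if token not in STOPWORDS]


cleaned = preprocess("Thx for the gr8 talk :) http://t.co/abc #NLP")
```

Hashtags are kept as tokens here so that a later step can still extract and split them.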
Experiment - Evaluation Metrics
Precision
Bpref
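
A short sketch of the two metrics named above. Precision is the usual fraction of extracted keyphrases that are correct. For Bpref (binary preference), the form used here is an assumption, one variant seen in keyphrase-extraction evaluations: each correct phrase is penalized by the share of incorrect phrases ranked above it. The function names and the exact normalization are illustrative.

```python
def precision(extracted, gold):
    """Fraction of extracted keyphrases that appear in the gold set."""
    if not extracted:
        return 0.0
    correct = sum(1 for phrase in extracted if phrase in gold)
    return correct / len(extracted)


def bpref(extracted, gold):
    """Binary preference over a ranked list of extracted keyphrases.

    Assumed form: (1 / R) * sum over correct r of (1 - incorrect_above(r) / M),
    where R is the number of correct extracted phrases and M = len(extracted).
    """
    m = len(extracted)
    incorrect_above = 0   # incorrect phrases seen so far in rank order
    total = 0.0
    num_correct = 0
    for phrase in extracted:
        if phrase in gold:
            num_correct += 1
            total += 1 - incorrect_above / m
        else:
            incorrect_above += 1
    return total / num_correct if num_correct else 0.0


# Illustrative ranked output and gold set.
ranked_output = ["a", "x", "b", "y"]
gold_set = {"a", "b"}
p = precision(ranked_output, gold_set)
b = bpref(ranked_output, gold_set)
```

Unlike precision, Bpref depends on order: moving the correct phrases ahead of the incorrect ones raises it even though precision stays the same.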
Experiment - Results
Conclusions
• The potential and validity of both approaches have been demonstrated by an experimental evaluation.
Comments
• Advantages
– The keyphrase score does not rely only on co-occurrence.
• Applications
– Automatic keyphrase extraction.