Extraction of Topic Evolutions from References in Scientific Articles and Its GPU Acceleration
Tomonari MASADA
Nagasaki Univ.
Masada and Takasu @ CIKM 2012
Atsuhiro TAKASU
NII
Problem & Solution
Extract topic evolutions from linked documents.
Modify LDA by introducing a transition probability matrix.
Modeling topic evolutions
Draw topics from the following distribution:
[Figure: the distribution is a mixture, weighted by t and (1-t), of the topic distribution of the citing document (θd) and the topic distributions of the cited documents (each also a θd), which are summed, divided by #(cited documents), and combined with the transition matrix. The same matrix is applied for all citing relations.]
"TERESA"
Our method "TERESA" extracts Topic Evolutions from REferences in Scientific Articles.
We utilize document links to reveal directed relationships among topics, not among documents.
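As a rough illustration of the mixing step on the "Modeling topic evolutions" slide, the sketch below builds the distribution from which a citing document's topics would be drawn. Several details are assumptions not stated on the slides: that t weights the cited-document term and (1-t) the citing-document term, that the transition matrix R is row-stochastic and applied as a row vector times R, and that a document with no citations falls back to its own distribution. The function and variable names are hypothetical.

```python
def apply_transition(theta, R):
    """Return the row vector theta times the K x K matrix R (assumed row-stochastic)."""
    K = len(theta)
    return [sum(theta[j] * R[j][k] for j in range(K)) for k in range(K)]


def citing_topic_distribution(theta_citing, thetas_cited, R, t):
    """Mix the citing document's topic distribution with the averaged,
    transition-mapped topic distributions of the documents it cites.

    Assumption: weight t goes to the cited term and (1 - t) to the citing term.
    """
    K = len(theta_citing)
    if not thetas_cited:
        # Assumption: with no cited documents, use the citing distribution as-is.
        return list(theta_citing)
    n_cited = len(thetas_cited)
    # Sum of the cited distributions divided by #(cited documents).
    avg_cited = [sum(th[k] for th in thetas_cited) / n_cited for k in range(K)]
    # The same matrix R is used for every citing relation.
    moved = apply_transition(avg_cited, R)
    return [(1 - t) * theta_citing[k] + t * moved[k] for k in range(K)]


# Toy example with K = 2 topics and two cited documents.
theta_d = [0.7, 0.3]
cited = [[1.0, 0.0], [0.5, 0.5]]
R = [[0.9, 0.1],
     [0.2, 0.8]]
mix = citing_topic_distribution(theta_d, cited, R, t=0.5)
```

Because every input is a probability vector and each row of R sums to one, the result is again a probability vector, so it can be used directly as a distribution over topics.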
Preceding works
Directed relationship from a time point to another: [Ren+ ICML08]
Directed relationship from a document to another: [Dietz+ ICML07] [Nallapati+ ICWSM08] [Nallapati+ AISTATS11]
Corpus-wide undirected relationship among topics: [Sun+ ICDM09]
Corpus-wide directed relationship among topics: TERESA
GPU Acceleration
Variational Bayesian Inference (VB): embarrassingly parallel [Zhai+ WWW12]
Time complexity: O(MK^2) for TERESA (cf. O(MK) for LDA)
•K: # topics
•M: # unique doc-word pairs
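To make the complexity figures concrete, here is a small sketch of why per-pair work can run in parallel and where a K^2 factor could come from. It assumes, without confirmation from the slides, that each unique doc-word pair carries a K x K table of scores over pairs of topics; the actual TERESA update equations are not shown here, and all names are hypothetical. The point is only that each pair's table is processed independently of the others.

```python
def normalize_topic_pair_table(scores):
    """Normalize one K x K table of non-negative scores so it sums to 1.

    This touches K * K entries for a single doc-word pair.
    """
    total = sum(sum(row) for row in scores)
    return [[s / total for s in row] for row in scores]


def update_all_pairs(score_tables):
    """Process every doc-word pair independently.

    No table depends on another, so on a GPU each one could be handled by
    its own group of threads; here a plain map stands in for that.
    """
    return list(map(normalize_topic_pair_table, score_tables))


def entries_touched(M, K, pairwise=True):
    """Count entries visited per sweep: M*K*K with K x K tables, else M*K."""
    return M * K * K if pairwise else M * K


# Two doc-word pairs (M = 2) with K = 2 topics.
tables = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
normalized = update_all_pairs(tables)
```

With M = 10 and K = 4, `entries_touched` gives 160 for the pairwise case against 40 for the per-topic case, which is the factor of K separating O(MK^2) from O(MK).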
Experiment
Cora dataset (umass)
[Figure: topic labels shown: word sense disambiguation, tagging, Gaussian mixture, speech recognition, domain knowledge extraction, machine translation, DNA sequence alignment, MCMC, time series analysis, parsing, Bayesian network, neural network, IR, semantic analysis, SQL, information integration, join query, PAC learning, circuit lower bound]
Future work
Devise an inference with less approximation.
Apply TERESA to SNS linked documents.
Implement a topic evolution browser.