March 24, 2011
Joint Enhancement of Topic Modeling and Information Network Mining
Mid-Year PI Report focusing on I3.2
Heng Ji
City University of New York
NSCTA/INARC
INARC Project Major Contributions
I3.2-Subtask 1:
– Disambiguate objects with rich semantic structures extracted from interconnected texts (ACL2011)
– A new Collaborative Network Ranking Theory for Coreference Resolution (EMNLP2011-sub)
– Markov Logic Networks and Learning-to-Rank to Enhance Open Domain Role Discovery (TAC2010, LNCS, SIGIR2011-sub, EMNLP2011-sub)
– 16.4% improvement over state-of-the-art entity linking and 13%-22% improvement over link discovery
I3.2-Subtask 2 (with H. Deng (UIUC) and J. Han (UIUC); Focus of this Talk)
– Novel topic modeling: Multi-typed objects are treated differently along with their inherent textual information and the rich semantics of the heterogeneous information network (KDD2011-sub, IEEE Journal invited-sub)
– Exploit the power of extended topic modeling for event network partitioning and refinement through active learning and topic cluster driven inferences (ACL2011-sub, IEEE Journal invited-sub)
– Model the dynamics of information networks through a new temporal event network representation theory, evaluation metric and corresponding kernel methods (ACL2011-sub, EMNLP2011-sub)
I3.2-Subtask 3 (with H. Deng (UIUC) and J. Han (UIUC))
– Self-Boosting Terrorism Network Search and Browsing (Springer Book Chapter, SIGIR2011-sub)
*I3.1: Uncovering Hierarchical Relationships among Linked Objects (with C. Wang (UIUC) and J. Han (UIUC), KDD'11 sub, presented by J. Han)
CUNY Students and Post-docs: Q. Li, X. Li, W. Lin, Z. Chen, S. Tamang, S. Anzaroot, J. Artiles
Mining and Modeling Interconnected Information Networks
Text-rich heterogeneous information network
– Textual documents (news, blogs, twitter, papers, reports) are getting richer
– Approximately 80% of all data in information networks is held in an unstructured format
– Thousands of "attack" events and hundreds of "arrest" events can be mined from one week's unstructured textual data
Identify topics and events from documents using topic models
– Interconnect with users and other objects
How do topics propagate from documents to objects?
A Starting Point: ‘Isolated’ Information Network
Residence: Tahrir (Feb 18th, 2001-present)
Website: We are all Khaled Said
Node Pair in InfoNet PER GPE ORG Link Types in Open-domain Information Network
Static
Spouse, Parents, Children, Siblings
Member
Birth-Place, Death-Place, Nationality, Origin
Subsidiaries, Parents
Location, Headquarter, Political-Affiliation
Located-Country, Capital
Dynamic
Contact-Meet, Contact-Phone_Write, Justice, Sport
Leader, Schools-Attended, Employee, Founder, Shareholder, Justice
Resides-Place, Leader, Conflict-Attack, Conflict-Demonstrate, Justice, Movement-Transport, Injure
Business-Merge, Sport, Transaction
Conflict-Attack
Fundamental Theory: InforNet construction and knowledge discovery capability can be mutually enhanced by network analysis on text and interconnected data
Q1: How to discover latent topics and identify clusters of multi-typed objects simultaneously?
A1: Probabilistic Topic Modeling with Biased Propagation to take advantage of inter-connectivity in InforNets
Q2: How can text data and heterogeneous InforNet mutually enhance each other in topic modeling and other text mining tasks?
A2: Incorporate topic clusters to partition and refine InforNets, yield new representation, evaluation metric and modeling theory
Topic model
Biased propagation
Joint Enhancement of Topic Modeling and Heterogeneous Information Network Mining
Preliminaries
– Maximize the log likelihood of a collection of docs
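As a minimal sketch of the objective named above, assuming the PLSA form of the likelihood (PLSA is the baseline named in the experiments; the slide text does not spell out the formula), the log likelihood of a collection sums n(d, w) · log Σ_k P(w|z_k) P(z_k|d) over all documents d and words w. The function and argument names are hypothetical.

```python
import math


def plsa_log_likelihood(counts, p_w_given_z, p_z_given_d):
    """Log likelihood of a document collection under a PLSA-style model.

    counts[d][w]      -- n(d, w), how often word w occurs in document d
    p_w_given_z[k][w] -- P(w | z_k), word distribution of topic k
    p_z_given_d[d][k] -- P(z_k | d), topic distribution of document d
    """
    num_topics = len(p_w_given_z)
    total = 0.0
    for d, row in enumerate(counts):
        for w, n_dw in enumerate(row):
            if n_dw == 0:
                continue  # unseen words contribute nothing to the sum
            # Mixture probability of word w in document d.
            p_w_d = sum(p_z_given_d[d][k] * p_w_given_z[k][w] for k in range(num_topics))
            total += n_dw * math.log(p_w_d)
    return total
```

Training (for example with EM) would search for the two probability tables that maximize this value; the sketch only evaluates the objective for given tables.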
Probabilistic Topic Models with Biased Propagation
Intuition: InforNet provides valuable information
– Different objects have their own inherent information (e.g., D with rich text and U without explicit text)
– Treat documents with rich text differently from other objects without explicit text
– Topic(D): inherent text + connected U
– Topic(U): connected D
Basic Idea: (Biased Topic Propagation)
– Propagate the topic probabilities obtained by topic models from documents to other objects through the heterogeneous InforNet
– A simple and unbiased topic propagation does not make much sense
Biased Random Walk
Basic criterion
– The topic of an object without explicit text depends on the topics of the documents it connects to
  E.g., the research topic of an author could be characterized by his/her published papers
– The topic of a document is correlated with its objects to some extent, and should be principally determined by the inherent content of its text
The topic distribution of an object is determined by the average topic distribution of connected documents
Inherent topic distributions of docs
Propagated topic distribution
ξ: control the balance between inherent topic distribution and the propagated topic distribution
Biased Regularization: Put All Together
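The two propagation rules above can be sketched as a single pass over toy data. This is only an illustration, not the full regularized model: it assumes that ξ weights the inherent distribution and (1 − ξ) the propagated one (the slide text does not say which side ξ scales), and all function and variable names are hypothetical.

```python
def propagate_to_objects(doc_topics, object_docs):
    """Topic distribution of each text-less object = mean over its connected documents.

    Assumes every object links to at least one document.
    """
    object_topics = {}
    for obj, docs in object_docs.items():
        num_topics = len(doc_topics[docs[0]])
        object_topics[obj] = [
            sum(doc_topics[d][k] for d in docs) / len(docs) for k in range(num_topics)
        ]
    return object_topics


def biased_document_topics(inherent, object_topics, doc_objects, xi):
    """Blend each document's inherent distribution with the mean of its objects.

    xi = 1 keeps only the inherent (text-based) distribution (assumed convention).
    """
    blended = {}
    for doc, own in inherent.items():
        objs = doc_objects.get(doc, [])
        if not objs:
            blended[doc] = list(own)  # isolated document: nothing to propagate
            continue
        blended[doc] = [
            xi * own[k] + (1 - xi) * sum(object_topics[u][k] for u in objs) / len(objs)
            for k in range(len(own))
        ]
    return blended


# Toy example: one author (u1) wrote two papers with opposite topics.
papers = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
authors = propagate_to_objects(papers, {"u1": ["d1", "d2"]})
updated = biased_document_topics(papers, authors, {"d1": ["u1"]}, xi=0.8)
```

Because both inputs to the blend are probability distributions and the weights sum to one, each blended row still sums to one.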
TMBP for InforNet Partitioning
[Figure: documents Doc 1 … Doc N are grouped into topic clusters; each cluster is shown with its topic words and the event types it contains]
Topic words shown: Putin, weapons, nuclear, talks, forces, troops, army, military, British, AFP, million, government, dollars, convicted, billion, company, court, sentence, Pyongyang, China, officials, Washington, north, south, Korea, program, United States, Saddam, control, fighting, city, Baghdad, Iraqi, regime, Kurdish, York, case, media
Event Type: "Contact"; Trigger: talk, meet, etc.; Arguments: "Entity", "Instrument", "Place", "Time-Within"
Event Type: "Business"; Trigger: form, dissolve; Arguments: "Org", "Place", "Time-Within", "Agent"
Event Type: "Attack"; Trigger: blew, attack; Arguments: "Attacker", "Target", "Place", "Time-Within"
Event Type: "Transaction"; Trigger: borrow, launch; Arguments: "Giver", "Recipient", "Money", "Seller", "Artifact", "Buyer"
Event Type: "Justice"; Trigger: arrest, jail; Arguments: "Defendant", "Time-Within", "Adjudicator", "Place"
[Figure: example event network built from annotated text]
Nodes: the leaders of Germany and France; Saint Petersburg; US President George W. Bush; former Chinese president Jiang Zemin; Evian, France; March 2004; Russian President Vladimir Putin
Link annotations: TYPE="Contact" SUBTYPE="Meet"; TYPE="PHYS" SUBTYPE="Located"; TYPE="PER-SOC" SUBTYPE="Business" (twice); TYPE="Personnel" SUBTYPE="Elect"; TYPE="Movement" SUBTYPE="Transport"
Cluster 1: Palestinian, Israel, police, Israeli, people, bank, Monday, killed, west, security, Attack
Cluster 2: Iraq, war, United, States, Bush, Nations, Iraqi, minister, council, resolution, country
Cluster 3: north, nuclear, Korea, weapons, Korean, talks, officials, Washington, Putin, south, China
Cluster 4: court, dollars, year, appeal, million, years, government, convicted, billion, sentence, AFP
Across a heterogeneous information network, a particular object can sometimes be an event trigger and sometimes not, and can represent different event types
Within a cluster of topically-related documents, the distribution is much more convergent
e.g., in the overall information network only 7% of the occurrences of “fire” indicate “End-Position” events, while all occurrences of “fire” within one topic cluster are “End-Position” events
Topic Modeling can enhance information network construction by grouping similar objects, event types and roles together
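The convergence claim above can be checked on any set of annotated mentions by comparing the sense distribution of a word over the whole network with its distribution inside each topic cluster. The sketch below is an assumption-laden illustration: the function name, the mention format and the toy data are hypothetical and do not reproduce the 7% figure from the slide.

```python
from collections import Counter, defaultdict


def sense_distribution(mentions, word):
    """Return (overall, per_cluster) event-type shares for one candidate trigger word.

    mentions: iterable of (cluster_id, token, event_type), where event_type is None
    when the token does not trigger any event in that mention.
    """
    overall = Counter()
    per_cluster = defaultdict(Counter)
    for cluster_id, token, event_type in mentions:
        if token != word:
            continue
        overall[event_type] += 1
        per_cluster[cluster_id][event_type] += 1

    def normalize(counter):
        total = sum(counter.values())
        return {label: count / total for label, count in counter.items()}

    return normalize(overall), {c: normalize(cnt) for c, cnt in per_cluster.items()}


# Hypothetical toy annotations: "fire" is ambiguous overall but not inside a cluster.
toy_mentions = [
    ("personnel", "fire", "End-Position"),
    ("personnel", "fire", "End-Position"),
    ("conflict", "fire", "Attack"),
    ("conflict", "fire", "Attack"),
    ("conflict", "fire", None),
    ("weather", "fire", None),
]
overall, by_cluster = sense_distribution(toy_mentions, "fire")
```

In this toy data "End-Position" accounts for a third of all "fire" mentions but for every "fire" mention in the "personnel" cluster, which is the kind of gap the slide describes.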
TMBP for InforNet Refinement
Bombing Threats Tracking and Dynamic Terrorism Networks Construction
– Most information obtained from text-rich InforNet construction so far is viewed as static, ignoring the temporal dimension of many links in the networks
– It’s not enough to rely on information reporting time (publication years, blog post dates, news release time, narrative order, etc.) for open-domain real-world scenarios – only 3.71% correlation with gold-standards
– Temporal information in individual documents can be sparse, incomplete and inaccurate. About 50% of events do not include explicit time arguments
Open-domain Progressive Information Network Analysis with TMBP
[Figure: temporal information network around Ali Larijani. Nodes: Iran Supreme National Security Council, Islamic Republic of Iran Broadcasting, Farideh Motahari, Tehran University, Hassan Rowhani. Link types: school-attended, spouse, employee. Time labels: 2005–2007 (twice), 1978-, 1989-2005, 1982–1987. Link confidences: 0.9, 0.3, 0.8, 0.4, 0.6]
Toward deep analysis and global aggregation across information networks
– Partition InforNet based on topic modeling
– Within a topic cluster, we can recover temporal information by gleaning knowledge across networks and reach a global estimation of time boundaries
Research Methods
– Novel representation of complex temporal information
– Meaningful comparison of approaches through InforNet-specific metrics
– Design novel dependency path based kernel methods to capture long contexts
– Global inference and aggregation over text-rich InforNet in order to reduce vagueness and over-constraining, resolve contradiction, and improve information quality
TMBP based Information Aggregation
4-tuple representation
– T1 = earliest possible start / T2 = latest possible start / T3 = earliest possible end / T4 = latest possible end
– Can represent punctual start/end points (T1 = T2, T3 = T4)
– Captures uncertainty when necessary (T1 < T2, T3 < T4)
– Consistency restrictions: T1 <= T2, T3 <= T4, T1<=T3, T2<=T4
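A minimal sketch of the 4-tuple and its consistency restrictions follows. The class and method names are hypothetical, and the use of calendar dates as the time unit is an assumption; the slide does not fix a granularity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalTuple:
    t1: date  # earliest possible start
    t2: date  # latest possible start
    t3: date  # earliest possible end
    t4: date  # latest possible end

    def is_consistent(self):
        """Check T1 <= T2, T3 <= T4, T1 <= T3 and T2 <= T4."""
        return (
            self.t1 <= self.t2
            and self.t3 <= self.t4
            and self.t1 <= self.t3
            and self.t2 <= self.t4
        )

    def is_punctual(self):
        """True when both the start and the end are known exactly."""
        return self.t1 == self.t2 and self.t3 == self.t4


# Hypothetical example: a relation known to start some time in 2005 and end some time in 2007.
uncertain = TemporalTuple(date(2005, 1, 1), date(2005, 12, 31), date(2007, 1, 1), date(2007, 12, 31))
```

Here `uncertain.is_consistent()` holds while `uncertain.is_punctual()` does not, which matches the "captures uncertainty" case (T1 < T2, T3 < T4).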
A new quality of information metric based on formal constraints:
– Detect cases of non-informative nodes and links in information networks
– Allow independent parameterization of vagueness and over-constraining errors
– Error penalization can be tuned for more coarse or fine grained penalization
– ti: automatic output; gi: gold-standard
New Representation Theory and Evaluation Metric
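$$
c =
\begin{cases}
c_{\text{vagueness}} & \text{if } (i \in \{1,3\} \wedge t_i \le g_i) \vee (i \in \{2,4\} \wedge t_i \ge g_i) \\
c_{\text{overconstraining}} & \text{otherwise}
\end{cases}
$$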
Over-constraining model
Vague model
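The sketch below shows how separate vagueness and over-constraining parameters could enter a per-tuple score. The scoring form, an average over the four components of c / (c + |t_i − g_i|), is an assumption, since the slide text only preserves the definition of c; the function name, the default constants and the use of plain numbers (e.g. years) for time are hypothetical as well.

```python
def temporal_quality(system, gold, c_vag=1.0, c_cons=1.0):
    """Score a system 4-tuple against a gold 4-tuple; 1.0 means an exact match.

    system, gold: sequences (t1, t2, t3, t4) of comparable numbers.
    c_vag, c_cons: positive constants; a smaller constant penalizes that error type more.
    """
    total = 0.0
    for i, (t, g) in enumerate(zip(system, gold), start=1):
        # Looser-than-gold bounds count as vagueness, tighter ones as over-constraining.
        vague = (i in (1, 3) and t <= g) or (i in (2, 4) and t >= g)
        c = c_vag if vague else c_cons
        total += c / (c + abs(t - g))
    return total / 4


gold = (2005, 2005, 2007, 2007)
too_vague = temporal_quality((2004, 2005, 2007, 2007), gold)          # start bound too loose
too_tight = temporal_quality((2006, 2005, 2007, 2007), gold, c_cons=3.0)  # start bound too strict
```

Choosing `c_vag` and `c_cons` independently is what lets the metric be tuned to punish vague output and over-constrained output differently.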
Dependency Paths based Kernel Method and Information Aggregation with CCMs
Dependency paths based kernel method for local network prediction
Maximize global network quality by aggregating temporal information across documents over the entire information network, using Constrained Conditional Models (CCMs) for optimization (Collaboration with Dan Roth (UIUC))
– $T \wedge T^{(i)} = \langle \max(T_1, T_1^{(i)}),\ \min(T_2, T_2^{(i)}),\ \max(T_3, T_3^{(i)}),\ \min(T_4, T_4^{(i)}) \rangle$
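$$\max \sum_{i,k} \ln(p_{i,k})\, x_{i,k} \quad \text{s.t.}$$
$$\forall i, j,\ i \neq j \text{ and } T^{(i)} \text{ conflicts with } T^{(j)}:\quad x_{i,k} + x_{j,k} \le 1$$
$$\forall i, k:\quad x_{i,k} \in \{0, 1\}, \qquad \sum_{k=1}^{K} x_{i,k} = 1$$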
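To make the aggregation step concrete, the sketch below merges 4-tuples with the max/min rule and skips any tuple whose merge would break the consistency restrictions. This greedy loop is a simplified stand-in, not the constrained optimization described on the slide; treating "conflict" as "the merged tuple becomes inconsistent" is an assumption, and all names are hypothetical.

```python
NEG_INF = float("-inf")
POS_INF = float("inf")


def merge(a, b):
    """Intersect two 4-tuples: tighten earliest bounds with max, latest bounds with min."""
    return (max(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), min(a[3], b[3]))


def is_consistent(t):
    t1, t2, t3, t4 = t
    return t1 <= t2 and t3 <= t4 and t1 <= t3 and t2 <= t4


def aggregate(tuples):
    """Greedily fold tuples into one estimate, ignoring tuples that would conflict."""
    current = (NEG_INF, POS_INF, NEG_INF, POS_INF)  # completely uninformative start
    for t in tuples:
        candidate = merge(current, t)
        if is_consistent(candidate):
            current = candidate
    return current


# Hypothetical per-document evidence, in years.
evidence = [
    (2005, 2006, 2005, POS_INF),   # started in 2005-2006
    (NEG_INF, 2007, 2007, 2008),   # ended in 2007-2008
    (2010, 2011, 2010, 2012),      # contradicts the first two and is skipped
]
estimate = aggregate(evidence)
```

The result depends on input order because the loop is greedy; a global formulation such as the one above avoids that by choosing among conflicting candidates jointly.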
Topic Modeling Experiments Compared to State-of-the-art
Data Collection
– DBLP
– NSF-Awards
Metrics
– Accuracy (AC)
– Normalized mutual information (NMI)
Results: 20%-40% improvement over Probabilistic Latent Semantic Analysis (PLSA)
Topic Modeling based Active Learning for Event and Role Mining (Enhance Portability)
Data: open-domain news with gold-standard information annotation
Learning algorithm: combining pattern matching and Maximum Entropy based classification of triggers, arguments and roles
Automatically select topically-related documents as event training data for annotation
Using topic modeling, with only 1/4 of the training data we can achieve performance comparable to passive learning
Topic-cluster wide cross-document inference based on Markov Logic Networks (MLN) to enhance event and role mining
– One trigger sense per topic cluster / One argument role per topic cluster
– Remove events and roles with low local and cluster-wide confidence
– Adjust event and role labeling to achieve cluster-wide consistency
Results: Precision (P), Recall (R), F-Measure (F)
Topic Modeling based MLN Inference (Enhance Quality)
Approach                                                    Event Discovery (%)    Role Discovery (%)
                                                            P     R     F          P     R     F
Baseline                                                    74.1  49.6  59.4       50.4  28.7  36.6
State-of-the-art (Information Retrieval based Clustering)   66.5  67.4  66.9       60.8  32.2  42.1
Topic Modeling                                              73.3  66.3  69.6       59.4  36.5  45.2
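The F columns above are consistent with the balanced F-measure, the harmonic mean of precision and recall. A small sketch (the function name is hypothetical) that recomputes F from the reported P and R values:

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs for event discovery, taken from the results table.
event_rows = {
    "Baseline": (74.1, 49.6),
    "State-of-the-art": (66.5, 67.4),
    "Topic Modeling": (73.3, 66.3),
}
event_f = {name: round(f_measure(p, r), 1) for name, (p, r) in event_rows.items()}
```

Recomputing F this way is a quick check that a reported row is internally consistent: the topic modeling row gives up little precision relative to the baseline while recovering most of the recall gained by the clustering approach.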
Progressive Temporal Infornet Mining Results
Data
– 1.3 million newswire documents and 0.4 million web blogs/forum documents
Overall Comparison with State-of-the-Art
Approach        Exploit InforNet Structures?   Accuracy   Quality
1-gram kernel   No                             54.9       0.66
2-gram kernel   No                             56.8       0.67
3-gram kernel   No                             56.5       0.66
Our Approach    Yes                            61.5       0.76
Impact of Information Aggregation
[Figure: comparison of No InforNet with Exploit InforNet (aggregation over 2 tuples and aggregation over 10 tuples)]
What’s New in Network Science?
Previous approaches: only considered textual information and ignored network structures, or could merely integrate with homogeneous networks
Our approach: declaratively model the inter-connectivity in information networks using probabilistic topic modeling with biased propagation; multi-typed objects are treated differently along with their inherent textual information and the rich semantics of the heterogeneous information network
Previous approaches: analyzed text documents and information networks separately
Our approach: text data and the heterogeneous information network mutually enhance each other in topic modeling and event/role discovery based on information network partitioning and refinement
Previous approaches: focused on the analysis of one or a small set of documents
Our approach: leverage information redundancy and semantic links across documents in information networks through cross-document aggregation and reasoning; reach global quality optimization in multi-dimensional space (topic, entity, event, time, place)
Previous approaches: treated static and dynamic information discovered from ambiguous and uncertain information networks equally
Our approach: develop a new temporal event network representation theory and evaluation metric with formal constraints that can account for uncertain temporal ranges, and a new kernel method based on dependency paths to capture long contexts
Enrich and enhance the quality of information gathered about daily events and trends, and detect terrorism or other potential threats, by exploring information networks that integrate unstructured text messages, blogs, tweets, news and reports
Improved information quality has the potential to point soldiers and military data analysts to more relevant information, going beyond keyword-based Information Retrieval approaches
Multi-facet object search can provide methods for finding groups of soldiers with certain expertise and finding characteristics of enemies that may pose an imminent threat (An example: Web-scale Terrorism Network Search and Browsing)
– Developed methods to efficiently trace membership relations, attack/arrest/die activities and information clusters involving any specific entities
– Improve the quality of information by the interconnected network itself (self-boosting information networks)
Potential Army Impact and Technology Transition
Collaborations
Within Task:
– With J. Han on subtask 2 and 3, >2 teleconferences every week, frequent teleconferences/emails among students/post-docs, submitted 2 joint research papers (1 SIGIR2011 submission and 1 ACL2011 submission), preparing 3 new joint research papers
– With D. Roth, collaboration on Constrained Conditional Models (I1.1) for Information Aggregation, entity coreference resolution and event extraction
Cross-Task:
– With J. Han on I3.1, weekly teleconferences, regular emails, submitted 1 joint research paper to KDD2011
– With T. Huang on I1.1, on multi-media InforNet construction and utilization, published 2 joint research papers, submitted a joint NSF proposal
Cross-Center:
– With S. Parsons (SCNARC and T1.4), on using text-rich information networks for trust prediction and dynamic social network analysis, co-advising a PhD student
Research Plans for Next Six Months
Continue research conducted in the current I3.2 APP
– Explore topic correlation and social correlation from neighbors for improving topic modeling (with Hongbo Deng, Jiawei Han and collaboration with SCNARC)
– Introduce more constraints in cross-link inferences (with D. Roth)
– Exploit new graph alignment algorithms for text mining (with X. Yan)
– Exploit implicit links for InforNet analysis, such as the response structures in twitter data
– Technology Transition: Apply all of the successful approaches to military applications, e.g. conduct tight collaborations with ARL (e.g. Dr. Robert Cole) to make the terrorism network search engine deliverable; with ARL (Dr. Robert Winkler) on entity coreference resolution; with A. Leung on military data topic and event analysis
Collaborations with researchers in other tasks and networks
– I3.1 APP: Continue collaborations with Jiawei Han (UIUC) to extend the work of uncovering hierarchical relationships to more general relation types, data genres and domains
– Work with Thomas Huang (UIUC, I1.1) on cross-media transfer learning
– Work with Jiawei Han (UIUC, E2.3) on evolution of information networks
– Work with Simon Parsons (T1.4) on automatic social network analysis, and exploit logic reasoning to enhance entity disambiguation and information aggregation
A Research Path Ahead to 2012
Next year research planned if funded:
– Effective theories and methods for mining text-rich heterogeneous networks involving social and communication networks
– Leverage topic modeling for improving expert finding (expertise ranking problem) on heterogeneous information network
– Continue to exploit network structures to enhance knowledge discovery and population
– Multi-dimensional, hierarchical abstractive summarization based on information network analysis
– Explore collaborations with information fusion tasks in I1
– Explore collaborations with social network and trust projects on automatic social network construction and mining
– Application of effective theories and methods in military applications
Research Papers
I3.1
– (UIUC + CUNY) C. Wang, J. Han, X. Li, Q. Li, W. Lin, A. Lee, H. Li and H. Ji. 2011. Uncovering Hierarchical Relationships among Linked Objects: A Probabilistic Modeling Approach. Submitted to KDD2011.
I3.2
Accepted/Published:
– Z. Chen, S. Tamang, A. Lee, X. Li, W. Lin, J. Artiles, M. Snover, M. Passantino and H. Ji. CUNY-BLENDER TAC-KBP2010 Entity Linking and Slot Filling System Description. Proc. TAC2010.
– H. Li, X. Li, H. Ji and Y. Marton. Domain-Independent Novel Event Discovery and Semi-Automatic Event Annotation. Proc. PACLIC 2010.
– H. Ji and R. Grishman. Knowledge Base Population: Successful Approaches and Challenges. Proc. ACL-HLT2011.
– H. Ji, A. Lee and W. Lin. Information Network Construction and Alignment from Automatically Acquired Comparable Corpora. Invited book chapter for Building and Using Comparable Corpora. Springer.
– H. Ji, B. Favre, W. Lin, D. Gillick, D. Hakkani-Tur and R. Grishman. Open-domain Multi-document Summarization via Information Extraction: Challenges and Prospects. Invited book chapter for Multi-source, Multilingual Information Extraction and Summarisation. Springer.
Submitted:
– (CUNY + UIUC) H. Ji and J. Han. 2011. Web-Scale Knowledge Discovery and Information Extraction. Invited Paper for IEEE Special Issue on Web-Scale Multimedia Processing and Applications.
– (CUNY + UIUC) H. Li, H. Ji, H. Deng and J. Han. 2011. Topically Related Data is Better Data: Topic Modeling for Event Extraction. ACL-HLT2011.
– (CUNY + UIUC) S. Anzaroot, J. Artiles, H. Ji, H. Deng and J. Han. 2011. Search and Browsing Self-Boosting Information Networks. SIGIR2011.
– J. Artiles, Q. Li, E. Amigo and H. Ji. 2011. Leveraging Cross-document Redundancy for Temporal Information Extraction. EMNLP2011.
– J. Artiles, E. Amigo, Q. Li and H. Ji. 2011. Evaluating Temporal Information Extraction. ACL-HLT2011.
– Z. Chen and H. Ji. 2011. Collaborative Ranking: A Case Study in Entity Linking. EMNLP2011.
– Q. Li, J. Artiles and H. Ji. 2011. Dependency Paths Kernel for Temporal Relation Classification. ACL-HLT2011.
– S. Tamang and H. Ji. 2011. Learning-to-Rank for Slot Filling System Combination and Assessment. EMNLP2011.
– Z. Chen, S. Tamang, A. Lee and H. Ji. 2011. A Toolkit for Knowledge Base Population. SIGIR2011.
– X. Li and H. Ji. 2011. Comment-guided Learning for Automatic Assessment. EMNLP2011.
Awards and Keynote Speech
– Heng Ji. CUNY Chancellor's "Salute to Scholar" Award, November 2010.
– Heng Ji. National Science Foundation Research Experiences for Undergraduates, March 2011.
– Heng Ji. Web-Scale Knowledge Discovery and Population from Unstructured Data, Keynote Speech at ACLCLP 2010 Information Retrieval Conference, December 2010.
– Heng Ji. Overview of the TAC2010 Knowledge Base Population Track, Keynote Speech at Web People Search (WePS-3) Conference, September 2010.
– Five students received university-wide awards
Brief Summary of My Team’s Other Research Work in I3.1 and I3.2
Leverage Semantic Information Network to Enhance Entity Coreference Resolution / Entity Identification
Disambiguation
Name Variant Clustering
Apply graph-cutting based algorithms on semantic information networks
9.4% absolute improvement in micro-averaged accuracy
[Figure: collaborative network linking a query q to collaborators cq1–cq7, with link weights 0.7, 0.4, 0.3 and 0.6, used to find the correct rank]
Micro and Macro Collaborative Networks Ranking for Entity and Event Coreference Resolution
Previous methods only focused on the target node and one learning theory itself
Propose a new collaborative network ranking theory which imitates human collaborative learning
Leverage inter-connections among collaborative entities in information networks
Automatic profiling for each node
Construct a collaborative network for each entity based on graph-based clustering
Rank multiple decisions from collaborative entities (micro) and algorithms (macro) based on global prediction
7% absolute improvement in micro-averaged accuracy
On-going CUNY+UIUC work: using topic modeling for entity clustering
Markov Logic Networks and Learning-to-Rank to Enhance Open Domain Role Discovery
[Figure: 911 Suspect Terrorist Network and Terrorist Information Network. Nodes (V3–V16) include Khamis Mushait, Wail Al-Shehri, Waleed Al-Shehri, Abdul Aziz Al-Omari, Abdul Rahman Al-Omari, Mohamed Atta, Al-Qaeda, Boston and Saudi Arabian Airlines. Link types: origin, member, sibling, residence, pilot. Sources: news page, web blog, twitter, forum]
Discovered 26 roles for persons, 16 roles for organizations and 13 roles for locations
Markov Logic Networks for cross-slot and cross-query reasoning based on InfoNet and textual linkages to resolve conflicts and predict missing links
Example rule weights: Weight=15, Weight=100
Maximum Entropy based Learning-to-rank model to re-rank candidate answers
13%-22% absolute F-measure improvement
(CUNY) Chen et al. "CUNY-BLENDER TAC-KBP2010 Entity Linking and Slot Filling System Description". Proc. TAC2010 and Lecture Notes in Computer Science, 2010
$\forall x, y, z:\ \mathrm{Ambiguous}(x, y) \wedge \mathrm{TextualLinkage}(y, z) \wedge \mathrm{Pilot}(x) \wedge \mathrm{Pilot}(z) \Rightarrow \mathrm{Remove}(x)$
$\forall x, y, z:\ \mathrm{Sibling}(x, y) \wedge \mathrm{Origin}(y, z) \Rightarrow \mathrm{Origin}(x, z)$
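The sibling/origin rule can be illustrated with a small forward-chaining sketch that predicts missing Origin links. It applies the rule deterministically, which is a simplification: a Markov Logic Network attaches a weight to each rule and reasons probabilistically. The function name and the placeholder facts are hypothetical.

```python
def infer_origin(siblings, origins):
    """Apply Sibling(x, y) and Origin(y, z) => Origin(x, z) until nothing new is derived.

    siblings: set of (x, y) pairs, treated as symmetric
    origins:  set of (person, place) pairs
    Returns the set of newly inferred (person, place) pairs.
    """
    symmetric = set(siblings) | {(y, x) for x, y in siblings}
    known = set(origins)
    inferred = set()
    changed = True
    while changed:
        changed = False
        for x, y in symmetric:
            for person, place in list(known):
                if person == y and (x, place) not in known:
                    known.add((x, place))
                    inferred.add((x, place))
                    changed = True
    return inferred


# Placeholder facts: person_a and person_b are siblings, only person_b has a known origin.
new_links = infer_origin({("person_a", "person_b")}, {("person_b", "city_c")})
```

In this example the missing link ("person_a", "city_c") is predicted from the sibling's known origin.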
Uncovering Hierarchical Relationships among Linked Objects
Parent-child, manager-subordinate, organizational, initiator-follower
DAG underlying tree
Data: Nodes, links, labeled trees
Jointly learn the importance of features and rules (challenge: joint learning)
Infer the tree structures of unlabeled data (challenge: model & feature design)
Develop a general model & summarize typical features w/ uncertain importance
– Local feature (singleton potential)
– Dependency rule (pairwise potential)
Test on two tasks
– Uncover family tree structure
– Uncover online discussion structure
[Figure: a candidate DAG over nodes v1–v4 (potentials p1–p4) and two possible resulting trees]
Inference performance in diff. measures
– Our model > state-of-the-art text mining (2-3X)
– Joint model > two-stage model (5% - 381%)
Practical usefulness and generality
– Does not require many labels for training
– Good adaptability for generalization
Examples of features and rules
(UIUC + CUNY) Chi Wang, Jiawei Han, Xiang Li, Qi Li, Wen-Pin Lin, Adam Lee, Hao Li, Heng Ji, "Uncovering Hierarchical Relationships among Linked Objects: A Probabilistic Modeling Approach", KDD'11 (sub)
Uncovering Hierarchical Relationships among Linked Objects
Using a novel discriminative model CRF-Hier
– optimized for joint modeling of tree structure learning and reasoning
– 10%-12% higher performance than state-of-the-art
[Figure: bin Laden family tree with nodes Mohammed bin Awad bin Laden, Salem bin Laden, Bakr bin Laden, Abdullah Osama bin Laden, Osama bin Laden, Saad bin Laden and Omar Osama bin Laden]
Potential Transition Example: Terrorism Networks Search and Browsing Engine
• In many scenarios, a user may know only a limited portion of the objects or link dimensions in an information network, and thus have difficulty creating informative queries
• For example, a military data analyst may have a list of famous terrorism organizations without knowing their detailed person member names, but still wish to track activities about these members
Multi-Facet Search in Self-Boosting Information Networks (Example: Terrorism Network Search and Browsing)
Demo Video: http://nlp.cs.qc.cuny.edu/terrorism.m4v
(CUNY + UIUC) Sam Anzaroot, Javier Artiles, Heng Ji, Hongbo Deng and Jiawei Han. 2011. Search and Browsing Self-Boosting Information Networks. SIGIR2011 [SUB]
• Help a military analyst with expert finding and with terrorist information search, gathering, control and analysis for any given query
• Entity-topic analyzer for self-expansion and self-boosting: terrorism organization, its members, status of members (die, arrest, ...) and information networks associated with each member