Predicting Communication Intention in Social Media
DESCRIPTION
In social networks, where users send messages to each other, the question arises of what triggers communication between unrelated users: does communication between previously unrelated users depend on friend-of-a-friend relationships, common interests, or other factors? In this work, we study the problem of predicting directed communication intention between two users. Link prediction is similar to communication intention prediction in that it uses network structure for prediction. However, the two problems exhibit fundamental differences that originate from their focus. Link prediction uses evidence to predict the evolution of network structure, whereas our focal point is the initiation of directed communication between users who were not previously structurally connected. To address this problem, we employ topological evidence in conjunction with transactional information in order to predict communication intention. It is not obvious whether methods that work well for link prediction would work well in this case. In fact, we show in this work that network evidence and content evidence, when considered separately, are not sufficiently accurate predictors. Our novel approach jointly considers the local structural properties of users in a social network and the content they generate. It captures numerous interactions, direct and indirect, social and contextual, which have to date been considered independently. We performed an empirical study to evaluate our method using an extracted network of directed @-messages sent between users of a corporate microblogging service, which resembles Twitter. We find that our method outperforms state-of-the-art techniques for link prediction. Our findings have implications for a wide range of social web applications, such as contextual expert recommendation for Q&A, creation of new friendship relationships, and targeted content delivery.
TRANSCRIPT
Predicting Communication Intention
in (Enterprise) Social Networks
Charalampos “Harris” Chelmis
Computer Science, University of Southern California
Thanks to: Viktor K. Prasanna, Ming Hsieh Department of Electrical Engineering, USC
Vikram Sorathia, Co-founder & CEO at Kensemble Tech Labs LLP
Social Networks are Everywhere
2
• Movie Networks
• Affiliation/co-authorship networks
• Professional networks
• Friendship networks
• Information networks
• Organizational Networks
• Q&A websites
• Even networks
• Multiple applications
Targeted marketing
Personalization
− Content delivery
Recommendation
− People to connect, items to buy, movies to watch
Law enforcement
− Fraud detection
− Guilt by association
Epidemiology
Information dissemination/propagation
…
• Users interact with one another and content they create and
consume
Rich interactions
− Friendships based on similarity
− Following based on interest
Noisy
Social Network Analysis
3
• Collaboration Enabling Technologies
Multiple communication channels
Spread of timely and relevant information
Search for data and experts
Collaboration Technologies at the Workplace
4
• Main focus on business perspective
Less noisy than online social networks
Q&A
Problem solving
Information seeking
• But also
Assist in breaking barriers
Team building
Knowledge propagation
• Opportunities
Expert identification
− Experts vs. Influencers
Information Flow
Trends
− Technology adoption
− Company focus
Collaboration at the Workplace
5
• More Opportunities
Collective Knowledge
− Generation
− Sharing
Collaborative Knowledge Management
− How do employees work together to complete tasks?
− How does innovation happen?
− Best practices
• Difficulties
Informal interactions
Heterogeneous, unstructured data
How to formally model knowledge?
Collaboration at the Workplace
6
• Descriptive Modeling
Social network analysis
• Predictive modeling
Link prediction
Attribute prediction
• Typically networked data are represented as graphs
Nodes (e.g., users)
Edges
− Social relations
− Interactions
− Information flow
− Similarity
Weight
− Communication frequency
− Communication cost (e.g., distance)
− Reciprocity
− Type of interaction (e.g., family member, friend, or officemate)
Networked Data Modeling
7
• Heterogeneous object and link types
• Both nodes and edges may carry attributes
• Attribute dependencies
Correlation between attribute values and link structure
− e.g. link prediction based on auxiliary information
Correlation among attributes of related nodes
− e.g. collaborative filtering
• Node dependencies
e.g. groups/communities
• Partial observations
e.g. labels
But Networked Data are Very Different than Graphs
8
• Big Data
Billions of users
Billions of connections
Billions of “documents”
• Temporality
Affiliations
Interests
Friendships
• Context
Spatial
Temporal
Topical
• Content multimodality
Text
Multimedia
Networked Data ≠ Graphs
9
• Edges are more than links
Type
− e.g. like vs. comment vs. share
Trust
Sentiment
Strength
Time
Number
• Edges “reveal” something about the relation between nodes
Prior “interaction” to compute similarity
Networked Data ≠ Graphs
10
Networked Data ≠ Graphs
11
• Integrated informal communication
• Context sensitive
• Temporal
• External Sources
• Analysis of implicit relations
Holistic Modeling of Complex Networks
12
Multiple collaborative platforms
Multimodal, heterogeneous content
from various sources
Meta-information about content
- Social Algebraic Operations
- Complex mining and analysis
- Correlation of different domains
- Temporal, semantic analysis
[Figure labels: context, time, content, connection]
• Directed communication graph G = (V,E)
Node u represents a user
Edge e = (u,v) exists iff user u has sent at least one message to user v
• Input
G0 = (V0,E0), subgraph of G consisting of all nodes in G and a subset of
edges in G
• Output
Ranked list L of edges, not present in G0, such that $L \subseteq E \setminus E_0$
[Figure: input graph G0 and output graph G1 for a user u]
Predicting Intention of Communication
13
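As a rough illustration of the setup above, the following sketch builds the directed edge set E from a message log and hides a fraction of the edges to obtain G0. The message list, the `fraction` parameter, and the function names are hypothetical; the slides do not say how the held-out edges were chosen.

```python
import random

def build_edges(messages):
    """E contains (u, v) iff user u sent at least one message to user v."""
    return {(sender, recipient) for sender, recipient in messages}

def hide_edges(edges, fraction, seed=0):
    """Split E into observed edges E0 and hidden edges to be predicted.

    Random hold-out is an assumption made for this sketch.
    """
    ordered = sorted(edges)                 # sort so the shuffle is reproducible
    random.Random(seed).shuffle(ordered)
    n_hidden = int(len(ordered) * fraction)
    return set(ordered[n_hidden:]), set(ordered[:n_hidden])

# Toy message log: (sender, recipient) pairs; repeated messages give one edge.
messages = [("ann", "bob"), ("ann", "bob"), ("bob", "cat"),
            ("cat", "ann"), ("dan", "ann")]
E = build_edges(messages)
E0, hidden = hide_edges(E, fraction=0.25)
print(len(E), len(E0), len(hidden))
```

A predictor sees only E0 and is scored on how highly it ranks the hidden edges.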
• Edge semantics:
Conversation between users rather than friendship
• Contextual – Temporal Properties
m1(u1,u2,g1) ≠ m1(u1,u2,g2)
• Directionality Matters
m1(u1,u2,g1) ≠ m1(u2,u1,g1)
• “What makes people initiate conversations with strangers?”
• “With whom do individuals choose to collaborate and why?”
≠ Link Prediction
14
• The tendency to relate to people with similar characteristics
status, beliefs, etc.
• Fundamental concept underlying social theories (e.g. Blau 1977)
• Fundamental basis for links in many types of social networks
“Similar” nodes tend to cluster together
• How does this help us solve our problem?
Homophily
15
• Machine learning
Probabilistic, supervised, computationally expensive
• Node attributes
No semantics
We instead exploit multiple features of variable types
• Network structure
How to Compute Similarity?
16
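Graph Distance: length of the shortest path between u and v
Common Neighbors: $|\Gamma(u) \cap \Gamma(v)|$
Jaccard Coefficient: $\frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
Adamic/Adar: $\sum_{z \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log |\Gamma(z)|}$
Preferential Attachment: $|\Gamma(u)| \cdot |\Gamma(v)|$
Katz: $\sum_{\ell=1}^{\infty} \beta^{\ell} \, |\mathrm{paths}_{u,v}^{\langle \ell \rangle}|$
Random walks
($\Gamma(u)$ denotes the set of neighbors of u)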
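The neighborhood-based measures listed on this slide can be sketched in a few lines. This is a minimal illustration on a toy edge list, treating neighborhoods as undirected, which is an assumption made here for simplicity; all function and variable names are hypothetical.

```python
import math

def neighbors(edges):
    """Map each node to the set of nodes it shares an edge with (direction ignored)."""
    gamma = {}
    for u, v in edges:
        gamma.setdefault(u, set()).add(v)
        gamma.setdefault(v, set()).add(u)
    return gamma

def common_neighbors(gamma, u, v):
    return len(gamma[u] & gamma[v])

def jaccard_coefficient(gamma, u, v):
    union = gamma[u] | gamma[v]
    return len(gamma[u] & gamma[v]) / len(union) if union else 0.0

def adamic_adar(gamma, u, v):
    # Rare shared neighbors (low degree) contribute more than hubs.
    return sum(1.0 / math.log(len(gamma[z]))
               for z in gamma[u] & gamma[v] if len(gamma[z]) > 1)

def preferential_attachment(gamma, u, v):
    return len(gamma[u]) * len(gamma[v])

edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
gamma = neighbors(edges)
print(common_neighbors(gamma, "a", "b"), jaccard_coefficient(gamma, "a", "b"))
```

Here a and b are not linked but share both of their neighbors, so the neighborhood measures all score the pair highly.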
• If there is a tie between x and y and one between y and z, then in
a transitive network x and z will also be connected
• Such structural clues have been traditionally used for link
prediction
• Consider what happens if edge semantics change
• Or if we further include context
Transitivity
17
[Figure: two triads over nodes x, y, and z; in the second, the label reads “z asks ?”]
Communication Network
18
Threaded Discussion
Bipartite Graph
Post-Reply Network
Augmented, Directed Post-Reply Network
19
• We model a user as a union of her:
connections and
her content
• We characterize microblogs using a set of attributes
each feature according to its type
Textual Features
− raw textual content (bag-of-words)
− #hashtags
− Groups
Temporal Features
− Date
− Time
• WordNet: enrich concepts with conceptually, semantically and
lexically related terms
Synonyms
Hypernyms
Hyponyms
User Representation
20
• Semantic Similarity of textual concepts
Jaccard Index: $s(a,b) = s(S_a, S_b) = \frac{|S_a \cap S_b|}{|S_a \cup S_b|}$
Synonym-based similarity: $s_s(a,b) = s_s(S_a, S_b)$
Hypernym-based similarity: $s_h(a,b) = s_h(S_a, S_b)$
Hyponym-based similarity: $s_{hp}(a,b) = s_{hp}(S_a, S_b)$
• Calculate Semantic Similarity using weighted sum
Semantic Similarity of Textual Features
21
• Caveat: concepts belong to the same subtree
Solution: compute similarity between the union of annotations
$s(S_a \cup H_a \cup Hp_a, \; S_b \cup H_b \cup Hp_b)$
• Account for lexical similarity: Levenshtein similarity
• Select the highest similarity, either semantic or lexical
$s_{tg}(a,b) = \max \{ w_s s_s(a,b) + w_h s_h(a,b) + w_{hp} s_{hp}(a,b), \; \mathrm{LevenshteinSimilarity}(a,b) \}$
Semantic Similarity of Textual Features
22
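The tag similarity described on these two slides (a weighted sum of set overlaps over related terms, compared against a lexical score, keeping the larger) might look like the sketch below. The `related` dictionary is a hand-written stand-in for WordNet lookups, the weights are arbitrary, and normalizing edit distance by the longer string is an assumption; none of these come from the slides.

```python
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def levenshtein_similarity(a, b):
    # Assumed normalization: 1 - distance / length of the longer string.
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / longest if longest else 1.0

def tag_similarity(a, b, related, weights):
    """Max of the weighted semantic score and the lexical score."""
    semantic = sum(
        w * jaccard(related.get(a, {}).get(kind, ()),
                    related.get(b, {}).get(kind, ()))
        for kind, w in weights.items()
    )
    return max(semantic, levenshtein_similarity(a, b))

# Hypothetical related-term sets standing in for WordNet output.
terms = {"synonyms": {"car", "auto", "automobile"},
         "hypernyms": {"motor vehicle"},
         "hyponyms": {"cab", "coupe"}}
related = {"car": terms, "automobile": terms}
weights = {"synonyms": 0.5, "hypernyms": 0.25, "hyponyms": 0.25}

print(tag_similarity("car", "automobile", related, weights))
print(tag_similarity("meeting", "meetings", {}, weights))
```

The first pair is lexically distant but semantically identical; the second has no semantic annotations and falls back to the lexical score.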
• Textual Similarity between bag-of-words features:
tf.idf weight vector representation
Cosine similarity
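• Date Similarity: $s_d(d_1,d_2) = \begin{cases} 0, & |d_1 - d_2| > T_d \\ 1 - \frac{|d_1 - d_2|}{T_d}, & \text{otherwise} \end{cases}$
• Time Similarity: $s_t(t_1,t_2) = \begin{cases} 0, & |t_1 - t_2| > T_t \\ 1 - \frac{|t_1 - t_2|}{T_t}, & \text{otherwise} \end{cases}$
• Timestamp similarity: $s_{df}(x,y) = w_d \, s_d(x_d, y_d) + w_t \, s_t(x_t, y_t)$
Feature Similarity
23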
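A small sketch of the temporal similarities, assuming the score falls linearly from 1 to 0 as the gap approaches a threshold T and is 0 beyond it. The units (day numbers, minutes of the day), the thresholds, and the equal weights are illustrative assumptions, not values from the talk.

```python
def threshold_similarity(x, y, threshold):
    """1 for identical values, decaying linearly to 0 at the threshold."""
    gap = abs(x - y)
    return 0.0 if gap > threshold else 1.0 - gap / threshold

def timestamp_similarity(x, y, w_d=0.5, w_t=0.5, T_d=7, T_t=120):
    """Weighted sum of date and time-of-day similarity.

    x and y are (day_number, minute_of_day) pairs; defaults are hypothetical.
    """
    return (w_d * threshold_similarity(x[0], y[0], T_d)
            + w_t * threshold_similarity(x[1], y[1], T_t))

# Two posts on the same day, one hour apart.
print(timestamp_similarity((100, 600), (100, 660)))
```

Keeping date and time of day as separate facets lets two posts written at the same hour on different days still count as partially similar.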
• We use a variation of Hausdorff point set distance measure:
Average of the maximum similarity of features in set A with respect to features in set B
$S_H(A,B) = \frac{1}{|A|} \sum_{k=1}^{|A|} \max_{i} \, sim(a_k, b_i)$
$sim(a_k, b_i)$: any similarity measure on set elements ak and bi
Measure is asymmetric with respect to the sets
Feature Set Similarity
24
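The set-level measure described here (for each element of A take its best match in B, then average over A) can be sketched as follows. The exact-match `sim` function is only a placeholder for whichever element similarity is used.

```python
def set_similarity(A, B, sim):
    """Average, over elements of A, of the best similarity to any element of B."""
    if not A or not B:
        return 0.0
    return sum(max(sim(a, b) for b in B) for a in A) / len(A)

def exact(a, b):
    # Placeholder element similarity: 1 for identical items, else 0.
    return 1.0 if a == b else 0.0

A = ["#release", "#bug"]
B = ["#release"]
print(set_similarity(A, B, exact), set_similarity(B, A, exact))
```

The two calls return different values, which illustrates the asymmetry noted on the slide: every element of B finds a match in A, but only half of A finds a match in B.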
• A weighted function of content and network proximity
$S(u,v) = \lambda \, S_C(u,v) + (1 - \lambda) \, S_N(u,v)$
λ controls the tradeoff between content and network proximity
• Content Proximity
User similarity with respect to their microblogs
$S_C(u_1,u_2) = \frac{1}{|u_1|} \sum_{i=1}^{|u_1|} \max_{p_k \in u_2} S(p_i, p_k)$, with $p_i \in u_1$
Asymmetric with respect to users
Similarity of microblogs
− Combined weighted value of respective attributes similarities
$S(p_1,p_2) = w_g \, s_g(p_1^{g}, p_2^{g}) + w_{tg} \, S_H(p_1^{tg}, p_2^{tg}) + w_{tx} \, s_{tx}(p_1, p_2) + w_{df} \, s_{df}(p_1, p_2)$
• Network Proximity: $S_N(u,v)$
User Similarity
25
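To make the blend of content and network proximity concrete, here is a minimal sketch. The post similarity is replaced by a simple word-overlap score and the network proximity is passed in as a number; both are simplifications made for this example, not the measures used in the talk.

```python
def word_overlap(p, q):
    """Stand-in post similarity: Jaccard overlap of lowercase word sets."""
    a, b = set(p.lower().split()), set(q.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def content_proximity(posts_u, posts_v, post_sim):
    """Average over u's posts of the best-matching post of v (asymmetric)."""
    if not posts_u or not posts_v:
        return 0.0
    return sum(max(post_sim(p, q) for q in posts_v) for p in posts_u) / len(posts_u)

def user_similarity(content, network, lam):
    """lam = 1 uses content only, lam = 0 uses network proximity only."""
    return lam * content + (1.0 - lam) * network

posts_ann = ["deploy the new build", "lunch anyone"]
posts_bob = ["deploy the new build today"]

s_c = content_proximity(posts_ann, posts_bob, word_overlap)
print(s_c, user_similarity(s_c, network=0.25, lam=0.8))
```

Swapping the arguments of `content_proximity` gives a different score, since the average is taken over the first user's posts only.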
• First construct the augmented communication graph G(V,E)
• Given a user u,
compute users similarity
− For all posts of user u with respect to all other users in the network
For all facets
Communication Intention Prediction
26
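The prediction step outlined on this slide amounts to scoring every user that u has not yet messaged and returning the top of the ranking. A minimal sketch, assuming a precomputed `similarity(u, v)` function; the scores below are made up for illustration.

```python
def predict_recipients(u, users, known_edges, similarity, k):
    """Rank users that u has not messaged yet by similarity(u, v), highest first."""
    candidates = [v for v in users if v != u and (u, v) not in known_edges]
    ranked = sorted(candidates, key=lambda v: similarity(u, v), reverse=True)
    return ranked[:k]

users = ["ann", "bob", "cat", "dan"]
known_edges = {("ann", "bob")}          # ann already messaged bob
scores = {("ann", "bob"): 0.9, ("ann", "cat"): 0.4, ("ann", "dan"): 0.7}

def similarity(u, v):
    # Hypothetical lookup; in practice this is the combined user similarity.
    return scores.get((u, v), 0.0)

print(predict_recipients("ann", users, known_edges, similarity, k=2))
```

Because the similarity is asymmetric, the ranking for (u, v) and for (v, u) must be computed separately.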
• Complete snapshot (June 2010 – August 2011) of a corporate micro-
blogging service, which resembles Twitter
4,213 unique users
16,438 messages in total
− 8,174 thread starters
− 8,264 replies
8,139 threads
88 discussion groups
637 unique #hashtags
Dataset
27
• In our evaluation we focus on the Largest Connected Component
582 users
3,773 directed edges
11,684 messages
Average degree = 12.97
• Clustering coefficient = 0.2311 >> ccrandom = 0.0223
• Clustering coefficient as a function of node degree
Average clustering coefficient decreases with increasing node degree
Higher for nodes of low degree: significant clustering among low-degree nodes
Dataset
28
[Figure: clustering coefficient versus number of neighbors]
• Directed messages received vs. directed messages sent
Scattered across the diagonal
Cumulative distribution of the out-degree to in-degree ratio exhibits
high correlation between in-degree and out-degree
Tendency of users to reply back when they receive a message from
other users?
29
• Four-fold cross validation
• Randomly sample 100 users & recommend top-k links for each user
• Accuracy measures
Precision@k: $\frac{1}{|S|} \sum_{p \in S} \frac{N_k(p)}{k}$
Recall@k: $\frac{1}{|S|} \sum_{p \in S} \frac{|F_p \cap R_p|}{|F_p|}$
MRR: $\frac{1}{|S|} \sum_{p \in S} \frac{1}{rank_p}$
• Baselines
Random
− Random selection
Shared Vocabulary
− Cosine similarity based on #hashtags vector
Shortest distance
− Length of the shortest path
Common neighbors
− $sim(u,v) = |\Gamma(u) \cap \Gamma(v)|$
Evaluation
30
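The three accuracy measures can be computed per user and then averaged over the sampled users. The sketch below uses the standard definitions of Precision@k, Recall@k, and reciprocal rank; the predictions and ground truth are toy data.

```python
def hits(ranked, relevant, k):
    return sum(1 for v in ranked[:k] if v in relevant)

def precision_at_k(ranked, relevant, k):
    return hits(ranked, relevant, k) / k

def recall_at_k(ranked, relevant, k):
    return hits(ranked, relevant, k) / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / position of the first correct prediction, or 0 if there is none."""
    for position, v in enumerate(ranked, 1):
        if v in relevant:
            return 1.0 / position
    return 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Toy ranked predictions and true recipients for two sampled users.
predictions = {"ann": ["bob", "cat", "dan"], "eve": ["dan", "ann", "bob"]}
truth = {"ann": {"cat", "eve"}, "eve": {"dan"}}

p_at_2 = mean(precision_at_k(predictions[u], truth[u], 2) for u in predictions)
r_at_2 = mean(recall_at_k(predictions[u], truth[u], 2) for u in predictions)
mrr = mean(reciprocal_rank(predictions[u], truth[u]) for u in predictions)
print(p_at_2, r_at_2, mrr)
```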
Lexical and Topical Alignment
• Is there a global vocabulary in the corporate microblogging service?
Hashtags vocabulary
“Groups vocabulary”
• Select user pairs at random and measure number of shared tags
Average nst = 1.001
Most common case is the absence of shared tags
• However adjacent users in social networks tend to share common
interests due to homophily
We measure user homophily with respect to hashtags as a function of
the distance of users in the network
• Select user pairs at random and measure number of shared groups
Average nsg = 1
Most common case is the absence of shared groups
31
Lexical Alignment
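• Average number of shared (distinct) hashtags for two users as a
function of their distance d along the network, alongside the cosine similarity of their hashtag frequencies:
$\sigma_{tags}(u,v) = \frac{\sum_t f_u(t) \, f_v(t)}{\sqrt{\sum_t f_u(t)^2} \, \sqrt{\sum_t f_v(t)^2}}$
[Formula for the number of shared hashtags not recoverable]
• Shared hashtags vocabulary up to distance 6!
32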
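Both quantities on this slide, the number of shared distinct hashtags and the cosine similarity of hashtag frequency vectors, are easy to compute from each user's list of used hashtags. A minimal sketch with made-up hashtag lists:

```python
import math
from collections import Counter

def shared_hashtags(tags_u, tags_v):
    """Number of distinct hashtags used by both users."""
    return len(set(tags_u) & set(tags_v))

def hashtag_cosine(tags_u, tags_v):
    """Cosine similarity between the two users' hashtag frequency vectors."""
    f_u, f_v = Counter(tags_u), Counter(tags_v)
    dot = sum(f_u[t] * f_v[t] for t in f_u.keys() & f_v.keys())
    norm = (math.sqrt(sum(c * c for c in f_u.values()))
            * math.sqrt(sum(c * c for c in f_v.values())))
    return dot / norm if norm else 0.0

ann = ["#release", "#bug", "#release"]
bob = ["#release", "#lunch"]
print(shared_hashtags(ann, bob), hashtag_cosine(ann, bob))
```

Averaging either value over user pairs grouped by shortest-path distance gives the kind of curve the slide refers to.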
• Bold indicates best performing baseline
• Percentage lift
the % improvement achieved over the best performing baseline
Methods Comparison
33
• How to choose the best values of λ and the weighting factors?
• Different datasets may lead to different optimal values
Grid search over ranges of values for these parameters
Measure accuracy on the validation set for each configuration setting
Weight Scheme Selection
34
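A grid search of this kind can be sketched as an exhaustive loop over candidate values of λ and candidate weight schemes, keeping the configuration with the best validation score. The `evaluate` function below is a toy stand-in with an arbitrary optimum; in practice it would run the predictor on the validation set and return an accuracy measure such as MRR.

```python
from itertools import product

def grid_search(lambdas, weight_schemes, evaluate):
    """Return the (lambda, weights) pair with the highest validation score."""
    best_score, best_config = float("-inf"), None
    for lam, weights in product(lambdas, weight_schemes):
        score = evaluate(lam, weights)
        if score > best_score:
            best_score, best_config = score, (lam, weights)
    return best_config, best_score

lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
weight_schemes = [(1.0, 0.0), (0.5, 0.5)]   # hypothetical feature weights

def evaluate(lam, weights):
    # Toy score, highest at lam = 0.75 with equal weights; not a real result.
    return -abs(lam - 0.75) - abs(weights[0] - 0.5)

best_config, best_score = grid_search(lambdas, weight_schemes, evaluate)
print(best_config, best_score)
```

The cost grows with the product of the grid sizes, which is one reason a coarse grid is usually tried first.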
• λ = 0 considers only network proximity
• λ = 1 considers only content similarity
• All schemes perform better than the baseline
• Good value for λ is approximately 0.8
Effect of Parameter λ
35
• Effect of weighting schemes on accuracy per user
• Different weighting schemes perform better for different users
Feature importance is user-specific
• Need personalization to achieve better accuracy overall
Effect of Weighting Scheme
36
• Average precision (measured at 5) of users having k
(a) posts or
(b) neighbors in the communication network
The more statistical evidence the better the overall precision
Content Availability and Structural Proximity
37
• MRR as a function of λ for various restrictions
• Greater statistical evidence results in more accurate predictions
Content Availability and Structural Proximity
38
• Performed modeling and analysis of informal communication at the
workplace
• We introduced the problem of communication intention prediction
• We addressed this problem by exploiting auxiliary information
Holistic modeling of structural clues and semantically enriched
content
• We tested the effectiveness of our approach on a real-world dataset
The more statistical evidence available, the more accurate the predictions
Need for personalization
• Potential applications
Contextual expert recommendation for Q&A
Search for “interesting” people to collaborate
• Open problems
Scalability
Replication of results for online social media
Conclusion and Open Problems
39
• Semantic Social Network Analysis for the Enterprise
Contextual Recommendation
40
• Semantic Social Network Analysis for the Enterprise
Instantiate our modeling in Ontology
Collaboration analytics at the workplace
Real-world data evaluation
Contextual Recommendation
41
Contextual ego-network analysis
Expert Identification
Semantic Analysis
• Questions?
• Resources
http://www-scf.usc.edu/~chelmis/index.php
http://pgroup.usc.edu/wiki/CSS
• Please send all inquiries to [email protected]
Thank you!
42