predicting communication intention in social media

Predicting Communication Intention in (Enterprise) Social Networks

Charalampos “Harris” Chelmis

Computer Science, University of Southern California

Thanks to: Viktor K. Prasanna, Ming Hsieh Department of Electrical Engineering, USC

Vikram Sorathia, Co-founder & CEO at Kensemble Tech Labs LLP


Social Networks are Everywhere

2


• Movie Networks

• Affiliation/co-authorship networks

• Professional networks

• Friendship networks

• Information networks

• Organizational Networks

• Q&A websites

• Even networks

• Multiple applications

Targeted marketing

Personalization

− Content delivery

Recommendation

− People to connect, items to buy, movies to watch

Law enforcement

− Fraud detection

− Guilt by association

Epidemiology

Information dissemination/propagation

…

• Users interact with one another and with the content they create and consume

Rich interactions

− Friendships based on similarity

− Following based on interest

Noisy

Social Network Analysis

3

• Collaboration Enabling Technologies

Multiple communication channels

Spread of timely and relevant information

Search for data and experts

Collaboration Technologies at the Workplace

4

• Main focus on business perspective

Less noisy than online social networks

Q&A

Problem solving

Information seeking

• But also

Assist in breaking barriers

Team building

Knowledge propagation

• Opportunities

Expert identification

− Experts vs. Influencers

Information Flow

Trends

− Technology adoption

− Company focus

Collaboration at the Workplace

5

• More Opportunities

Collective Knowledge

− Generation

− Sharing

Collaborative Knowledge Management

− How do employees work together to complete tasks?

− How does innovation happen?

− Best practices

• Difficulties

Informal interactions

Heterogeneous, unstructured data

How to formally model knowledge?

Collaboration at the Workplace

6

• Descriptive Modeling

Social network analysis

• Predictive modeling

Link prediction

Attribute prediction

• Typically networked data are represented as graphs

Nodes (e.g., users)

Edges

− Social relations

− Interactions

− Information flow

− Similarity

Weight

− Communication frequency

− Communication cost (e.g., distance)

− Reciprocity

− Type of interaction (e.g., family member, friend, or officemate)
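The graph representation above can be sketched in a few lines. This is only an illustration, assuming messages arrive as hypothetical `(sender, recipient)` pairs; edge weight is taken to be communication frequency, and reciprocity is checked as the presence of messages in both directions.

```python
from collections import Counter

def edge_weights(messages):
    """Weight each directed edge (u, v) by the number of messages u sent to v."""
    return Counter((s, r) for s, r in messages if s != r)

def is_reciprocal(weights, u, v):
    """True when messages were exchanged in both directions between u and v."""
    return weights[(u, v)] > 0 and weights[(v, u)] > 0

# Hypothetical message log, for illustration only.
messages = [("ann", "bob"), ("ann", "bob"), ("bob", "ann"), ("bob", "cat")]
weights = edge_weights(messages)
print(weights[("ann", "bob")], is_reciprocal(weights, "ann", "bob"))
```

Other edge attributes listed above (cost, type of interaction) could be stored the same way, keyed by the `(u, v)` pair.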

Networked Data Modeling

7

• Heterogeneous object and link types

• Both nodes and edges may carry attributes

• Attribute dependencies

Correlation between attribute values and link structure

− e.g. link prediction based on auxiliary information

Correlation among attributes of related nodes

− e.g. collaborative filtering

• Node dependencies

e.g. groups/communities

• Partial observations

e.g. labels

But Networked Data are Very Different than Graphs

8

• Big Data

Billions of users

Billions of connections

Billions of “documents”

• Temporality

Affiliations

Interests

Friendships

• Context

Spatial

Temporal

Topical

• Content multimodality

Text

Multimedia

Networked Data ≠ Graphs

9

• Edges are more than links

Type

− e.g. like vs. comment vs. share

Trust

Sentiment

Strength

Time

Number

• Edges “reveal” something about the relation between nodes

Prior “interaction” to compute similarity


10


11

• Integrated informal communication

• Context sensitive

• Temporal

• External Sources

• Analysis of implicit relations

Holistic Modeling of Complex Networks

12

Multiple collaborative platforms

Multimodal, heterogeneous content

from various sources

Meta-information about content

- Social Algebraic Operations

- Complex mining and analysis

- Correlation of different domains

- Temporal, semantic analysis

[Figure labels: context, time, content, connection]

• Directed communication graph G = (V,E)

Node u represents a user

Edge e = (u,v) exists iff user u has sent at least one message to user v

• Input

G0 = (V0,E0), subgraph of G consisting of all nodes in G and a subset of edges in G

• Output

Ranked list L of edges not present in G0, i.e. $L \cap E_0 = \emptyset$

Predicting Intention of Communication

13

[Figure: input graph G0 and output graph G1 around a user u]
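The problem setup can be sketched as follows: build the directed communication graph from a message log, then hide a fraction of its edges to obtain the input G0 and a set of target edges to be ranked. This is a sketch under stated assumptions, not the speaker's code; the `(sender, recipient)` message format, the function names, and the random hold-out strategy are all hypothetical.

```python
import random
from collections import defaultdict

def build_communication_graph(messages):
    """Edge (u, v) exists iff u has sent at least one message to v."""
    graph = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            graph[sender].add(recipient)
    return graph

def hold_out_edges(graph, fraction, seed=0):
    """Split the edges of G into an observed graph G0 and a hidden target set."""
    edges = sorted((u, v) for u, nbrs in graph.items() for v in nbrs)
    rng = random.Random(seed)
    rng.shuffle(edges)
    n_hidden = int(len(edges) * fraction)
    hidden = set(edges[:n_hidden])
    observed = defaultdict(set)
    for u, v in edges[n_hidden:]:
        observed[u].add(v)
    return observed, hidden

# Hypothetical message log, for illustration only.
messages = [("ann", "bob"), ("ann", "bob"), ("bob", "ann"),
            ("bob", "cat"), ("cat", "dan")]
nodes = {name for pair in messages for name in pair}  # G0 keeps every node of G
graph = build_communication_graph(messages)
observed, hidden = hold_out_edges(graph, fraction=0.25)
```

A predictor would then rank candidate edges absent from `observed`, and the ranking would be scored against `hidden`.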

• Edge semantics:

Conversation between users rather than friendship


• “What makes people initiate conversations with strangers?”

• “With whom do individuals choose to collaborate and why?”

≠ Link Prediction

14

Contextual – Temporal Properties: $m_1(u_1,u_2,g_1) \neq m_1(u_1,u_2,g_2)$

Directionality Matters: $m_1(u_1,u_2,g_1) \neq m_1(u_2,u_1,g_1)$

• The tendency to relate to people with similar characteristics

status, beliefs, etc.

• Fundamental concept underlying social theories (e.g. Blau 1977)

• Fundamental basis for links in many types of social networks

“Similar” nodes tend to cluster together

• How does this help us solve our problem?

Homophily

15

• Machine learning

Probabilistic, supervised, computationally expensive

• Node attributes

No semantics

We instead exploit multiple features of variable types

• Network structure

How to Compute Similarity?

16

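Graph Distance: length of shortest path between u and v

Common Neighbors: $|\Gamma(u) \cap \Gamma(v)|$

Jaccard Coefficient: $\frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$

Adamic/Adar: $\sum_{z \in \Gamma(u) \cap \Gamma(v)} \frac{1}{\log |\Gamma(z)|}$

Preferential Attachment: $|\Gamma(u)| \cdot |\Gamma(v)|$

Katz: $\sum_{\ell=1}^{\infty} \beta^{\ell} \cdot |paths_{u,v}^{\langle \ell \rangle}|$

Random walks

($\Gamma(u)$ denotes the set of neighbors of u)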
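The neighborhood-based measures listed above can be computed directly from adjacency sets. The sketch below assumes an undirected graph stored as a dict mapping each node to its neighbor set; the toy graph and function names are illustrative only.

```python
import math

def common_neighbors(adj, u, v):
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    # Neighbors with a single link are skipped because log(1) = 0.
    return sum(1.0 / math.log(len(adj[z]))
               for z in adj[u] & adj[v] if len(adj[z]) > 1)

def preferential_attachment(adj, u, v):
    return len(adj[u]) * len(adj[v])

# Toy undirected graph with edges a-c, a-d, b-c, b-d, b-e, d-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
print(common_neighbors(adj, "a", "b"), jaccard(adj, "a", "b"))
```

Katz and random-walk scores need path counting or matrix operations and are left out of this sketch.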

• If there is a tie between x and y and one between y and z, then in

a transitive network x and z will also be connected

• Such structural clues have been traditionally used for link

prediction

• Consider what happens if edge semantics change

• Or if we further include context

Transitivity

17

[Figure: two triads over nodes x, y, z; the second is labeled "z asks ?"]

Communication Network

18

Threaded Discussion

Bipartite Graph

Post-Reply Network

Augmented, Directed Post-Reply Network

19

• We model a user as the union of her connections and her content

• We characterize microblogs using a set of attributes

each feature according to its type

Textual Features

− raw textual content (bag-of-words)

− #hashtags

− Groups

Temporal Features

− Date

− Time

• WordNet: enrich concepts with conceptually, semantically and

lexically related terms

Synonyms

Hypernyms

Hyponyms

User Representation

20

• Semantic Similarity of textual concepts

Jaccard Index: $s(a,b) = s(S_a, S_b) = \frac{|S_a \cap S_b|}{|S_a \cup S_b|}$

Synonym-based similarity: $s_s(a,b) = s_s(S_a, S_b)$

Hypernym-based similarity: $s_h(a,b) = s_h(S_a, S_b)$

Hyponym-based similarity: $s_{hp}(a,b) = s_{hp}(S_a, S_b)$

• Calculate Semantic Similarity using weighted sum

Semantic Similarity of Textual Features

21

• Caveat: concepts belong to the same subtree

Solution: compute similarity between the union of annotations

• Account for lexical similarity: Levenshtein similarity

• Select the highest similarity, either semantic or lexical

$s_{tg}(a,b) = \max \{ LevenshteinSimilarity(a,b),\; w_s s_s(a,b) + w_h s_h(a,b) + w_{hp} s_{hp}(a,b),\; s(S_a \cup H_a \cup Hp_a,\, S_b \cup H_b \cup Hp_b) \}$

Semantic Similarity of Textual Features

22
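A minimal sketch of the tag similarity described on the two slides above: the highest of a lexical score, a weighted sum of synonym, hypernym and hyponym Jaccard scores, and the Jaccard score over the union of annotations. Several parts are assumptions made for illustration: the hand-written `lexicon` stands in for WordNet lookups, the weights are arbitrary, and Levenshtein similarity is normalized as one minus the edit distance divided by the longer string's length.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def levenshtein_similarity(a, b):
    # Assumed normalization: 1 - distance / length of the longer string.
    longest = max(len(a), len(b))
    return 1.0 - levenshtein_distance(a, b) / longest if longest else 1.0

def tag_similarity(a, b, lexicon, weights=(0.5, 0.25, 0.25)):
    """Highest of the lexical, weighted semantic, and union-of-annotations scores."""
    empty = {"syn": set(), "hyper": set(), "hypo": set()}
    ea, eb = lexicon.get(a, empty), lexicon.get(b, empty)
    w_s, w_h, w_hp = weights
    weighted = (w_s * jaccard(ea["syn"], eb["syn"])
                + w_h * jaccard(ea["hyper"], eb["hyper"])
                + w_hp * jaccard(ea["hypo"], eb["hypo"]))
    union_a = ea["syn"] | ea["hyper"] | ea["hypo"]
    union_b = eb["syn"] | eb["hyper"] | eb["hypo"]
    return max(levenshtein_similarity(a, b), weighted, jaccard(union_a, union_b))

# Hypothetical annotations standing in for WordNet output.
lexicon = {
    "car": {"syn": {"car", "auto", "automobile"},
            "hyper": {"vehicle"}, "hypo": {"sedan"}},
    "automobile": {"syn": {"car", "auto", "automobile"},
                   "hyper": {"vehicle"}, "hypo": {"sedan", "coupe"}},
}
print(tag_similarity("car", "automobile", lexicon))
print(tag_similarity("hadoop", "hadop", lexicon))
```

The second call shows why the lexical fallback is useful: a misspelled tag has no annotations, so only the Levenshtein term contributes.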

• Textual Similarity between bag-of-words features:

tf.idf weight vector representation

Cosine similarity

• Date Similarity: $s_d(d_1, d_2) = \begin{cases} 0, & |d_1 - d_2| > T_d \\ 1 - \frac{|d_1 - d_2|}{T_d}, & \text{otherwise} \end{cases}$

• Time Similarity: $s_t(t_1, t_2) = \begin{cases} 0, & |t_1 - t_2| > T_t \\ 1 - \frac{|t_1 - t_2|}{T_t}, & \text{otherwise} \end{cases}$

• Timestamp similarity: $s_{df}(x, y) = w_d s_d(x_d, y_d) + w_t s_t(x_t, y_t)$

Feature Similarity

23
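The temporal similarities can be illustrated with a thresholded linear decay: similarity falls from 1 to 0 as the gap grows to the threshold, and is 0 beyond it. The thresholds (30 days, 720 minutes), the equal weights, and the choice of comparing time of day in minutes since midnight are assumptions for this sketch, not values from the talk.

```python
from datetime import datetime

def decay_similarity(x, y, threshold):
    """1 - |x - y| / T when the gap is within the threshold T, otherwise 0."""
    gap = abs(x - y)
    return 0.0 if gap > threshold else 1.0 - gap / threshold

def timestamp_similarity(ts1, ts2, w_d=0.5, w_t=0.5, t_days=30.0, t_minutes=720.0):
    """Weighted sum of date similarity and time-of-day similarity."""
    s_d = decay_similarity(ts1.toordinal(), ts2.toordinal(), t_days)
    minutes1 = ts1.hour * 60 + ts1.minute
    minutes2 = ts2.hour * 60 + ts2.minute
    s_t = decay_similarity(minutes1, minutes2, t_minutes)
    return w_d * s_d + w_t * s_t

a = datetime(2011, 3, 1, 9, 0)
b = datetime(2011, 3, 16, 12, 0)
print(timestamp_similarity(a, b))
```

Here the dates are 15 days apart and the times of day 180 minutes apart, so the two components are 0.5 and 0.75 before weighting.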

• We use a variation of Hausdorff point set distance measure:

$S_H(A, B) = \frac{1}{|A|} \sum_{i=1}^{|A|} \max_{k} sim(a_i, b_k)$

Average of the maximum similarity of features in set A with respect to features in set B

$sim(a_i, b_k)$: any similarity measure on set elements $a_i$ and $b_k$

Measure is asymmetric with respect to the sets

Feature Set Similarity

24
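The set similarity is short enough to write out, which also makes its asymmetry easy to see. In this sketch the element-level measure is a simple exact match, chosen only for illustration; any similarity function on pairs of elements could be passed instead.

```python
def set_similarity(A, B, sim):
    """Average, over elements of A, of the best match found in B."""
    if not A or not B:
        return 0.0
    return sum(max(sim(a, b) for b in B) for a in A) / len(A)

def exact_match(a, b):
    return 1.0 if a == b else 0.0

A = ["hadoop", "spark"]
B = ["spark", "hive", "pig", "storm"]
print(set_similarity(A, B, exact_match))  # averaged over the 2 elements of A
print(set_similarity(B, A, exact_match))  # averaged over the 4 elements of B
```

Swapping the arguments changes the denominator and the set being averaged over, so the two directions generally give different scores.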

• A weighted function of content and network proximity

$S(u,v) = \lambda S_C(u,v) + (1-\lambda) S_N(u,v)$

λ controls the tradeoff between content and network proximity

• Content Proximity

User similarity with respect to their microblogs: $S_C(u_1,u_2) = \frac{1}{|u_1|} \sum_{i=1}^{|u_1|} \max_{p_k \in u_2} S(p_i, p_k)$

Asymmetric with respect to users

Similarity of microblogs

− Combined weighted value of respective attributes similarities: $S(p_1,p_2) = w_g s_g(p_1^{g}, p_2^{g}) + w_{tg} S_H(p_1^{tg}, p_2^{tg}) + w_{tx} s_{tx}(p_1, p_2) + w_{df} s_{df}(p_1, p_2)$

• Network Proximity: $S_N(u,v)$ [formula not recoverable from the extracted text]

User Similarity

25
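The pieces of the user similarity fit together as sketched below: a post-level score is a weighted sum over facets, content proximity averages each post's best match among the other user's posts, and λ blends content and network proximity. The two facets, their weights, and the network proximity value passed in are hypothetical; the network term is taken as a given number here.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def post_similarity(p1, p2, facet_sims, weights):
    """Weighted sum of per-facet similarities between two posts."""
    return sum(weights[f] * facet_sims[f](p1[f], p2[f]) for f in weights)

def content_proximity(posts_u, posts_v, post_sim):
    """Average, over u's posts, of the best-matching post of v (asymmetric)."""
    if not posts_u or not posts_v:
        return 0.0
    return sum(max(post_sim(p, q) for q in posts_v) for p in posts_u) / len(posts_u)

def user_similarity(s_content, s_network, lam):
    """lam = 1 uses content only, lam = 0 uses network proximity only."""
    return lam * s_content + (1.0 - lam) * s_network

# Hypothetical facets and weights, for illustration only.
facet_sims = {"tags": jaccard, "group": lambda a, b: 1.0 if a == b else 0.0}
weights = {"tags": 0.5, "group": 0.5}
sim = lambda p, q: post_similarity(p, q, facet_sims, weights)

posts_u = [{"tags": {"hadoop", "spark"}, "group": "data"},
           {"tags": {"hr"}, "group": "people"}]
posts_v = [{"tags": {"spark"}, "group": "data"}]

s_c = content_proximity(posts_u, posts_v, sim)
print(s_c, content_proximity(posts_v, posts_u, sim))
print(user_similarity(s_c, s_network=0.2, lam=0.8))
```

The two printed content scores differ, which is the asymmetry noted on the slide.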

• First construct the augmented communication graph G(V,E)

• Given a user u,

compute users similarity

− For all posts of user u with respect to all other users in the network

For all facets

Communication Intention Prediction

26

• Complete snapshot (June 2010 – August 2011) of a corporate microblogging service, which resembles Twitter

4,213 unique users

16,438 messages in total

− 8,174 thread starters

− 8,264 replies

8,139 threads

88 discussion groups

637 unique #hashtags

Dataset

27

• In our evaluation we focus on the Largest Connected Component

582 users

3,773 directed edges

11,684 messages

Average degree = 12.97

• Clustering coefficient = 0.2311 >> ccrandom = 0.0223

• Clustering coefficient as a function of node degree

Average clustering coefficient decreases with increasing node degree

Higher for nodes of low degree: significant clustering among low-degree nodes

Dataset

28

[Figure: clustering coefficient vs. number of neighbors]

• Directed messages received vs. directed messages sent

Scattered across the diagonal

Cumulative distribution of the out-degree to in-degree ratio exhibits high correlation between in-degree and out-degree

Tendency of users to reply back when they receive a message from

other users?

29

• Four-fold cross validation

• Randomly sample 100 users & recommend top-k links for each user

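• Accuracy measures

Precision@k: $P@k = \frac{1}{|S|} \sum_{p \in S} \frac{N_k(p)}{k}$

Recall@k: $R@k = \frac{1}{|S|} \sum_{p \in S} \frac{|F_p \cap R_p|}{|F_p|}$

MRR: $MRR = \frac{1}{|S|} \sum_{p \in S} \frac{1}{rank_p}$

• Baselines

Random

− Random selection

Shared Vocabulary

− Cosine similarity based on #hashtags vector

Shortest distance

− Length of the shortest path

Common neighbors

− $sim(u,v) = |\Gamma(u) \cap \Gamma(v)|$

Evaluation

30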
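The three accuracy measures can be computed per user and then averaged over the sampled users. The sketch below uses standard definitions; the recommendation lists and ground-truth sets are made up for illustration, and a user with no correct recommendation contributes a reciprocal rank of 0.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are correct."""
    return sum(1 for v in recommended[:k] if v in relevant) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of the correct links that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for v in recommended[:k] if v in relevant) / len(relevant)

def reciprocal_rank(recommended, relevant):
    """1 / rank of the first correct recommendation, or 0 if there is none."""
    for rank, v in enumerate(recommended, start=1):
        if v in relevant:
            return 1.0 / rank
    return 0.0

def mean_over_users(metric, recs, truth, **kwargs):
    return sum(metric(recs[u], truth[u], **kwargs) for u in recs) / len(recs)

# Hypothetical ranked lists and held-out links for two sampled users.
recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
truth = {"u1": {"b", "z"}, "u2": {"d"}}

print(mean_over_users(precision_at_k, recs, truth, k=2))
print(mean_over_users(recall_at_k, recs, truth, k=2))
print(mean_over_users(reciprocal_rank, recs, truth))
```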

Lexical and Topical Alignment

• Is there a global vocabulary in the corporate microblogging service?

Hashtags vocabulary

“Groups vocabulary”

• Select user pairs at random and measure number of shared tags

Average nst = 1.001

Most common case is the absence of shared tags

• However adjacent users in social networks tend to share common

interests due to homophily

We measure user homophily with respect to hashtags as a function of

the distance of users in the network

• Select user pairs at random and measure number of shared groups

Average nsg = 1

Most common case is the absence of shared groups

31

Lexical Alignment

• Average number of shared (distinct) hashtags for two users as a function of their distance d along the network:

$\sigma_{tags}(u,v) = \frac{\sum_{t} f_u(t) f_v(t)}{\sqrt{\sum_{t} f_u(t)^2} \sqrt{\sum_{t} f_v(t)^2}}$

[Second formula, the shared-hashtag count, not recoverable from the extracted text]

• Shared hashtags vocabulary up to distance 6!

32
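Lexical alignment between two users can be illustrated with hashtag usage counts: a cosine similarity over the count vectors, and a simple count of distinct shared hashtags. The hashtag lists are hypothetical, and treating each user's hashtags as a frequency vector is an assumption of this sketch.

```python
import math
from collections import Counter

def hashtag_cosine(tags_u, tags_v):
    """Cosine similarity between two users' hashtag frequency vectors."""
    fu, fv = Counter(tags_u), Counter(tags_v)
    dot = sum(fu[t] * fv[t] for t in fu.keys() & fv.keys())
    norm = (math.sqrt(sum(c * c for c in fu.values()))
            * math.sqrt(sum(c * c for c in fv.values())))
    return dot / norm if norm else 0.0

def shared_distinct_tags(tags_u, tags_v):
    """Number of distinct hashtags used by both users."""
    return len(set(tags_u) & set(tags_v))

# Hypothetical hashtag usage, for illustration only.
tags_u = ["#cloud", "#cloud", "#hpc"]
tags_v = ["#cloud", "#sales"]
print(hashtag_cosine(tags_u, tags_v), shared_distinct_tags(tags_u, tags_v))
```

Averaging either quantity over user pairs grouped by shortest-path distance would give a curve of the kind the slide describes.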

• Bold indicates best performing baseline

• Percentage lift

the % improvement achieved over the best performing baseline

Methods Comparison

33

• How to choose best values of λ and weighting factors?

• Different datasets may lead to different optimal values

Grid search over ranges of values for these parameters

Measure accuracy on the validation set for each configuration setting
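A grid search of this kind can be sketched as below. The `evaluate` callback is assumed to return an accuracy score measured on a validation set for a given configuration; the toy scoring function here is a made-up stand-in whose peak is placed arbitrarily, so its optimum says nothing about the real data.

```python
from itertools import product

def grid_search(evaluate, lambdas, weight_grid):
    """Try every (lambda, weights) pair and keep the best-scoring one."""
    best_score, best_config = float("-inf"), None
    for lam, weights in product(lambdas, weight_grid):
        score = evaluate(lam, weights)
        if score > best_score:
            best_score, best_config = score, (lam, weights)
    return best_config, best_score

def toy_validation_score(lam, weights):
    """Hypothetical stand-in for validation accuracy; peak chosen arbitrarily."""
    w_tags, _w_text = weights
    return 1.0 - (lam - 0.6) ** 2 - (w_tags - 0.5) ** 2

lambdas = [i / 10 for i in range(11)]                  # 0.0, 0.1, ..., 1.0
weight_grid = [(w / 4, 1 - w / 4) for w in range(5)]   # two weights summing to 1
best_config, best_score = grid_search(toy_validation_score, lambdas, weight_grid)
print(best_config, best_score)
```

In practice `evaluate` would run the ranking with the given parameters and return a measure such as MRR on held-out edges.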

Weight Scheme Selection

34

• λ = 0 only considers network proximity

• λ = 1 only considers content similarity

• All schemes perform better than the baseline

• Good value for λ is approximately 0.8

Effect of Parameter λ

35

• Effect of weighting schemes on accuracy per user

• Different weighting schemes perform better for different users

Features importance is user specific

• Need personalization to achieve better accuracy overall

Effect of Weighting Scheme

36

• Average precision (Precision@5) of users having k

(a) posts or

(b) neighbors in the communication network

The more statistical evidence the better the overall precision

Content Availability and Structural Proximity

37

• MRR as a function of λ for various restrictions

• Greater statistical evidence results in more accurate predictions

Content Availability and Structural Proximity

38

• Performed modeling and analysis of informal communication at the

workplace

• We introduced the problem of communication intention prediction

• We addressed this problem by exploiting auxiliary information

Holistic modeling of structural clues and semantically enriched

content

• We tested the effectiveness of our approach on a real-world dataset

The more statistical evidence available, the more accurate the predictions

Need for personalization

• Potential applications

Contextual expert recommendation for Q&A

Search for “interesting” people to collaborate with

• Open problems

Scalability

Replication of results for online social media

Conclusion and Open Problems

39

• Semantic Social Network Analysis for the Enterprise

Contextual Recommendation

40


• Semantic Social Network Analysis for the Enterprise

Instantiate our modeling in Ontology

Collaboration analytics at the workplace

Real-world data evaluation

Contextual Recommendation

41

Contextual ego-network analysis

Expert Identification

Semantic Analysis

• Questions?

• Resources

http://www-scf.usc.edu/~chelmis/index.php

http://pgroup.usc.edu/wiki/CSS

• Please send all inquiries to [email protected]

Thank you!

42



