
Boleslaw Szymanski Social Cognitive Networks

Academic Research Center at Rensselaer

Adversary Social Networks: Community Formation, Detection and Consensus Engineering

http://www.cs.rpi.edu/~szymansk

Goals: Discovery and Analysis of Social Communities

Statistical processing of social interaction data:

• Vast
• Stochastic
• Dynamic

Find communities, especially hidden communities.


Application: Finding and Tracking Groups Hidden in Time and Space

Communications supporting IED planning have patterns and are correlated. Analysis of the patterns can reveal the groups as well as their internal group structure.

Why Statistical Analysis

• Agents often do not know/declare their communities.

• When they do, the declaration is largely false:
– Blogs (declared friend-links and communities)
• Declared friend-links predict communications with 0.04 F1-measure (small)
• Declared communities predict communications with 0.0002 F1-measure (even smaller)
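The F1-measure quoted above can be illustrated with a minimal sketch. It assumes that declared links and observed communications are both given as sets of undirected pairs; the function name `f1_score` and the toy data are hypothetical and are not taken from the blog study.

```python
def f1_score(predicted, actual):
    """F1-measure of a predicted edge set against an observed edge set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: undirected pairs stored as frozensets.
declared = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "d")]}
observed = {frozenset(p) for p in [("a", "b"), ("c", "d"), ("d", "e"), ("b", "e")]}

# One declared link out of three is observed, and one observed link out of four was declared.
score = f1_score(declared, observed)
```

A value near 0, as reported on the slide, means the declared links are nearly useless as predictors of who actually communicates.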


Data: Large Dynamic Social Networks

Examples:

• LiveJournal blog network
– Very large, dynamic social network
• Twitter
• Email, chatrooms
• Data repository for large-scale dynamic social networks

System Components

[Figure: system components. SIGHTS: communication graph on nodes A to I. RDM-CSA: patterns found in message text at Level 0, Level 1 and Level 2 (e.g., pattern id 2 = “buy,”, pattern id 3 = “2trade”). Leader hierarchy: higher ranked leaders, group leader, subgroup leaders, members.]

Communications

Time: January 12, 2010, 09:35
From: [email protected]
To: [email protected]
Subject: Hello
Message: Where have you been?

[16:06:31] <FreeTrade> Republicans were the worst pacifists before ww1 and ww2
[16:06:43] <SweetLeaf> France Fries
[16:06:50] <FreeTrade> As a generality, of course their were Republican Hawks.
[16:07:13] <FreeTrade> Sweet, good pun but bad story!
[16:07:18] <SweetLeaf> yup
[16:07:23] <Lupine> anyways, he's perpetually tormented by presidential actions
[16:07:25] <SweetLeaf> it aint good for no one
[16:07:47] <SweetLeaf> I think they knew it was commiing
[16:07:51] <FreeTrade> Rossevelt met monthly in New York with mostly trusted Republicans to talk about how to get america into the war.
[16:08:10] <FreeTrade> and he spent 2 year with Churchill meeting him sometimes secretly in the ocean to discuss the same topic.
[16:08:22] <FreeTrade> Exchanging a lot of letters.
[16:08:25] <FreeTrade> telegrams
[16:08:28] <Lupine> There really is nothing like a shorn scrotum. It's breathtaking, I suggest you try it.
[16:08:55] <FreeTrade> Well they didnt literally meet in the ocean, they were on ships.

Streaming Example

Time   From     To         Message
10:00  Alice    Charlie    Golf tomorrow? Tell everyone.
10:05  Charlie  Felix      Alice mentioned golf tomorrow.
10:06  Alice    Bob        Hey, golf tomorrow. Spread the word.
10:12  Alice    Bob        Tee off: 8am at Pinehurst.
10:13  Felix    Grace      Hey guys, golf tomorrow.
10:13  Felix    Harry      Hey guys, golf tomorrow.
10:15  Alice    Charlie    Pinehurst Tee time: 8am.
10:20  Bob      Elizabeth  We’re playing golf tomorrow.
10:20  Bob      Dave       We’re playing golf tomorrow.
10:22  Charlie  Felix      Tee time 8am at Pinehurst
10:25  Bob      Elizabeth  We tee off 8am at Pinehurst.
10:25  Bob      Dave       We tee off 8am at Pinehurst.
10:31  Felix    Grace      Tee time 8am, Pinehurst.
10:31  Felix    Harry      Tee time 8am, Pinehurst.

Streaming Example

Time   From     To
10:00  Alice    Charlie
10:05  Charlie  Felix
10:06  Alice    Bob
10:12  Alice    Bob
10:13  Felix    Grace
10:13  Felix    Harry
10:15  Alice    Charlie
10:20  Bob      Elizabeth
10:20  Bob      Dave
10:22  Charlie  Felix
10:25  Bob      Elizabeth
10:25  Bob      Dave
10:31  Felix    Grace
10:31  Felix    Harry

[Figure: the resulting graph on nodes A, B, C, D, E, F, G, H.]
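The step from the message stream to a graph can be sketched as follows. This is only an illustration: it assumes message content is ignored and that each (sender, receiver) pair becomes a directed edge weighted by message count. The function name `communication_graph` is hypothetical, and the slides do not say how SIGHTS builds its graph internally.

```python
from collections import Counter

# (time, sender, receiver) rows copied from the streaming example.
stream = [
    ("10:00", "Alice", "Charlie"), ("10:05", "Charlie", "Felix"),
    ("10:06", "Alice", "Bob"), ("10:12", "Alice", "Bob"),
    ("10:13", "Felix", "Grace"), ("10:13", "Felix", "Harry"),
    ("10:15", "Alice", "Charlie"), ("10:20", "Bob", "Elizabeth"),
    ("10:20", "Bob", "Dave"), ("10:22", "Charlie", "Felix"),
    ("10:25", "Bob", "Elizabeth"), ("10:25", "Bob", "Dave"),
    ("10:31", "Felix", "Grace"), ("10:31", "Felix", "Harry"),
]


def communication_graph(messages):
    """Count messages per directed (sender, receiver) pair."""
    weights = Counter()
    for _time, sender, receiver in messages:
        weights[(sender, receiver)] += 1
    return weights


graph = communication_graph(stream)
```

In this toy stream every sender forwards each announcement to the same receivers, so the tree-like relay structure (Alice to Bob and Charlie, and so on) is recoverable from headers alone.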


Communication graph

What are the social groups/coalitions?

Social groups are overlapping clusters

• Clusters may overlap.


Social groups are overlapping, introverted clusters

• Clusters may overlap
– A cluster is a locally defined object in which group members are more introverted than extroverted

[Figure: example node sets labeled YES and NO.]

Social groups are persistent overlapping, introverted clusters

• Clusters may overlap
• A cluster is a locally defined object in which group members are more introverted than extroverted.
• Social groups (clusters) persist
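One possible reading of "more introverted than extroverted" can be sketched in code. The assumption here, which the slides do not state, is that every member must have strictly more neighbors inside the cluster than outside it. The function name `is_introverted` and the toy graph are hypothetical.

```python
def is_introverted(cluster, adjacency):
    """Return True if every member has more neighbors inside the cluster than outside."""
    cluster = set(cluster)
    for node in cluster:
        neighbors = adjacency.get(node, set())
        inside = len(neighbors & cluster)
        outside = len(neighbors - cluster)
        if inside <= outside:
            return False
    return True


# Hypothetical undirected toy graph: two triangles joined by the edge C-D.
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E", "F"},
    "E": {"D", "F"},
    "F": {"D", "E"},
}

left = is_introverted({"A", "B", "C"}, adjacency)
right = is_introverted({"D", "E", "F"}, adjacency)
bridge = is_introverted({"C", "D"}, adjacency)
```

Because the test is local to each candidate set, nothing prevents two accepted clusters from sharing members, which is consistent with the overlapping clusters described above.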


Examples

GROUND TRUTH

• Group A
– Dog
– Vulture
– Camel
– Yassir Hussein
– Bird
– (6 others)

• Group B
– Ahmet
– Saleh Sarwuk
– Shaid
– Pavlammed Pavlah
– Osan Domenik

Citeseer

Two clusters: Electric circuit design; Optimization of Neural Networks. Intersection: “Sensitivity analysis in degenerate quadratic programming”

ENRON

Ali Baba Data Set (DoD)

SIGHTS:
• Group A: Dog, Vulture, Camel, Gopher
• Group B: Ahmet, Saleh Sarwuk, Shaid, Dajik

SIGHTS: Statistical Identification of Groups Hidden in Time and Space

- System for statistical analysis of social coalitions in communication networks

• Data sources: blogs, Twitter, emails (Enron), chatroom, synthetic data
• Coalition discovery: overlapping clustering, streaming groups, persistent groups
• Coalition analysis: leaders, opposing groups, topic matching
• Visualizations: size-density plots, static coalitions, dynamic coalitions

Groups matching analyst topic in red

Size vs. Density Plot

Visualization options

Choose time window

Group members

Different analyses on dataset

Leader index

Spreading of information: Russian Blog: Kosovo, week 40, 2007

Twitter and Trust

• Users: 1,894,215
• Messages per week: 3,750,000

Behavioral Trust based on Communications

• Modeling trust:
– Define and collect conversations
– Define trust based on conversations
  • Stability, frequency, depth
  • Reciprocity
  • Conversation propagation

• Model validation (collaborations):
– David Lazer (Harvard)
– Brian Uzzi (Notre Dame)

[Figure: number of strongly “trusting” components (vertical axis, 0 to 2.5 × 10^4) versus size (horizontal axis, 0 to 60).]

Collective dynamics on the network

Examples:
• Internet (packet traffic/flux)
• Load-balancing schemes (job allocation among processors)
• Electric power grid (voltage and phase fluctuations)
• High-performance or grid-computing networks (task completion landscapes in distributed simulations)
• Synchronization of coupled nonlinear oscillators (phase or frequency)
• Spread of epidemics in cities, human contact networks, or worldwide airline transportation networks
• Dissemination of culture and language in social networks
• The blogosphere’s evolution of discussion topics

Models for opinion/agreement dynamics

Models with binary or multiple “opinions” with no interaction constraints:
• kinetic Ising model (Glauber, ’63, …, Castellano et al. ’05)
• voter model (Krapivsky, ’92, … Ben-Naim, Redner)
• Naming Game (Baronchelli et al. ’05, Lu et al. ’06)

Models with many (discrete or continuous) opinions and with interaction thresholds or constraints: individuals too far apart on some scale of opinions (or cultural traits) will never interact or convince one another.
• continuous opinion dynamics (Deffuant et al. ’00, Stauffer et al. ’03, Ben-Naim ’05)
• dissemination of culture with a discrete number of cultural features and traits (Axelrod, 1997, Castellano et al. ’00, Klemm et al. ’03, Mazzitello et al. ’07, Candia ’08)

Models with interaction constraints on networks without community structure

Steady-state (equilibrium) phase diagram: there exists a critical threshold, above which global consensus is reached and prevails.

Small-world networks: the presence of random links substantially decreases this critical threshold, so the region of global consensus expands.

Scale-free networks: for finite networks, there is a system-size-dependent critical threshold, but $T_c(N) \to 0$ as $N \to \infty$. For larger and larger networks, the region of global consensus progressively dominates the steady-state phase diagram.

Language Games, Semiotic Dynamics

Artificial and autonomous software agents or robots bootstrapping a shared lexicon without human intervention (Steels, 1995)

Collaborative tagging: human web users spontaneously create loose categorization schemes (“folksonomy”). See, e.g., del.icio.us and www.flickr.com (Golder & Huberman, ’05; Cattuto et al., ’05, ’07)

The “Talking Heads” Experiment (the Guessing Game)

“The artificial agents start with no prior human-supplied set of categories nor lexicon. A shared ontology and lexicon must emerge from scratch in a self-organized process.”

(L. Steels, 1995, …)

http://talking-heads.csl.sony.fr/


Naming Game (for a single object)

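[Figure: two example interactions between a “speaker” and a “hearer”, each shown before and after.
“failure”: the speaker holds “blah” and “kefe”, the hearer holds “okoeta”; afterwards the speaker is unchanged and the hearer holds “okoeta” and “blah”.
“success”: the speaker holds “blah” and “kefe”, the hearer holds “okoeta” and “blah”; afterwards both hold only “blah”.]

Baronchelli, Felici, Loreto, Caglioti, and Steels (2005)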
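The failure and success rules illustrated on this slide can be sketched as a small simulation. This is a minimal sketch assuming pairwise interactions on a complete graph, with words represented by integers; the function name `naming_game`, its parameters and the consensus test are illustrative choices, not the code used for the results in these slides.

```python
import random


def naming_game(num_agents, seed=0, max_steps=1_000_000):
    """Run a pairwise Naming Game on a complete graph until consensus.

    Returns (number of interactions, final inventories), or (None, inventories)
    if max_steps is reached first.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(num_agents)]
    next_word = 0  # words are just integers in this sketch
    for step in range(1, max_steps + 1):
        speaker, hearer = rng.sample(range(num_agents), 2)
        if not inventories[speaker]:
            # A speaker with an empty inventory invents a new word.
            inventories[speaker].add(next_word)
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Success: both agents keep only the agreed word.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # Failure: the hearer learns the word.
            inventories[hearer].add(word)
        first = inventories[0]
        if len(first) == 1 and all(inv == first for inv in inventories):
            return step, inventories
    return None, inventories


steps, final = naming_game(20, seed=1)
```

Tracking `sum(len(inv) for inv in inventories)` and the number of distinct words during the run would give the two observables (total and different words) used later in the deck.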

Language Games in Sensor Networks

Assumptions

Mobile or static sensor nodes deployed in large spatial regions
• environment is unknown, possibly hostile
• tasks are unforeseeable
• sensor nodes have no pre-constructed vocabularies

Must autonomously develop common shared vocabularies/language at the exploration stage

Naming Game on Random Geometric Graphs (RGG)

[Figure: 2d RGG]

Random geometric networks (above the percolation threshold), spatial and random:
• nodes are connected if they fall within each other’s radio range
• communications: broadcast to local neighbors
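The connection rule for a random geometric graph can be sketched as below. The sketch assumes nodes placed uniformly at random in the unit square and a single shared radio range; the function name `random_geometric_graph` and its parameters are hypothetical. It does not check whether the chosen radius is above the percolation threshold.

```python
import math
import random


def random_geometric_graph(num_nodes, radius, seed=0):
    """Place nodes uniformly in the unit square; link pairs within `radius`."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(num_nodes)]
    neighbors = {i: set() for i in range(num_nodes)}
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            if math.dist(points[i], points[j]) <= radius:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return points, neighbors


points, neighbors = random_geometric_graph(50, 0.2, seed=1)
```

In a broadcast version of the Naming Game, a speaker `i` would transmit one word to every node in `neighbors[i]` at once, rather than to a single hearer.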

Long Term Evolution of Naming Game on RGGs

$t = \infty$

Social networks: graphs with community structures

NG on regular and random networks with no community structure: consensus is always reached, and temporal behavior is similar to the fully-connected model.

Heterogeneous spatial and social graphs can have strong community structure.

[Figure: heterogeneous random geometric graph using the LandScan(TM) US population data]

NG on Heterogeneous Spatial Networks

Using LandScan(TM) US population data to construct a heterogeneous random geometric graph

Thanks to Ahn and Barabási

http://popvssoda.com:2998/


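Coarsening in d dimensions, N individuals (with non-conserved order parameter)

single length scale: $\xi(t) \sim t^{\gamma}$ ($d = 1$: $\gamma = 1/2$)

domain/cluster size: $C(t) \sim \xi^{d}(t) \sim t^{d\gamma}$

number of different words: $N_d(t) \sim \dfrac{N}{\xi^{d}(t)} \sim N t^{-d\gamma}$

$N_w$: total number of words: $N_w(t) - N \sim \dfrac{N}{\xi^{d}(t)}\,\xi^{d-1}(t) \sim \dfrac{N}{\xi(t)} \sim N t^{-\gamma}$

consensus (single word, shared by all agents): $N_d = 1 \sim N t_c^{-d\gamma}$, hence $t_c \sim N^{1/(d\gamma)}$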
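As a worked check of the consensus-time scaling on this slide, the steps can be written out explicitly. The only inputs are the slide's own assumptions: a single growing length scale $\xi(t) \sim t^{\gamma}$ and a number of different words $N_d(t) \sim N/\xi^{d}(t)$. The one-dimensional value $\gamma = 1/2$ is the one quoted on the slide.

```latex
\begin{align*}
N_d(t_c) = 1
  &\;\Longrightarrow\; \frac{N}{\xi^{d}(t_c)} \sim 1
   \;\Longrightarrow\; \xi(t_c) \sim N^{1/d}, \\
\xi(t_c) \sim t_c^{\gamma}
  &\;\Longrightarrow\; t_c^{\gamma} \sim N^{1/d}
   \;\Longrightarrow\; t_c \sim N^{1/(d\gamma)}, \\
d = 1,\ \gamma = \tfrac{1}{2}
  &\;\Longrightarrow\; t_c \sim N^{2}.
\end{align*}
```

In words: consensus is reached once a single domain has grown to span all $N$ agents, so the slower the domains coarsen (smaller $\gamma$) or the lower the dimension, the steeper the growth of the consensus time with system size.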


Temporal Scaling in NG on 2d RGG

$t_c \sim N^{1.10}$, $\gamma = 0.38 \pm 0.03$

[Figure: time-dependent curves with fitted slopes annotated $\gamma$ and $2\gamma$.]

Temporal Behavior in NG

[Figure: three panels (total number of words, number of different words, success rate), each comparing the 2d network with the complete graph; Baronchelli et al. (2005).]

Naming Game on Random Geometric Networks and on Small-World-like RGGs

[Figure panels: 2d RGG; 2d SW-RGG, density of random links: p]

$\xi(t) \sim t^{\gamma}$, $l_{SW} \sim p^{-1/d}$, $t_{\times} \sim p^{-1/(d\gamma)}$

$t_c \sim N^{1.30}$; $t_c \sim N^{0.42}$

Temporal Behavior in NG on SW-RGG

2d RGG: $t_c \sim N^{1.10}$

2d SW-RGG: $t_c \sim N^{0.31}$

FC (MF): $t_c \sim N^{0.50}$

Average consensus times

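[Figure: average consensus time $t_c$ and its standard deviation for 1d-reg, 2d-reg, 2d-RGG, 2d-SW-RGG and FC networks.]

$d = 1$: $t_c \sim N^{2}$; $d = 2$: $t_c \sim N^{1(.10)}$

d-dimensional coarsening: for $d < d^{*}$, $t_c \sim N^{1/(d\gamma)}$; for $d > d^{*} \approx 4$ (FC, $d = \infty$), $t_c \sim N^{0.5}$

$\Delta t_c \sim t_c \sim N^{\alpha}$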

Temporal scales in NG on networks

Naming Game in small-world and scale-free networks: optimal trade-off between FC and spatial behavior:
• Ordering process is relatively fast, as in the FC network (effectively mean-field dynamics); $t_c \sim N^{0.50}$
• Memory requirement is small, as in spatial networks (sparse networks with finite average degree); $N_w^{max}/N \sim O(1)$

Network                                 Memory need per agent ($N_w^{max}/N$)   Consensus time ($t_c$)
1d                                      const.                                  $N^{2}$
2d (RGG or regular)                     const.                                  $N^{1.10}$
FC (complete graph)                     $N^{0.5}$                               $N^{0.5}$
SW (on 1d or 2d RGG or regular)         const.                                  $N^{0.31}$
SF (scale-free)                         const.                                  $N^{0.40}$

(* or faster)

Scale-free degree distributions: $P(k) \sim 1/k^{3}$ (Dall’Asta et al. ’06); $P(k) \sim 1/k^{\gamma}$, $2 < \gamma < 3$

Social networks: graphs with community structures

Many equilibrium and dynamic models on these networks display mean-field features.

NG on regular and random networks with no community structure: consensus is always reached.

Social graphs have strong community structure.

Summary of the Results

Social networks typically exhibit strong community structure.

Different sub-communities preserve their own ideas/standards, so it is hard to reach global consensus (e.g., q-state Potts model, Naming Game (NG)).

[Figures: cellphone network with ~4 million agents (A.-L. Barabási); high school friendship network with ~1,000 agents. Panel labels: $N_d = 3$, $N_d = 2$.]

Engineering consensus in social networks

• By indoctrinating a small number of agents to dissolve community structures

• One community (say, A) selects a small number of agents and converts them to stick with A’s idea/standard, in order to change the consensus of the sub-community. These agents are then called indoctrinated nodes/agents.

[Figure: communities labeled 1, 2 and 3.]

*A visualization of community dissolving by indoctrination can be viewed and downloaded at: http://horizon.phys.rpi.edu/~korniss/NGFrozen.avi

Engineering consensus in social networks by indoctrination

Naming Game in high school friendship networks

3 clusters are initially formed

Indoctrinated agents are indicated with yellow circles


Various ways of selecting frozen agents:
• Randomly pick
• Most active nodes (highest opinion flipping count)
• Highest neighborhood size (degree)
• Betweenness centrality (shortest path)
• Betweenness centrality (current/random walk)
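Two of the selection strategies listed above, random picking and highest degree, can be sketched as follows. This is an illustration under assumptions: the graph is given as an adjacency dictionary, ties in degree are broken by node name, and the function names `pick_by_degree` and `pick_randomly` are hypothetical. The activity-based and betweenness-based strategies are not shown.

```python
import random


def pick_by_degree(adjacency, count):
    """Select the `count` nodes with the largest neighborhood size (degree)."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))
    return ranked[:count]


def pick_randomly(adjacency, count, seed=0):
    """Select `count` distinct nodes uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(sorted(adjacency), count)


# Hypothetical toy graph: one hub connected to three other nodes.
adjacency = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}

by_degree = pick_by_degree(adjacency, 2)
at_random = pick_randomly(adjacency, 2, seed=0)
```

The selected nodes would then be frozen, i.e., given the target word and never allowed to change it, before the Naming Game dynamics are resumed.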

Properly picking indoctrinated agents greatly speeds up the conversion process compared with picking agents blindly (exponential relaxation vs. power law).

$n_s(t)$: fraction of runs that have not reached global consensus by time t
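The quantity defined above can be estimated from a batch of simulation runs. A minimal sketch, assuming each run reports its consensus time, or `None` if consensus was never reached within the simulated horizon; the function name `survival_fraction` and the sample values are hypothetical.

```python
def survival_fraction(consensus_times, t):
    """Fraction of runs that have not reached global consensus by time t."""
    unfinished = sum(1 for tc in consensus_times if tc is None or tc > t)
    return unfinished / len(consensus_times)


# Hypothetical consensus times from four independent runs (None = never reached).
times = [5, 10, 20, None]

at_start = survival_fraction(times, 0)
at_ten = survival_fraction(times, 10)
```

Plotting this fraction against t on a semi-logarithmic scale is one way to tell exponential relaxation (a straight line) from a power-law decay.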

How to select target agents of indoctrination to effectively speed up the process of community merging?

[Figure: $n_s(t)$ for the HS network ($N = 1147$), with $n_s(t) \sim e^{-t/\tau}$.]

Indoctrination in a HS Network

How many agents need to be indoctrinated to successfully convert a community?

The system experiences a phase transition from persistent community structure to global consensus as the number of indoctrinated agents increases.

Picking more agents is a waste of resources after a certain point.

[Figure: axes labeled $n_s(t)$ and fraction of indoctrinated agents.]

Indoctrination in a Large Graph

Behavior similar to HS networks

[Figure: indoctrination speed versus fraction of indoctrinated agents, cell network ($N \approx 4 \times 10^{6}$).]

Open Questions

How to recognize good indoctrinators?

How to predict time to community collapse?

How stable are communities during NG?

What are the NG dynamics on overlapping networks?

Thank you!
