Adversary Social Networks: Community Formation, Detection and Consensus Engineering
TRANSCRIPT
Boleslaw Szymanski Social Cognitive Networks
Academic Research Center at Rensselaer
Adversary Social Networks: Community Formation, Detection
and Consensus Engineering
http://www.cs.rpi.edu/~szymansk
Goals: Discovery and Analysis of Social Communities
Statistical processing of social interaction data
Vast
Stochastic
Dynamic
Find communities, especially hidden communities.
Application: Finding and Tracking Groups Hidden in Time and Space
Communications supporting IED planning have patterns and are correlated.
Analysis of the patterns can reveal the groups as well as their internal group structure.
Why Statistical Analysis
• Agents often do not know or declare their communities.
• When they do, the declarations are largely false:
  – Blogs (declared friend-links and communities)
    • Declared friend-links predict communications with an F1-measure of 0.04 (small)
    • Declared communities predict communications with an F1-measure of 0.0002 (even smaller)
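The F1-measures quoted above compare a set of predicted links against the links actually observed in communications. The sketch below shows one way such a score could be computed; the function name `f1_measure`, the treatment of links as undirected pairs, and the toy data are assumptions for illustration, not the procedure or data used in the blog study.

```python
def f1_measure(predicted, observed):
    """F1 score between two collections of undirected links (pairs of user names)."""
    pred = {frozenset(pair) for pair in predicted}
    obs = {frozenset(pair) for pair in observed}
    true_pos = len(pred & obs)           # links that are both declared and observed
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)     # share of declared links that carry traffic
    recall = true_pos / len(obs)         # share of observed links that were declared
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data (not from the LiveJournal study).
declared = [("ann", "bob"), ("ann", "cat"), ("bob", "dan")]
communicated = [("bob", "ann"), ("cat", "dan")]
score = f1_measure(declared, communicated)
print(round(score, 3))
```

A score near 0, as reported on the slide, means declared links say almost nothing about who actually communicates with whom.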
Data: Large Dynamic Social Networks
Examples:
• LiveJournal Blog network
  – Very large, dynamic social network
• Twitter
• Email, Chatrooms
• Data repository for large-scale dynamic social networks
System Components
[Figure: system components]
SIGHTS: communication graph with nodes A, B, C, D, E, F, G, H, I
RDM-CSA: pattern levels (Level 0, Level 1, Level 2) over token sequences such as "buy, trade ...", "2trade ...", "3, sell ..."; Pattern id = 2, Pattern = “buy,”; Pattern id = 3, Pattern = “2trade”
Hierarchy: higher ranked leaders, group leader, subgroup leaders, members
Communications
Time: January 12, 2010, 09:35
From: [email protected]
To: [email protected]
Subject: Hello
Message: Where have you been?
[16:06:31] <FreeTrade> Republicans were the worst pacifists before ww1 and ww2
[16:06:43] <SweetLeaf> France Fries
[16:06:50] <FreeTrade> As a generality, of course their were Republican Hawks.
[16:07:13] <FreeTrade> Sweet, good pun but bad story!
[16:07:18] <SweetLeaf> yup
[16:07:23] <Lupine> anyways, he's perpetually tormented by presidential actions
[16:07:25] <SweetLeaf> it aint good for no one
[16:07:47] <SweetLeaf> I think they knew it was commiing
[16:07:51] <FreeTrade> Rossevelt met monthly in New York with mostly trusted Republicans to talk about how to get america into the war.
[16:08:10] <FreeTrade> and he spent 2 year with Churchill meeting him sometimes secretly in the ocean to discuss the same topic.
[16:08:22] <FreeTrade> Exchanging a lot of letters.
[16:08:25] <FreeTrade> telegrams
[16:08:28] <Lupine> There really is nothing like a shorn scrotum. It's breathtaking, I suggest you try it.
[16:08:55] <FreeTrade> Well they didnt literally meet in the ocean, they were on ships.
Streaming Example
Time   From     To         Message
10:00  Alice    Charlie    Golf tomorrow? Tell everyone.
10:05  Charlie  Felix      Alice mentioned golf tomorrow.
10:06  Alice    Bob        Hey, golf tomorrow. Spread the word.
10:12  Alice    Bob        Tee off: 8am at Pinehurst.
10:13  Felix    Grace      Hey guys, golf tomorrow.
10:13  Felix    Harry      Hey guys, golf tomorrow.
10:15  Alice    Charlie    Pinehurst Tee time: 8am.
10:20  Bob      Elizabeth  We’re playing golf tomorrow.
10:20  Bob      Dave       We’re playing golf tomorrow.
10:22  Charlie  Felix      Tee time 8am at Pinehurst
10:25  Bob      Elizabeth  We tee off 8am at Pinehurst.
10:25  Bob      Dave       We tee off 8am at Pinehurst.
10:31  Felix    Grace      Tee time 8am, Pinehurst.
10:31  Felix    Harry      Tee time 8am, Pinehurst.
Streaming Example
Time   From     To
10:00  Alice    Charlie
10:05  Charlie  Felix
10:06  Alice    Bob
10:12  Alice    Bob
10:13  Felix    Grace
10:13  Felix    Harry
10:15  Alice    Charlie
10:20  Bob      Elizabeth
10:20  Bob      Dave
10:22  Charlie  Felix
10:25  Bob      Elizabeth
10:25  Bob      Dave
10:31  Felix    Grace
10:31  Felix    Harry
[Figure: graph with nodes A, B, C, D, E, F, G, H, one per participant in the streaming example]
Communication graph
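Once message content is dropped, the streaming example reduces to (time, sender, receiver) records, and the communication graph follows by aggregating them. A minimal sketch of that aggregation is shown here; the function name `build_graph` and the choice to store edge weights as message counts are assumptions, not a description of how SIGHTS builds its graph.

```python
from collections import Counter

# (time, sender, receiver) rows copied from the streaming example table.
events = [
    ("10:00", "Alice", "Charlie"), ("10:05", "Charlie", "Felix"),
    ("10:06", "Alice", "Bob"), ("10:12", "Alice", "Bob"),
    ("10:13", "Felix", "Grace"), ("10:13", "Felix", "Harry"),
    ("10:15", "Alice", "Charlie"), ("10:20", "Bob", "Elizabeth"),
    ("10:20", "Bob", "Dave"), ("10:22", "Charlie", "Felix"),
    ("10:25", "Bob", "Elizabeth"), ("10:25", "Bob", "Dave"),
    ("10:31", "Felix", "Grace"), ("10:31", "Felix", "Harry"),
]


def build_graph(rows):
    """Return (edge weights, adjacency) for the directed communication graph."""
    weights = Counter((src, dst) for _, src, dst in rows)  # messages per directed edge
    adjacency = {}
    for src, dst in weights:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set())                   # receivers are nodes too
    return weights, adjacency


weights, adjacency = build_graph(events)
for (src, dst), count in sorted(weights.items()):
    print(f"{src} -> {dst}: {count} messages")
```

The result is a tree rooted at Alice, which matches the way the invitation spreads through the eight participants.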
What are the social groups/coalitions?
Social groups are overlapping clusters
• Clusters may overlap.
Social groups are overlapping, introverted clusters
• Clusters may overlap
  – A cluster is a locally defined object in which group members are more introverted than extroverted
[Figure: two example node groupings, labeled YES and NO]
Social groups are persistent overlapping, introverted clusters
• Clusters may overlap
• A cluster is a locally defined object in which group members are more introverted than extroverted.
• Social groups (clusters) persist
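The slides do not give a formal definition of "more introverted than extroverted". One simple reading, assumed here purely for illustration, is that a cluster has more edges inside it than edges leaving it. The sketch below applies that test to a hypothetical toy graph; the function names and the graph are not from the source.

```python
def edge_counts(adjacency, cluster):
    """Count edges inside a cluster and edges leaving it (undirected graph)."""
    cluster = set(cluster)
    internal = external = 0
    for node in cluster:
        for nbr in adjacency.get(node, ()):
            if nbr in cluster:
                internal += 1      # each internal edge is seen from both ends
            else:
                external += 1
    return internal // 2, external


def is_introverted(adjacency, cluster):
    """Assumed criterion: strictly more internal than external edges."""
    internal, external = edge_counts(adjacency, cluster)
    return internal > external


# Hypothetical toy graph: two triangles (a, b, c) and (d, e, f) joined by edge c-d.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
print(is_introverted(toy, {"a", "b", "c"}))       # one triangle
print(is_introverted(toy, {"a", "b", "c", "d"}))  # overlaps with the next one
print(is_introverted(toy, {"c", "d", "e", "f"}))
print(is_introverted(toy, {"c", "d"}))            # the bridge alone
```

Because the test is local to each candidate set, two sets that share members (here c and d) can both pass it, which is what allows clusters to overlap.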
Examples
Ali Baba Data Set (DoD)
GROUND TRUTH
• Group A: Dog, Vulture, Camel, Yassir Hussein, Bird, (6 others)
• Group B: Ahmet, Saleh Sarwuk, Shaid, Pavlammed Pavlah, Osan Domenik
SIGHTS
• Group A: Dog, Vulture, Camel, Gopher
• Group B: Ahmet, Saleh Sarwuk, Shaid, Dajik
Citeseer
Two clusters: Electric circuit design; Optimization of Neural Networks
Intersection: “Sensitivity analysis in degenerate quadratic programming”
ENRON
SIGHTS: Statistical Identification of Groups Hidden in Time and Space
- System for statistical analysis of social coalitions in communication networks
Data Sources
• Blogs
• Twitter
• Emails (Enron)
• Chatroom
• Synthetic data
Coalition Discovery
• Overlapping clustering
• Streaming groups
• Persistent groups
Coalition Analysis
• Leaders
• Opposing groups
• Topic matching
Visualizations
• Size-Density plots
• Static coalitions
• Dynamic coalitions
[Screenshot: SIGHTS interface showing a size vs. density plot (groups matching the analyst topic in red), visualization options, the time window selector, group members, the different analyses available on the dataset, and the leader index]
Spreading of information: Russian Blog: Kosovo, week 40, 2007
Spreading of information: Russian Blog: Kosovo, week 05, 2008
Spreading of information: Russian Blog: Kosovo, week 06, 2008
Spreading of information: Russian Blog: Kosovo, week 07, 2008
Spreading of information: Russian Blog: Kosovo, week 08, 2008
Spreading of information: Russian Blog: Kosovo, week 09, 2008
Spreading of information: Russian Blog: Kosovo, week 10, 2008
Spreading of information: Russian Blog: Kosovo, week 11, 2008
Spreading of information: Russian Blog: Kosovo, week 12, 2008
Twitter and Trust
• Users: 1,894,215
• Messages per week: 3,750,000
Behavioral Trust based on Communications
• Modeling trust:
  – Define and collect conversations
  – Define trust based on conversations
    • Stability, frequency, depth
    • Reciprocity
    • Conversation propagation
• Model validation (collaborations):
  – David Lazer (Harvard)
  – Brian Uzzi (Notre Dame)
[Figure: number of strongly “trusting” components (y-axis, 0 to 2.5, scale ×10^4) versus size (x-axis, 0 to 60)]
Collective dynamics on the network
Examples:
• Internet (packet traffic/flux)
• Load-balancing schemes (job allocation among processors)
• Electric power grid (voltage and phase fluctuations)
• High-performance or grid-computing networks (task completion landscapes in distributed simulations)
• Synchronization of coupled nonlinear oscillators (phase or frequency)
• Spread of epidemics in cities, human contact networks, or worldwide airline transportation networks
• Dissemination of culture and language in social networks
• Blogosphere’s evolution of topics of discussion
Models for opinion/agreement dynamics
Models with binary or multiple “opinions” with no interaction constraints:
• kinetic Ising model (Glauber, ’63, …, Castellano et al. ’05)
• voter model (Krapivsky, ’92, … Ben-Naim, Redner)
• Naming Game (Baronchelli et al. ’05, Lu et al. ’06)
Models with many (discrete or continuous) opinions and with interaction thresholds or constraints: individuals too far apart on some scale of opinions (or cultural traits) will never interact or convince one another.
• continuous opinion dynamics (Deffuant et al. ’00, Stauffer et al. ’03, Ben-Naim ’05)
• dissemination of culture with a discrete number of cultural features and traits (Axelrod, 1997, Castellano et al. ’00, Klemm et al. ’03, Mazzitello et al. ’07, Candia ’08)
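The interaction-threshold idea can be made concrete with the bounded-confidence update commonly associated with the Deffuant et al. model: two agents adjust toward each other only if their opinions differ by less than a threshold. The sketch below is a minimal version under that assumption; the parameter names `threshold` and `mu` and the toy opinions are illustrative and not taken from the slides.

```python
def deffuant_step(opinions, i, j, threshold, mu=0.5):
    """One bounded-confidence interaction between agents i and j.

    If the two opinions differ by less than `threshold`, each moves a
    fraction `mu` of the way toward the other; otherwise nothing changes.
    Returns True when the agents interacted.
    """
    xi, xj = opinions[i], opinions[j]
    if abs(xi - xj) >= threshold:
        return False               # too far apart: no interaction at all
    opinions[i] = xi + mu * (xj - xi)
    opinions[j] = xj + mu * (xi - xj)
    return True


# Hypothetical opinions on a [0, 1] scale.
opinions = [0.1, 0.3, 0.9]
print(deffuant_step(opinions, 0, 1, threshold=0.25), opinions)  # close enough to interact
print(deffuant_step(opinions, 0, 2, threshold=0.25), opinions)  # too far apart
```

With `mu = 0.5` an interacting pair meets at its average, so the threshold alone decides whether distant opinion clusters can ever merge.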
Models with interaction constraints on networks without community structure
Steady-state (equilibrium) phase diagram: there exists a critical threshold above which global consensus is reached and prevails.
Small-world networks: the presence of random links substantially decreases this critical threshold, so the region of global consensus expands.
Scale-free networks: for finite networks, there is a system-size-dependent critical threshold, but $T_c(N) \to 0$ as $N \to \infty$. For larger and larger networks, the region of global consensus progressively dominates the steady-state phase diagram.
Language Games, Semiotic Dynamics
Artificial and autonomous software agents or robots bootstrapping a shared lexicon without human intervention (Steels, 1995)
Collaborative tagging: human web users spontaneously create loose categorization schemes (“folksonomy”). See, e.g., del.icio.us and www.flickr.com (Golder & Huberman, ’05; Cattuto et al., ’05, ’07)
The “Talking Heads” Experiment (the Guessing Game)
“The artificial agents start with no prior human-supplied set of categories nor lexicon. A shared ontology and lexicon must emerge from scratch in a self-organized process.”
(L. Steels, 1995, …)
http://talking-heads.csl.sony.fr/
Naming Game (for a single object)
[Figure: two “speaker” and “hearer” exchanges, with word lists built from blah, kefe, and okoeta. “Failure”: the hearer does not have the transmitted word blah and adds it to its list. “Success”: the hearer already has blah, and both agents keep only blah.]
Baronchelli, Felici, Loreto, Caglioti, and Steels (2005)
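The speaker/hearer exchange in the figure can be written as a single update step. The sketch below follows the usual statement of the Naming Game rule for one object (failure: the hearer learns the word; success: both collapse to it); the function name, the use of Python sets for word lists, and the way new words are invented are assumptions made for illustration.

```python
import random


def naming_game_step(inventories, speaker, hearer, rng):
    """One Naming Game interaction for a single object. Returns True on success."""
    if not inventories[speaker]:
        # A speaker with an empty list invents a new word (arbitrary label).
        inventories[speaker].add(f"w{rng.getrandbits(32)}")
    word = rng.choice(sorted(inventories[speaker]))  # pick one known word at random
    if word in inventories[hearer]:
        # Success: both agents drop every other word and keep only this one.
        inventories[speaker] = {word}
        inventories[hearer] = {word}
        return True
    # Failure: the hearer adds the word to its list.
    inventories[hearer].add(word)
    return False


rng = random.Random(0)
agents = {"speaker": {"blah"}, "hearer": {"kefe"}}
print(naming_game_step(agents, "speaker", "hearer", rng), agents)  # failure, hearer learns "blah"
print(naming_game_step(agents, "speaker", "hearer", rng), agents)  # success, both keep "blah"
```

Repeating this step over randomly chosen neighboring pairs is what drives the network toward a single shared word.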
Language Games in Sensor Networks
Assumptions
Mobile or static sensor nodes deployed in large spatial regions
• environment is unknown, possibly hostile
• tasks are unforeseeable
• sensor nodes have no pre-constructed vocabularies
Must autonomously develop common shared vocabularies/language at the exploration stage
Naming Game on Random Geometric Graphs (RGG)
2d RGG
Random geometric networks (above the percolation threshold) (spatial and random)
• nodes are connected if they fall within each other’s radio range
• communications: broadcast to local neighbors
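A random geometric graph of this kind can be generated by scattering nodes uniformly and linking every pair closer than the radio range. The sketch below does this in the unit square; the function name, the unit-square domain, and the brute-force pair loop are assumptions chosen for brevity, not the construction used for the results on the slides.

```python
import math
import random


def random_geometric_graph(n, radius, rng):
    """Place n nodes uniformly in the unit square; link pairs within `radius`."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                adjacency[i].add(j)   # radio range is symmetric,
                adjacency[j].add(i)   # so the graph is undirected
    return points, adjacency


points, adjacency = random_geometric_graph(200, 0.12, random.Random(42))
mean_degree = sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)
print(f"mean degree: {mean_degree:.2f}")
```

A broadcast to local neighbors then corresponds to a speaker sending its word to every node in `adjacency[speaker]`. The radius has to be large enough for the graph to sit above the percolation threshold, otherwise the network splits into isolated pieces.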
Long Term Evolution of Naming Game on RGGs
$t = \infty$
NG on regular and random networks with no community structure: consensus is always reached, and temporal behavior is similar to the fully-connected model
Heterogeneous spatial and social graphs can have strong community structure.
Social networks: graphs with community structures
Heterogeneous random geometric graph using the LandScan™ US population data
NG on Heterogeneous Spatial Networks
Using LandScan™ US population data to construct a heterogeneous random geometric graph
Thanks to Ahn and Barabási
http://popvssoda.com:2998/
Coarsening in d dimensions, N individuals (with non-conserved order parameter)
Single length scale: $\xi(t) \sim t^{\gamma}$ ($d = 1$: $\gamma = 1/2$)
Domain/cluster size: $C(t) \sim \xi^{d}(t) \sim t^{d\gamma}$
Number of different words: $N_d(t) \sim \frac{N}{\xi^{d}(t)} \sim N t^{-d\gamma}$
$N_w$: total number of words: $N_w(t) - N \sim \frac{N}{\xi^{d}(t)}\,\xi^{d-1}(t) \sim \frac{N}{\xi(t)} \sim N t^{-\gamma}$
Consensus (single word, shared by all agents): $N_d = 1 \sim N t^{-d\gamma}$, so $t_c \sim N^{1/(d\gamma)}$
Temporal Scaling in NG on 2d RGG
$t_c \sim N^{1.10}$, $\gamma = 0.38 \pm 0.03$
[Figure: scaling plots with fitted slopes labeled $\gamma$ and $2\gamma$]
Temporal Behavior in NG
[Figure: total number of words, number of different words, and success rate, each shown for the complete graph and for 2d; Baronchelli et al. (2005)]
Naming Game on Random Geometric Networks and on Small-World-like RGGs
[Figure panels: 2d RGG; 2d SW-RGG, density of random links: p]
$\xi(t) \sim t^{\gamma}$, $l_{SW} \sim p^{-1/d}$
$t_{\times} \sim p^{-1/(d\gamma)}$
$t_c \sim N^{1.30}$, $t_c \sim N^{0.42}$
Temporal Behavior in NG on SW-RGG
2d RGG: $t_c \sim N^{1.10}$
2d SW-RGG: $t_c \sim N^{0.31}$
FC (MF): $t_c \sim N^{0.50}$
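Average consensus times
[Figure: average and standard deviation of the consensus time $t_c$ for 1d-reg, 2d-reg, 2d-RGG, 2d-SW-RGG, and FC]
$d = 1$: $t_c \sim N^{2}$
$d = 2$: $t_c \sim N^{1(.10)}$
d-dimensional coarsening, $d < d^{*}$: $t_c \sim N^{1/(d\gamma)}$
$d > d^{*} \approx 4$, $d = \infty$ (FC): $t_c \sim N^{0.5}$
$\Delta t_c \sim t_c \sim N^{\alpha}$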
Temporal scales in NG on networks
Naming Game in small-world and scale-free networks: optimal trade-off between FC and spatial behavior:
• Ordering process is relatively fast, as in FC network (effectively mean-field dynamics)
• Memory requirement is small, as in spatial networks (sparse networks with finite average degree): $N_w^{\max}/N \sim O(1)$
$t_c \sim N^{0.50}$

Network                            Memory need per agent ($N_w^{\max}/N$)   Consensus time ($t_c$)
1d                                 const.                                   $N^{2}$
2d (RGG or regular)                const.                                   $N^{1.10}$
FC (complete graph)                $N^{0.5}$                                $N^{0.5}$
SW (on 1d or 2d RGG or regular)    const.                                   $N^{0.31}$
SF (scale-free)                    const.                                   $N^{0.40}$

(* or faster)
SF: $P(k) \sim 1/k^{3}$, $c(k)$ (Dall’Asta et al. ’06)
$P(k) \sim 1/k^{\gamma}$, $2 < \gamma < 3$
Many equilibrium and dynamic models on these networks display mean-field features
NG on regular and random networks with no community structure: consensus is always reached
Social graphs have strong community structure.
Social networks: graphs with community structures
Summary of the Results
Social networks typically exhibit strong community structure
Different sub-communities preserve their own ideas/standards, so it is hard to reach global consensus (e.g., q-state Potts model, Naming Game (NG))
[Figure panels, in slide order: $N_d = 3$; cellphone network with ~4 million agents (A.-L. Barabási); $N_d = 2$; high school friendship network with ~1,000 agents]
Engineering consensus in social networks
• By indoctrinating a small number of agents to dissolve community structures
• One community (say, A) selects a small number of agents and converts them to stick with A’s idea/standard, changing the consensus of the sub-community. These agents are then called indoctrinated nodes/agents.
[Figure: panels 1, 2, 3]
* A visualization of community dissolving by indoctrination can be viewed and downloaded at: http://horizon.phys.rpi.edu/~korniss/NGFrozen.avi
Engineering consensus in social networks by indoctrination
Naming Game in high school friendship networks
3 clusters are initially formed
Indoctrinated agents are indicated with yellow circles
Various ways of selecting frozen agents:
• Randomly pick
• Most active nodes (highest opinion flipping count)
• Highest neighborhood size (degree)
• Betweenness centrality (shortest path)
• Betweenness centrality (current/random walk)
Properly picking indoctrinated agents greatly speeds up the conversion process compared with picking agents blindly (exponential relaxation vs. power law)
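Two of the selection strategies listed above, random picking and highest degree, are simple enough to sketch directly. The code below assumes the network is stored as a dictionary of neighbor sets; the function names, the tie-breaking by node name, and the toy graph are illustrative choices. The activity-based and betweenness-based strategies are not covered here.

```python
import random


def pick_randomly(adjacency, k, rng):
    """Baseline: choose k agents uniformly at random."""
    return rng.sample(sorted(adjacency), k)


def pick_by_degree(adjacency, k):
    """Choose the k agents with the largest neighborhoods (ties broken by name)."""
    ranked = sorted(adjacency, key=lambda node: (-len(adjacency[node]), str(node)))
    return ranked[:k]


# Hypothetical toy network with one hub.
toy = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}
print(pick_by_degree(toy, 2))
print(pick_randomly(toy, 2, random.Random(7)))
```

The chosen agents would then be frozen: their word list is fixed to the target word and they are skipped whenever an update would change it.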
How to select target agents of indoctrination to effectively speed up the process of community merging?
$n_s(t)$: fraction of runs that have not reached global consensus by time $t$
[Figure: $n_s(t)$ for the HS network ($N = 1147$)]
$n_s(t) \sim e^{-t/\tau}$
Indoctrination in a HS Network
How many agents need to be indoctrinated to successfully convert a community?
The system experiences a phase transition from persistent community structure to global consensus as the number of indoctrinated agents increases.
Picking more agents is a waste of resources after a certain point.
[Figure: $n_s(t)$ versus fraction of indoctrinated agents]
Indoctrination in a Large Graph
Indoctrination Speed
Behavior similar to HS networks
[Figure: Cell Net ($N \approx 4 \times 10^{6}$); x-axis: fraction of indoctrinated agents]
Open Questions
How to recognize good indoctrinators?
How to predict time to community collapse?
How stable are communities during NG?
What are the NG dynamics on overlapping networks?
Thank you!