Literature Survey to discuss topographical structure
of social networks and information propagation
Sathe, Vaibhav
Indian Institute of Management Lucknow
IIM Campus, Prabandh Nagar, Off Sitapur Road, Lucknow, Uttar Pradesh – 226013, INDIA [email protected]
I. INTRODUCTION
Facebook’s user base, currently 800 million and still growing, together with the rising time users spend on the site, has attracted a lot of attention from researchers in various fields. Recently Facebook has been used as a platform for organizing mass protests in countries of the Middle East. In India, the rise of India Against Corruption and its Facebook following of 500,000 people has underscored the rising power of social media. This has resulted in clashes with governments that seek to curtail the power of social networks and their users to spread messages without restriction. In our research, we want to model this censorship activity. This literature survey supports that research by reviewing the network concepts required for modelling social networks, primarily the structure of the network and how a message spreads.
We review some well-cited papers published in leading Information Systems journals to identify the dimensions required for the modelling exercise.
II. PROBLEM DEFINITION
The objectives of this literature review are as follows.
(1) Structure of Social Networks:
To model a social network, we need to determine which model from network science applies to it. The probable options are small world, random and scale-free networks. Different social networks may also display different structures due to fundamental differences between them. From the point of view of censorship, we focus on social networks like Facebook. Facebook clearly holds the largest interest because its user base, the largest of any network, gives it the capability to influence the behaviour of the actors involved in a censorship-related study.
(2) Information Propagation Pattern:
To identify the parameters that model the user interactions leading to information diffusion, we need to understand how information spreads on networks and which factors affect it.
III. LITERATURE SEARCH
The literature surveyed for this review is divided into the following sections.
A. Structure of Social Networks
The following articles contribute to the first objective of determining the structure of social networks. Detailed references are included in the references section.
Sr. | Article/Paper | Journal/Publisher
1 | Measurement and Analysis of Online Social Networks | ACM
2 | Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network | IEEE
3 | A fast algorithm for simulating scale-free networks | ICCTA (IEEE)
4 | Social Search in “Small-World” Experiments | World Wide Web Consortium
5 | Reciprocity in evolving social networks | Journal of Evolutionary Economics
B. Information Propagation
The following articles contribute to the second objective of determining patterns in information spread. Detailed references are included in the references section.
Sr. | Article/Paper | Journal/Publisher
1 | Network Effects and Personal Influences: The Diffusion of an Online Social Network | Journal of Marketing Research
2 | Forward or delete: What drives peer-to-peer message propagation across social networks? | Journal of Consumer Behaviour
3 | User Interactions in Social Networks and their Implications | EuroSys’09, ACM
4 | Online organization of offline Protest: From Social to Traditional Media and Back | HICSS 2011
5 | Information propagation analysis in a social network site | IEEE
6 | Detecting and Characterizing Social Spam Campaigns | IMC’10, ACM
IV. TERMINOLOGIES
This section defines the terms required to understand the concepts discussed in this review.
Power Law:
When the frequency of an event varies inversely with a power of the quantifiable size of the event, the relationship is said to follow a power law. One characteristic of such a distribution is a large difference between mean and median.
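The gap between mean and median can be illustrated numerically. The following is a minimal sketch, not taken from any of the surveyed papers: it assumes a continuous power law with frequency proportional to x^(-alpha), and the exponent alpha = 2.5, the minimum size x_min = 1 and the function name sample_power_law are all illustrative choices.

```python
import random
import statistics

def sample_power_law(n, alpha=2.5, x_min=1.0, seed=0):
    """Draw n values from p(x) ~ x^(-alpha), x >= x_min, by inverse transform sampling."""
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]

samples = sample_power_law(100_000)
mean = statistics.mean(samples)
median = statistics.median(samples)
print(f"mean = {mean:.2f}, median = {median:.2f}")
```

Because a few very large events dominate the sum, the mean sits well above the median, which stays close to the typical (small) event size.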
Types of networks:
A. Random Networks
Random networks are unstructured networks with low clustering. They do not occur in nature. They are studied theoretically to provide a baseline for the study of more structured networks like small world and scale-free networks.
B. Small World Network
Small world networks are networks which have a small average path length, due to a large number of interconnections, and a high clustering coefficient.
C. Scale-Free Network
Scale-free networks are those whose degree distribution follows a power law, i.e. the network consists of a small number of highly connected users and a large number of less connected users.
Terms related to networks:
(1) Network Diameter: The maximum internode distance is called the diameter of the network.
(2) Indegree: Number of inward connections for a given user.
(3) Outdegree: Number of outward connections for a given user. Indegree and outdegree are valid measures only when the network is a directed graph. Networks like Facebook and Orkut are symmetrical, i.e. for any user, indegree and outdegree are equal.
(4) Assortativity: A measure of the likelihood that nodes in a network establish links with other nodes that are similar to them on some parameter.
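The first three terms can be made concrete on a toy graph. The sketch below is illustrative only: the four users and their links are invented, a friendship network is represented by storing every link in both directions, and the diameter is taken over reachable ordered pairs of nodes.

```python
from collections import deque

def degrees(edges):
    """Return (indegree, outdegree) dicts for a list of directed (src, dst) edges."""
    indeg, outdeg = {}, {}
    for src, dst in edges:
        for node in (src, dst):
            indeg.setdefault(node, 0)
            outdeg.setdefault(node, 0)
        outdeg[src] += 1
        indeg[dst] += 1
    return indeg, outdeg

def diameter(edges):
    """Longest shortest-path length over all reachable ordered node pairs (BFS per node)."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())
    longest = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest

def symmetric(pairs):
    """Store each friendship as two directed edges, one per direction."""
    return list(pairs) + [(b, a) for a, b in pairs]

# Hypothetical friendship chain A-B-C-D (symmetric, consent-based links).
friend_edges = symmetric([("A", "B"), ("B", "C"), ("C", "D")])
# Hypothetical follower network: A, C and D all follow B, who follows nobody.
follow_edges = [("A", "B"), ("C", "B"), ("D", "B")]

friend_in, friend_out = degrees(friend_edges)
follow_in, follow_out = degrees(follow_edges)
print(friend_in == friend_out, diameter(friend_edges))
print(follow_in["B"], follow_out["B"])
```

In the symmetric case indegree and outdegree coincide for every user, whereas in the follower network a popular user can have a large indegree and zero outdegree.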
Information shared on Facebook:
The information that is created and shared on Facebook comes from various sources. These are as follows:
(1) Status Messages: Users can share a text message as their status message. This is visible to other users (friends or others) on the user’s wall. The message also appears in the news feed of other users who are friends of the user and/or subscribed to the user’s updates.
(2) Hyperlink: A hyperlink to some other location on the Internet, typically news of interest, is another source of shared information. Friends can like, share and comment on such links.
(3) Photo: Photographs, typically taken by the user, are frequently shared, liked and commented on.
(4) Community/Group: Facebook has different groups dedicated to various topics. A message posted by or on the community is typically shared by a user so that the user’s subscribers, who may not have access to the community, can view it.
(5) Person: Famous people like Bill Gates have their own personal pages, which are not like groups. They are used to send personal images and links to thousands of subscribers, in much the same way as these personalities use Twitter today. This is one-way communication.
(6) Event Invitations: Users can create events and invite people. Users can also forward event invites.
V. DATA EVALUATION
This section is split into the subsections below.
A. Social Networks
Before starting, we look at what is meant by social networks and how online social networks differ.
The social network concept applies to naturally formed networks such as communities, family ties and relationships. For example, in a town, people within one residential area know each other. They also know some more people at their workplace. They also tend to want to know more people, and try to gain access to a larger set of contacts through a person they think is well connected. The information exchange may be intentional or unintentional. The study of social networks focuses on critical issues like disease spread, news spread, riots, fads, social awareness etc.
Online social networks demonstrate similar characteristics, with the exception that users are not in physical contact with each other. Examples of online social networks include Facebook, Twitter, Flickr, YouTube or any other site which facilitates interaction between users. The interaction can be one-to-one (Google Talk), one-to-many (Facebook) or many-to-many (forums), depending on the nature of the site.
B. Structure of Social Networks
Which graph structure social networks follow has been a very interesting topic for researchers, as it is a fundamental step in any modelling or simulation on the network.
Mislove et al [2], in their paper on the measurement and analysis of online social networks, try to identify various characteristics of social networks. In their experiment they collected data from over 11.3 million users of Orkut, YouTube, Flickr and LiveJournal. When network analysis was done, each of these networks followed a power law. In addition, they identified that these social networks display scale-free and small world properties. All of the networks show high clustering. The authors identified an interesting parameter: whether consent is required from the second party for the first party to establish a connection. An example is Twitter, where anyone can follow you and you need not follow them back. On Facebook, on the other hand, somebody who wants to be friends with you needs to send a request, and only when you approve do you both become friends of each other. Twitter is an example of an asymmetric network, which has different indegree and outdegree for each user. Facebook is an example of a symmetric network, where each user has identical indegree and outdegree. The characteristics of a network vary with these parameters. Symmetric networks have more connections among users and hence form stronger clusters, thereby reducing the network diameter. Hence, they display the characteristics of a small world network. Among the examples analysed by the authors, we need to focus on Orkut, as it is most closely related to Facebook.
To understand the limitations, we need to note the complex structure of Facebook. Although friendship is one of the prime ways in which Facebook disseminates information, we need to consider other ways, like groups and pages, to which a user subscribes, thereby creating a directed or asymmetric relationship. Facebook now also allows users to subscribe to status updates from other users without requiring explicit consent. This has turned Facebook into a hybrid network with different types of nodes. With regard to cluster formation, the authors state that online social networks score high on assortativity by degree: users of high degree establish relations with other users of high degree, while users of low degree establish relations with other users of low degree. This appears to contradict the scale-free property that low degree users tend to attach to high degree users, forming a hub and spoke model.
Social networks are examples of very large scale networks, and they are not random. The random graph model studied by Erdos and Renyi [6] provides the point of comparison: networks like social networks evolve with particular patterns and have a definite structure, unlike random graphs.
Wei Ren and Jianping Li’s paper [4] proposes the RX algorithm to simulate scale-free networks, which they claim performs better than the popular Barabasi-Albert (BA) algorithm. The authors state that as the number of nodes increases, the time required by RX is much less than that taken by BA. They conclude that networks that expand continuously exhibit the characteristics of scale-free networks. Since social networks are both very large and continuously expanding, scale-free characteristics apply. The same is true of an online social network like Facebook, which currently has 800 million users and is growing very rapidly in terms of both total users and average number of friends.
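To make the idea of a continuously expanding scale-free network concrete, the following is a minimal sketch of Barabasi-Albert style preferential attachment. It is not the RX algorithm, whose details are not given in this survey, and it is not tuned for speed; the function name, the seed core of m + 1 fully connected nodes and the parameters n = 2000, m = 2 are illustrative assumptions.

```python
import random
from collections import Counter

def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # 'stubs' holds one entry per edge endpoint, so a uniform draw from it
    # picks a node with probability proportional to its degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for target in targets:
            edges.append((new, target))
            stubs.extend((new, target))
    return edges

edges = preferential_attachment(n=2000, m=2)
degree = Counter(node for edge in edges for node in edge)
print("highest degrees:", sorted(degree.values(), reverse=True)[:5])
print("lowest degree:", min(degree.values()))
```

A handful of early nodes accumulate many links while most nodes keep only the m links they arrived with, which is the hub-dominated pattern described for scale-free networks.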
Yixiao Li et al [3], in their paper, make the important observation that the social network model exhibits community structure. The paper establishes a clustering mechanism based on “birds of a feather flock together”, stating that users having something in common tend to form clusters or groups with a lot of interconnections among them. This does not agree with the statement in the paper of Mislove [2] that users with high degree tend to connect to other users with high degree, and vice versa. The paper further establishes that communities develop into scale-free networks when they keep expanding.
One more factor discussed in the literature is the user’s intention. Goel et al [7] explain, from the standpoint of physical social networks, the difference between topological connection and algorithmic connection (intention to connect), using the example of the spread of diseases in a social network. The paper distinguishes network structures based on the intention of the user. The next paper discussed below extends this concept by looking at what happens when such intentions evolve, making the network very dynamic.
The paper by Jun and Sethi [8] discusses how social network structure develops in a dynamic and continuously evolving environment. Changes in the network take the form of random rewiring. Also, to a certain extent, some old links are severed over a period of time. In physical as well as online social networks this is due to changes in one’s lifestyle in terms of location, community memberships etc. Changes may also happen in the intention factor, which is taken as conditional cooperation. Over a period of time, a user’s reasons to connect can evolve, e.g. looking for a relationship, friendship or professional networking. Another important observation by the authors concerns the increasing degree of the network. With increasing degree, clustering increases, as the neighbours of one node are likely to be neighbours of each other. This is the same phenomenon that a social network like Facebook follows. Hence, the diameter of the network reduces. The paper identifies future research scope in the influence of the behaviour of non-neighbours on a given user. This is also a valid scenario considering the features of Facebook. User A may receive updates from the interaction of a particular friend B with B’s friend C, who is not a friend of user A. We discuss this propagation in the next section.
C. Information Propagation
Harvey et al [9], in their paper on viral marketing on the Internet, researched how users forward or delete a particular message on a social network like YouTube. From our research point of view, the observations on this forwarding behaviour are important, as they also apply to user behaviour on a social network like Facebook. The authors identified that the likelihood of a video being forwarded is closely correlated with sender involvement, sender tie strength and the amount of online communication across ties. We explain these factors briefly. Sender involvement, as explained by Norman [10], is the relation of the subject to the person’s needs. Sender tie strength means how close the user is to the sender of the message. The third factor is the amount of communication the sender has with the probable recipient of the forwarded message. The authors reject the idea that knowledge of how to forward a given message has any correlation with the likelihood of forwarding.
Skoric et al [12] discuss in their paper the parameter of trust, which is similar to the ties with the sender discussed in the previous paper. The authors say that, in general, users trust their friends over any other person, such as a political leader or an advertiser. This means that when a friend forwards or shares a message, they treat it as a serious message. This improves the likelihood that they forward such a message. The research also identifies groups, events and status messages as the tools on Facebook by which users can reach their immediate and extended friends in a fast, easily accessible and cost-effective way. One important contribution of this paper is the finding that the spread of such messages will be limited to individuals who are mostly similar and fall in one category of politically engaged and socially active people. This is typically because such messages spread only through friendship networks, which are formed with different intentions than spreading such a message. Friends generally have similar thought processes and are hence similar on the above parameters.
Katona et al [1] bring out some critical points about the sender’s influence in their paper. First, they discuss that as the number of contacts of the recipient increases, the influence that any particular individual has on the recipient is diluted accordingly. The second factor is that of brokers. We have already seen that social networks demonstrate the characteristics of scale-free and small world networks. This means that among different clusters of users there are a few common users, who form prominent nodes linking these clusters. As shown empirically, since they control a large amount of information, they have higher influential power.
Another very interesting observation is made by Wilson et al [11] in their paper. The authors say that links or connections on a social network like Facebook are not indicators of interaction among the connected users. This is primarily due to the time constraints that users face. So, not all friendships are equally meaningful. The authors therefore introduce the concept of an interaction graph as a more valid indicator of social connectivity than Facebook friendship links. An interesting observation they make is that such an interaction graph does not exhibit small world characteristics. Therefore, the authors believe more in the scale-free network pattern when it comes to the interactions that happen between users.
In the paper by Magnani et al [13], the authors identify some important dimensions of discussion. The average lifetime of a post or message is the time for which it is available on the news feed of a user. It varies inversely with the number of friends the user has and their frequency of activity on Facebook. Overall, the authors found that the lifetime of a post also follows a power law. Their empirical analysis found that 50% of entries survive for around one hour, 85% survive for a day and so on. The authors also identified a specific time trend in content generation. Since users in a given cluster have some parameters in common, any temporal factor affecting those parameters will affect the activity of all those users simultaneously.
One important issue that needs attention is the increasing quantity of spam. The paper by Gao et al [14] looks at quantifying and characterizing spam campaigns launched from online social network accounts. An important observation from this empirical study of 3.5 million Facebook users is that over 97% of the spamming accounts are compromised accounts and only the rest are fake accounts. Another observation is that spamming activity generally peaks in the early morning hours of the users’ local time.
VI. ANALYSIS AND INTERPRETATION
A. Network Structure
Based on the review of articles in the section on network structure above, we find that Mislove’s article [2] develops many of the concepts required for understanding how this structure develops. With the help of the community example from Yixiao Li et al [3], we can see how social networks evolve. This helps in understanding why social networks display the characteristics of both small world networks and scale-free networks.
Initially a group of individuals with something in common, such as belonging to the same school, come together on a network like Facebook. They add each other as links, thereby establishing a community structure. This is also a cluster of users tightly coupled with each other. It behaves like a small world network due to its short diameter. As time progresses, an individual from this cluster may get exposed to a different group or set of users. This particular user then becomes the connection between the two clusters, and will have a much higher degree than his earlier cluster peers. This develops into a hub and spoke model and thereby into a scale-free network. Such networks follow a power law, as there are fewer users connected across clusters, who hence have a higher degree, than the large number of users connected only within a cluster, who therefore have a lower degree.
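The evolution described above can be sketched on a toy example. The two groups, the member names and the three cross-group links below are invented purely for illustration; the sketch compares the degree and local clustering of a bridging user with those of a peer who stays inside the original cluster.

```python
from itertools import combinations

def clique(members):
    """All pairwise friendship links inside one tightly knit group."""
    return {frozenset(pair) for pair in combinations(members, 2)}

def neighbours(edges):
    """Map each user to the set of users they are linked to."""
    adj = {}
    for edge in edges:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def local_clustering(adj, node):
    """Fraction of a user's friend pairs that are themselves friends."""
    friends = adj[node]
    if len(friends) < 2:
        return 0.0
    pairs = list(combinations(friends, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)

school = ["s1", "s2", "s3", "s4", "s5"]
work = ["w1", "w2", "w3", "w4", "w5"]
edges = clique(school) | clique(work)
# s1 is exposed to the second group and befriends three of its members.
edges |= {frozenset(("s1", w)) for w in work[:3]}

adj = neighbours(edges)
for user in ("s1", "s2"):
    print(user, "degree:", len(adj[user]),
          "clustering:", round(local_clustering(adj, user), 2))
```

The bridging user ends up with a higher degree but lower local clustering than the peers who remain inside one cluster, which is the hub-like role the paragraph describes.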
Another parameter that affects the expansion of social networks is how users can search for other users in order to connect to them. Networks like LinkedIn allow users to search only within certain levels of neighbourhood. This limits the capability of less connected users to connect to a large number of users. It further gives a user an incentive to connect to another user who is highly connected. This simple behaviour contradicts the concept given in the paper of Mislove [2] that users of similar degree are more likely to connect to each other.
The scenario of linking unintentionally is not applicable to an online social network like Facebook, as there is no reason to believe that two users are connected to each other unless they have some intention to be so. At least one user will have some reason to connect to the other; the second user may approve the request unknowingly. Additionally, the intentions of different users connecting to each other may differ. That is, user A may intend to connect to user B for reason X, while user B wants to connect to user A for reason Y, and they can still establish a connection as long as both users agree. But if there is no reason Y for B to connect to A, then the link will not be established. However, we could not locate any literature modelling the network while taking heterogeneous intentions into account.
B. Information Propagation
As the literature explains, several factors define the pattern of propagation of information. However, we need to alter some conditions when we apply these to our research on how a message spreads over a social network like Facebook, fundamentally because the characteristics of Facebook differ in several ways from those of the social networks considered for empirical research in the literature reviewed.
As against the preferential forwarding discussed in the paper by Harvey et al [9], on Facebook the user would forward, i.e. share, a message that he likes with all of his friends and those who are subscribed to his updates. Only rarely would he share such a message with one particular Facebook user. However, we need to note that he can preferentially tie in some users, based on the relevance he sees, while sharing the message with the larger audience. The ways to do so are tagging a person or posting the link or image on the wall of the intended user.
We also agree with Harvey’s finding that the user’s knowledge has little to do with forwarding likelihood. Looking at this observation from Facebook’s point of view, we cannot think of any logical reason to believe that a Facebook user would not know how to share the message that he or she is reading, if he or she wants to do so at all.
As we have seen in the structure of social networks, users of a similar nature come together and form clusters. This creates strong bonds between similar people and weaker bonds between dissimilar people. Moreover, we saw that while friendship networks are formed based on consent, the user gives such consent based on different criteria than spreading a particular message. This effectively reduces the velocity of message spread, as the message does not reach dissimilar users with equal intensity.
Wilson et al [11] found that small world clustering does not exist in their interaction graph, which is different from the friendship link graph, due to its low degree of connection. This is because users interact on a regular basis with only a small portion of their friends. As the degree of links per user from the interaction point of view decreases, the clustering index reduces, and the network thereby becomes more scale-free and less small-world.
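The contrast between a friendship graph and an interaction graph can be sketched as follows. This is only an illustration of the idea, not the construction used by Wilson et al: the four users, the interaction counts and the threshold parameter min_count are hypothetical.

```python
from itertools import combinations

def build_adjacency(edges):
    """Map each user to the set of users they are linked to."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def average_clustering(adj):
    """Mean local clustering coefficient; users with fewer than two links count as 0."""
    total = 0.0
    for friends in adj.values():
        if len(friends) < 2:
            continue
        pairs = list(combinations(friends, 2))
        total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)

def interaction_edges(friend_edges, interaction_counts, min_count=1):
    """Keep only friendship links that carried at least min_count interactions."""
    return [edge for edge in friend_edges
            if interaction_counts.get(frozenset(edge), 0) >= min_count]

users = ["A", "B", "C", "D"]
friend_edges = list(combinations(users, 2))  # everyone is friends with everyone
# Hypothetical interaction log: only A actually communicates with the others.
interaction_counts = {
    frozenset(("A", "B")): 5,
    frozenset(("A", "C")): 2,
    frozenset(("A", "D")): 1,
}

friend_adj = build_adjacency(friend_edges)
inter_adj = build_adjacency(interaction_edges(friend_edges, interaction_counts))
print("friendship clustering:", average_clustering(friend_adj))
print("interaction clustering:", average_clustering(inter_adj))
```

In this toy case the friendship graph is fully clustered, while the interaction graph collapses to a star around one active user, with lower degree per user and no clustering.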
As described by Katona et al [1], dilution of influence occurs as the number of contacts increases. This is very logical. As the number of friends on Facebook increases, the frequency of updates in the feed also increases proportionately. As pointed out by Wilson, every user has limited time on Facebook. Hence, the likelihood that a particular update will be visible in the portion of the news feed that the user scrolls through at a given time reduces with an increasing number of contacts. This weakens the influence level and hence the interaction that we are looking for.
The paper written by Magnani et al [13] discusses the lifetime of a post, during which it is active and accessible to friends. Overall it indicates a short lifespan for a message. We also need to note that as clustering increases on Facebook with more user activity and more friends, the average lifespan of a particular message will fall further. This further underlines the point made in Wilson’s paper that, because time is constrained, interaction networks, which are scale-free in nature, are more important for modelling than connection networks.
Regarding the spread of spam content, the important factor from the point of view of our study is that compromised accounts contribute 97% of spam and fake accounts only 3%. This further highlights that users trust their friends. A message coming from an unknown user is identified as spam more easily than one coming from a friend with whom the user has closer ties. Regarding the timing of spam generation, we do not find any relevance to our study on the spread of information.
However, the time of content generation has a critical role to play in determining how long a message remains active in the news feed of the user. The clustering of users provides significant evidence that most friends are geographically co-located. Hence, if a message is created or shared at peak time for the local user, there will be higher activity across the entire cluster. This further reduces the lifetime of the message in the news feed, but simultaneously increases the likelihood that a user sees the message, because he or she is actively viewing the news feed.
Another important point is that not all content that is frequently shared is genuine. Unfortunately, we could not find any conclusive literature or empirical research on user behaviour where users knowingly forward or share spam or incorrect information simply for amusement. This typically includes some random so-called “confidential” information about a political leader, or forged images. If users share this information unknowingly, the behaviour can be considered under trusting the ties, which we just discussed. But many a time the user is completely aware of the fraudulent nature of the content. Still, either for amusement or out of political or ideological conflict with the person or event in question, they feel encouraged to share such material. It should also be noted that users who are aware of spam do not indulge in such activity if they think it may be harmful to them. But when it comes to purely static spam content, which they are sure will not compromise their profiles, they have no objection to sharing or commenting on it. If we look at censorship proposals from governments, we may find that they are largely interested in controlling such content.
VII. LIMITATIONS
Facebook is continuously updating its features. The literature suggests that new features have a significant impact on user behaviour. The newly introduced timeline feature, which allows users to view past important interactions with ease, is likely to have considerable significance for user interactivity. But we could not locate any literature discussing the impact of the timeline. Also, we could not find literature conclusively quantifying Facebook events and their impact on social events. Nor did we locate any literature which can explain user bias in knowingly sharing fake information. We understand that the social networking phenomenon is relatively new, and hence not enough research has been done on every aspect of the impact of social networks on our real-life interactions.
VIII. CONCLUSION
In this literature survey, we have identified the factors that need to be accounted for while modelling information spread on social networks. For simplicity, we have avoided going into the mathematical details supporting the conclusions derived. We have linked the various papers available on this topic to arrive at the following conclusions.
On the network structure side, we conclude that a social network, from the friendship perspective, demonstrates the characteristics of both scale-free and small-world networks. But since interactions between users, which are time constrained, display only scale-free characteristics, we need to model the social network as a scale-free network for the purposes of our research.
We conclude that our model should take into account the following factors, which affect the likelihood and velocity of message spread.
(1) The influence a friend has on a user is inversely proportional to the user’s number of friends
(2) The lifetime for which a message remains active in a user’s news feed is inversely proportional to the user’s number of friends
(3) The likelihood of spreading a message is directly proportional to the average amount of time the user spends on Facebook
(4) The likelihood of spreading a message further is directly proportional to the strength of the bond with the sender
(5) The more clustered a user’s network, the lower the velocity of message spread, primarily because duplicated messages remain confined to the same cluster
(6) A message shared at peak time will have a shorter lifetime on the news feed but a higher likelihood of being replicated, due to high activity in the entire cluster
(7) If users perceive a particular message as not harmful to them, there is a higher likelihood that it will be spread or shared, irrespective of the user’s analysis of the message’s authenticity. This is typical of messages shared for amusement or out of political conflict.
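Factors (1) to (4) above can be written down as simple proportionalities. The sketch below is only one possible reading and not a model proposed in the surveyed literature: the function names, the proportionality constant k, the multiplicative combination of factors (3) and (4), and the cap at 1.0 are all assumptions made for illustration.

```python
def influence_of_friend(num_friends, k=1.0):
    """Factor (1): the influence of any one friend shrinks as 1 / number of friends."""
    if num_friends < 1:
        raise ValueError("num_friends must be at least 1")
    return k / num_friends

def feed_lifetime(num_friends, k=1.0):
    """Factor (2): the time a message stays active in the feed shrinks as 1 / friends."""
    if num_friends < 1:
        raise ValueError("num_friends must be at least 1")
    return k / num_friends

def share_likelihood(time_on_site, tie_strength, k=1.0):
    """Factors (3) and (4): likelihood grows with time spent and tie strength.
    Multiplying the two and capping the result at 1.0 is an assumption."""
    return min(1.0, k * time_on_site * tie_strength)

# Hypothetical users: doubling the number of friends halves both quantities.
print(influence_of_friend(100), influence_of_friend(200))
print(feed_lifetime(100, k=60.0), feed_lifetime(200, k=60.0))
print(share_likelihood(time_on_site=2.0, tie_strength=0.4, k=0.5))
```

Factors (5) to (7) depend on the cluster structure, the time of posting and the perceived harm of a message, so they would need a network-level simulation rather than per-user formulas like these.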
REFERENCES
[1] Katona Z., Zubcsek P., Sarvary M., Network Effects and Personal Influences: The Diffusion of an Online Social Network, Journal of Marketing Research, Vol. XLVIII (June 2011), 425-443, American Marketing Association.
[2] Mislove A., Marcon M., Gummadi K., Druschel P., Bhattacharjee B., Measurement and Analysis of Online Social Networks, proceedings of IMC’07, ACM.
[3] Yixiao Li, Xiaogang Jin, Fansheng Kong and Jiming Li, Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network, IEEE symposium on digital object identifier, 2009.
[4] Wei Ren, Jianping Li, A fast algorithm for simulating scale-free networks, proceedings of ICCTA2009.
[5] Ted G. Lewis, Network Science: Theory and Practice, John Wiley & Sons, Inc. 2009.
[6] P. Erdos, A. Renyi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., vol. 5, pp. 17-60, 1959.
[7] Goel S., Muhamad R., Watts D., Social Search in “Small-World” Experiments, proc. WWW 2009, ACM.
[8] Jun T., Sethi R., Reciprocity in evolving social networks, Journal of Evolutionary Economics, June 2009.
[9] Harvey C., Stewart D., Ewing M., Forward or delete: What drives peer-to-peer message propagation across social networks?, Journal of Consumer Behavior, Vol. 10, 2011, Published by Wiley.
[10] Norman AT, Russell CA. 2006. The Pass-Along Effect: Investigating Word-of-Mouth Effects on Online Survey Procedures. Journal of Computer-Mediated Communication 11(4): 1085–1103.
[11] Wilson C., Boe B., Sala A., Puttaswamy P., Zhao B., User Interactions in Social Networks and their Implications, Proceedings of EuroSys 2009, ACM.
[12] Skoric M., Poor N., Liao Y., Wei S., Online Organization of an Offline Protest: From Social to Traditional Media and Back, proceedings of HICSS 2011, retrieved from IEEE.
[13] Magnani M., Montesi D., Rossi L., Information propagation analysis in a social network site, proceedings of International Conference on Advances in Social Networks Analysis and Mining, 2010, IEEE.
[14] Gao H., Hu J., Wilson C., Li Z., Chen Y., Zhao B., Detecting and Characterizing Social Spam Campaigns, proceedings of IMC’10, ACM.