Weibo, and a Tale of Two Worlds
Wentao Han1, Xiaowei Zhu1, Ziyan Zhu1, Wenguang Chen1, Weimin Zheng1, Jianguo Lu2
1Tsinghua University, China
2University of Windsor, Canada
Tsinghua University and University of Windsor
What is Weibo?
[Figure: map of user activities of Twitter (42 million) and Weibo (37 million). Map labels: Asia, N. Korea, S. Korea, Maldives, Australia, Europe, North America.]
Weibo and Twitter
Weibo and Twitter platforms
Property             Weibo             Twitter
Message length       140               140
Language             Chinese (mostly)  Many
Directed network     yes               yes
2000 out-link limit  yes               yes (with exceptions)
Weibo has hundreds of millions of (registered/active) users.
Mostly disjoint from Twitter users geographically
Not well studied
Only a few studies, based on small samples
Compare: two complete Twitter networks were studied (the 2009 network and the 2014 network)
Goals
Are the two worlds similar?
Do the people in the two worlds interact in a similar way?
Can the world represented by Weibo reflect the real world?
Experiment
We crawled an almost complete user network
exhaustively follow the out-links;
collect users who have at least one in-link;
between November 2012 and February 2013
282 million nodes
27 billion links
Why is it a complete network?
Few new users could be crawled (689 duplicates to obtain one new user)
Consistent with the official data on the number of registered users
Official registered user size is 500 million;
We carried out a uniform random sampling and found that 40% of the users have no in-links;
282 million are all the users who have in-links
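The consistency argument above is simple arithmetic and can be reproduced as a rough sketch. It uses only the figures quoted on this slide; the variable names are illustrative.

```python
# Rough consistency check using only the figures quoted on the slide.
registered_users = 500e6       # official number of registered users
share_without_inlinks = 0.40   # estimated from the uniform random sample
crawled_users = 282e6          # users reached by following out-links

# A crawl that follows out-links can only reach users with at least one in-link.
expected_reachable = registered_users * (1 - share_without_inlinks)
coverage = crawled_users / expected_reachable

print(f"expected reachable users: {expected_reachable / 1e6:.0f} million")
print(f"crawl coverage of reachable users: {coverage:.0%}")
```

A coverage close to 100% is what one would expect if the crawl reached nearly every user who has an in-link.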
Twitter 2009 data
Publicly available
Both networks are observed about three years after their inception
H. Kwak, C. Lee, H. Park, and S. Moon. What is Twitter, a social network or a news media? In WWW, pages 591-600. ACM, 2010.
Network  #Nodes (×10^6)  #Links (×10^9)  Mean degree  Max (×10^6)  Std   CV  Simpson (×10^-4)  Gini  Path length  SCC (%)
Weibo    222             27              121          25           9028  74  0.25              0.88  3.44         92
Twitter  41              1.4             35           2.9          2419  69  1.13              0.83  4.12         80
Table: Statistics of Weibo and Twitter 2009 data. Max, standard deviation (Std), coefficient of variation (CV), Simpson index, and Gini index are for in-degrees.
Degrees
In-degree distribution
In-degrees resemble a power law
Exponent is around -1
Degree vs. Rank plot
This degree/rank plot is good for viewing the top bloggers
Not clear for bloggers with small degrees
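As a minimal sketch of how an exponent could be read off a degree vs. rank plot, the snippet below fits a straight line to log(degree) against log(rank). The least-squares fit is an assumption made for illustration (the slides do not say how the exponent was estimated), and the input is synthetic data, not the Weibo crawl.

```python
import math

def rank_degree_slope(degrees):
    """Least-squares slope of log(degree) against log(rank), ranks starting at 1."""
    ranked = sorted((d for d in degrees if d > 0), reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(d) for d in ranked]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Synthetic in-degrees following degree = 1e6 / rank (made-up data).
synthetic = [1e6 / r for r in range(1, 1001)]
slope = rank_degree_slope(synthetic)
print(round(slope, 3))
```

A fit on log-log axes is a crude estimator, so on real data it is only a quick visual check.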
Degrees
In-degree distribution (freq. vs. degree plot)
In the frequency/degree plot, the exponent is around -2 (-1 plus the -1 of the previous plot)
Good to view the bloggers with small degrees
Spammers buy in-links in bulk lots of 10K and 50K.
Degrees
Out-degrees
There are spikes around the 2000 out-degree limit, for both Weibo and Twitter
Weibo imposes a more rigid 2000 limit
Degrees
Degrees can tell more information ...
Your friends are richer than you are
Your friends have more friends than you do
Weibo: 5567 times more
Twitter: 4761 times more
This ratio comes from the coefficient of variation of the degrees (it is roughly CV squared)
Diversity index: the probability of two links pointing to the same user
Weibo: 0.25 × 10^-4
Twitter: 10^-4
Gini coefficient, inequality of the in-degrees
Weibo: 0.88
Twitter: 0.83
Higher than the inequality of wealth in any economy
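The three degree statistics on this slide can be sketched from a plain list of in-degrees. This is an illustrative sketch and not the code behind the reported numbers: the function name is hypothetical, and the Simpson index is computed assuming the two links are drawn with replacement.

```python
def degree_stats(in_degrees):
    """Return (friend ratio, Simpson index, Gini coefficient) for an in-degree list."""
    n = len(in_degrees)
    total = sum(in_degrees)
    mean = total / n
    second_moment = sum(d * d for d in in_degrees) / n

    # Mean in-degree of the target of a random link, divided by the plain mean.
    # Algebraically this equals 1 + CV^2.
    friend_ratio = (second_moment / mean) / mean

    # Simpson index: probability that two links drawn at random (with replacement)
    # point to the same user.
    simpson = sum((d / total) ** 2 for d in in_degrees)

    # Gini coefficient computed from the degrees sorted in ascending order.
    ordered = sorted(in_degrees)
    weighted = sum((i + 1) * d for i, d in enumerate(ordered))
    gini = 2 * weighted / (n * total) - (n + 1) / n

    return friend_ratio, simpson, gini

print(degree_stats([1, 1, 1, 1]))  # perfectly equal degrees
print(degree_stats([0, 0, 0, 4]))  # one user receives every link
```

With equal degrees the friend ratio is 1 and the Gini coefficient is 0; concentrating all links on one user pushes all three statistics up.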
Degrees
Celebrities socialize with celebrities?
Intuitively, yes.
The Pearson correlation coefficient between the in-degrees of the two ends of a following relationship is negative.
A tempting conclusion is that popular users tend to follow unpopular users
Surprising result claimed by Myers et al.
S. A. Myers, A. Sharma, P. Gupta, and J. Lin. Information network or social network?: the structure of the Twitter follow graph. In WWW, pages 493-498, 2014.
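To make the quantity concrete, here is a small sketch that computes the Pearson correlation between the in-degree of the follower and the in-degree of the followee over all links. The toy graph and the function names are hypothetical; the point is only that a few highly followed accounts are enough to drive the coefficient negative.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def indegree_correlation(edges):
    """Correlate in-degree of the follower with in-degree of the followee, per link."""
    indeg = {}
    for src, dst in edges:
        indeg[dst] = indeg.get(dst, 0) + 1
        indeg.setdefault(src, 0)
    xs = [indeg[src] for src, _ in edges]
    ys = [indeg[dst] for _, dst in edges]
    return pearson(xs, ys)

# Toy graph (made up): five fans follow one celebrity (node 0), who follows two back.
edges = [(i, 0) for i in range(1, 6)] + [(0, 1), (0, 2)]
r = indegree_correlation(edges)
print(r)
```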
Degrees
Celebrities do tend to follow celebrities
When the relation is not linear, the Pearson correlation coefficient is not enough
'Rich' follows 'rich', but 'poor' also follows 'rich'
Degrees
Possible explanation
Users start by following celebrities (high target degree)
then normal users (target degree drops)
Celebrities do not follow many normal users
Why do both Weibo and Twitter bottom out at 2000?
Reciprocity
Reciprocity
Reciprocity for most users, who have fewer than 100 in-/out-degrees.
Weibo is much lower than Twitter
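The slides do not spell out the definition of reciprocity. A common one, assumed here, is the fraction of directed links whose reverse link also exists. A minimal sketch on a made-up edge list:

```python
def reciprocity(edges):
    """Fraction of directed links (u, v) for which (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for u, v in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)

# Made-up example: a and b follow each other; the other two links are one-way.
edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
print(reciprocity(edges))
```

Restricting the edge list to the links of one user, or of users in one degree range, would give a per-group value in the same way.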
Reciprocity
Reciprocity
Overall view of reciprocity for all kinds of people.
Twitter has a much higher reciprocity;
Their reciprocity fluctuates in a similar pattern;
Privileged Twitter users (out-degree > 2000) reciprocate almost every follower
Clustering
clustering coefficient
CC is slightly higher in Twitter
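For reference, a minimal sketch of a local clustering coefficient: the share of a node's neighbour pairs that are themselves connected. Treating the follow graph as undirected is a simplifying assumption made here; the slides do not say which variant was used, and the small graph is made up.

```python
from itertools import combinations

def local_clustering(adj, node):
    """adj maps each node to the set of its neighbours (undirected graph)."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    closed = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return closed / (k * (k - 1) / 2)

# Made-up graph: a, b, c form a triangle; d hangs off a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for node in sorted(adj):
    print(node, local_clustering(adj, node))
```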
Clustering
clustering coefficient
Hypothesis
CC is higher for more economically developed regions
Clustering
top bloggers
[Figure: PageRank vs. degree for the top bloggers, on logarithmic axes. Labeled accounts: WeiboAssistant, Weibo Customer Service, Yao Chen, He Jiong, Xie Na, Dee Hsu, Wang Leehom, Zhao Wei, Yang Mi, Kevin Tsai, Zhou Libo, Wen Zhang, Chen Kun, Barbie Hsu, Kai-Fu Lee, Ruby Lin, Christine Fan, Li Bingbing, Weibo Android Client, Guo Degang. Reference line: PageRank proportional to degree.]
Clustering
pagerank vs. in-degree
[Figure: PageRank vs. degree in four panels, (A) to (D), on logarithmic axes; axes: PageRank and degree.]
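For readers who want to reproduce a PageRank vs. in-degree comparison on a small graph, here is a minimal power-iteration sketch. The damping factor of 0.85, the iteration count, and the handling of dangling nodes are conventional choices assumed here; the slides do not state the settings that were used, and the graph is made up.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """out_links maps every node to the list of nodes it follows."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank

# Made-up graph: three fans follow "star", and "star" follows one fan back.
out_links = {"fan1": ["star"], "fan2": ["star"], "fan3": ["star"], "star": ["fan1"]}
rank = pagerank(out_links)
for node in sorted(rank, key=rank.get, reverse=True):
    print(node, round(rank[node], 4))
```

Here the node with the highest in-degree also has the highest PageRank, while "fan1" ranks well above the other fans only because "star" follows it.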
Clustering
weibo adoption rate
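[Figure: scatter plot of GDP (CNY) per capita (vertical axis, 10000 to 70000) against Weibo accounts per capita (horizontal axis, 0.1 to 0.8). Points are labeled by region code: AH, BJ, CQ, FJ, GD, GS, GX, GZ, HA, HB, HE, HI, HL, HN, JL, JS, JX, LN, NM, NX, QH, SC, SD, SH, SN, SX, TJ, XJ, XZ, YN, ZJ.]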
Clustering
Connections between regions
[Figure: heat map of connection strength between regions, color scale from -0.05 to 0.15. Rows and columns in the order GD AB HK MO FJ TW BJ TJ HE SX SD NM LN JL HL HA XZ QH SN GS NX XJ JX HB HN HI GX OT CQ SC GZ YN SH JS ZJ AH.]
Closest pairs: Chongqing and Sichuan; Tibet and Qinghai; Hong Kong, Macau, and Guangdong
Clustering
Connections between regions
[Figure: dendrogram of the regions, leaf order GD AB HK MO FJ TW BJ TJ HE SX SD NM LN JL HL HA XZ QH SN GS NX XJ JX HB HN HI GX OT CQ SC GZ YN SH JS ZJ AH; vertical axis from 0.4 to 0.52.]
result of running hierarchical agglomerative clustering
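A minimal sketch of hierarchical agglomerative clustering follows. The slides do not state the linkage criterion or how distances between regions were derived, so average linkage and the tiny made-up distance table below are assumptions for illustration only.

```python
def agglomerative(labels, dist):
    """Average-linkage agglomerative clustering.

    labels: list of item names.
    dist: dict mapping frozenset({a, b}) to the distance between items a and b.
    Returns the merges in order, as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in labels]
    merges = []

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Made-up distances between three hypothetical items.
dist = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.6,
}
merges = agglomerative(["A", "B", "C"], dist)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The sequence of merges and their distances is exactly what a dendrogram draws, with the merge distance on the vertical axis.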
Clustering
mutual information of HK and TW
[Figure: normalized point-wise mutual information for TW and HK; horizontal axis from 0 to 40, vertical axis from -0.15 to 0.25; series: TW, HK.]
Normalized point-wise MI: zero means the strength is the same as in a random graph
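A sketch of normalized point-wise mutual information for links between two regions is given below. The probabilities are assumed to be link fractions taken from a table of region-to-region link counts; that table layout, the function name, and the counts are hypothetical, since the slides do not show how the measure was set up.

```python
import math

def npmi(link_counts, src, dst):
    """Normalized point-wise MI of links going from region src to region dst.

    link_counts maps (source_region, target_region) to a number of links.
    """
    total = sum(link_counts.values())
    p_xy = link_counts.get((src, dst), 0) / total
    p_x = sum(c for (s, _), c in link_counts.items() if s == src) / total
    p_y = sum(c for (_, t), c in link_counts.items() if t == dst) / total
    if p_xy == 0:
        return -1.0  # conventional limit when the pair never occurs
    if p_xy == 1:
        raise ValueError("NPMI is undefined when every link is in one pair")
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)

# Made-up counts: links are spread exactly as independence would predict.
uniform = {("X", "X"): 25, ("X", "Y"): 25, ("Y", "X"): 25, ("Y", "Y"): 25}
# Made-up counts: each region mostly links to itself.
clustered = {("X", "X"): 40, ("X", "Y"): 10, ("Y", "X"): 10, ("Y", "Y"): 40}

print(npmi(uniform, "X", "Y"))
print(npmi(clustered, "X", "X"), npmi(clustered, "X", "Y"))
```

Values lie between -1 and 1: positive means the two regions are linked more often than chance would predict, negative means less often.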
Clustering
Take away messages
We have the largest OSN user network available for research;
Similarities and differences between the two worlds:
degree variation is similar
reciprocity is much higher in Twitter
Connection to the real world
quantify the connection strength between regions