Today: Web – Millions of users
2
Today: Web – Millions of users
Large on-line applications
with hundreds of millions of
users
Users leave detailed traces
of their social activity.
Web is like a “laboratory” for
studying millions of people at once.
3
Connecting the dots…
As we connect the dots into a network patterns emerge...
4
Navigating the world’s social network
The network: 180M people, 1.3B undirected edges
5
The 6 degrees of separation
Small-world experiment [Milgram ‘67]
64 letters are forwarded from Nebraska to Boston
How many steps does it take?
Average path length is 6.26 degrees of separation
6
Microsoft Instant Messenger (180M people, 1.3B undirected edges)
Number of steps between 180
billion pairs of people
Avg. path length 6.6
90% of the people can be reached in < 8 hops
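The two statistics on this slide (average path length, and the share of pairs reachable within a hop limit) can be sketched with a breadth-first search from every node. This is a minimal illustration on a toy graph, not the method used for the Messenger study; the function names and the toy adjacency list are assumptions made for the example, and at 180M nodes one would presumably sample source nodes rather than search from all of them.

```python
from collections import deque


def hop_distribution(adj):
    """Count connected ordered pairs (source, target) by hop distance.

    adj maps each node to a list of its neighbours (undirected graph,
    so every edge appears in both adjacency lists).
    """
    counts = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # standard breadth-first search
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for node, d in dist.items():
            if node != source:
                counts[d] = counts.get(d, 0) + 1
    return counts


def average_path_length(counts):
    """Mean hop distance over all connected pairs."""
    total_pairs = sum(counts.values())
    return sum(d * c for d, c in counts.items()) / total_pairs


def fraction_within(counts, max_hops):
    """Share of connected pairs that are at most max_hops apart."""
    total_pairs = sum(counts.values())
    return sum(c for d, c in counts.items() if d <= max_hops) / total_pairs


# Hypothetical toy network: a simple path a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
counts = hop_distribution(toy)
print(average_path_length(counts))
print(fraction_within(counts, 2))
```

Working from the hop-count histogram keeps both statistics cheap to derive once the searches are done.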
7
Use Machine Learning to find the target person
Green bar is the probability that a node is good
Information propagation on the web
What does the web talk about?
1.6 million news media and blog sites
1 million articles a day
What do they talk about?
Who is imitating/copying whom?
10
Info propagation on the web
News media write articles that refer (link) to other articles, and the information spreads
Can track the information as it spreads and mutates over millions
of websites
11
Question…
- I have 10 minutes. Which news sites should I read to be most up to date?
- Who are the most influential bloggers?
12
Problem: Covering blogs
- Given a budget (e.g., of 3 blogs)
- Select blogs to cover the most of the blogosphere
- Bad news: Solving this exactly is NP-hard
- Good news: Theorem: Can do it in linear time and within factor 3 of optimal
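To make the covering problem concrete, here is a minimal sketch of the textbook greedy heuristic for budgeted coverage: repeatedly pick the blog that adds the most not-yet-covered items. It is not the linear-time algorithm the theorem refers to, it assumes every blog costs one unit of budget, and the blog names and topic labels are invented for the example.

```python
def greedy_cover(blog_topics, budget):
    """Pick up to `budget` blogs, each time taking the largest marginal gain.

    blog_topics maps a blog name to the set of items (topics, stories)
    that it covers. Returns the chosen blogs and the union they cover.
    """
    covered = set()
    chosen = []
    remaining = dict(blog_topics)
    for _ in range(budget):
        # Marginal gain = number of items this blog would newly cover.
        best = max(
            remaining,
            key=lambda blog: len(remaining[blog] - covered),
            default=None,
        )
        if best is None or not (remaining[best] - covered):
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Hypothetical toy blogosphere.
toy = {
    "blog_a": {"t1", "t2", "t3"},
    "blog_b": {"t3", "t4"},
    "blog_c": {"t4", "t5", "t6", "t7"},
    "blog_d": {"t1"},
}
chosen, covered = greedy_cover(toy, budget=2)
print(chosen, sorted(covered))
```

With a budget of 2, the sketch first takes the blog covering four topics and then the one that adds three more, skipping blogs whose content is already covered.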
Blogosphere
“topics”
So, who is influential? What should I read?
 k  Score  Blog                                          Posts  InLinks  OutLinks
 1  0.13   http://instapundit.com                         4593     4636      5255
 2  0.18   http://donsurber.blogspot.com                  1534     1206      3495
 3  0.22   http://sciencepolitics.blogspot.com             924      576      2701
 4  0.26   http://www.watcherofweasels.com                 261      941      3630
 5  0.29   http://michellemalkin.com                      1839    12642      6323
 6  0.32   http://blogometer.nationaljournal.com           189     2313      9272
 7  0.34   http://themodulator.org                         475      717      4944
 8  0.35   http://www.bloggersblog.com                     895      247     10201
 9  0.37   http://www.boingboing.net                      5776     6337      6183
10  0.38   http://atrios.blogspot.com                     4682     3205      3102
11  0.39   http://lawhawk.blogspot.com                    1862      463      6597
12  0.40   http://www.gothamist.com                       6223     3324     17172
13  0.41   http://mparent7777.livejournal.com            25925      199     47933
14  0.42   http://wheelgun.blogspot.com                   1174      128       939
15  0.43   http://gevkaffeegal.typepad.com/the_alliance    302      428      2481
www.blogcascades.org