social computation of emergent networks on user generated content
DESCRIPTION
Invited talk given at the "Web-Science" workshop at Informatik 2010, Leipzig, Germany
TRANSCRIPT
Knowledge Management Institute
Social Computation of Emergent Networks on User-Generated Content
GI Workshop on “Web-Science” at Informatik 2010, the 40th Annual Conference of the Gesellschaft für Informatik
Leipzig, Germany
Markus Strohmaier
Assistant Professor
Knowledge Management Institute
Graz University of Technology, Austria
e-mail: [email protected]
web: http://www.kmi.tugraz.at/staff/markus
Markus Strohmaier 2010
Social-Computational Systems
… is the title of a new National Science Foundation (NSF) Program.
the genesis of a new class of computational systems, which generate emergent behaviors that arise out of the complex and dynamic interactions among people and computers.
Source: National Science Foundation http://www.nsf.gov/pubs/2010/nsf10600/nsf10600.htm
3 observations:
• Rise of User-Generated Content
  – 5 out of the top 10 websites in the world have a focus on user-generated content (Alexa.com 2010)
• Rise of Online Social Networks
  – More than 500 million active Facebook users, 50% log on in any given day (Facebook 2010)
• Integration of user data and system functionality
  – User data becomes an integral part of system functions
(Facebook 2010) https://www.facebook.com/press/info.php?statistics
Social Computational Systems
Interaction between individuals and computational systems is mediated by the aggregate behavior of users.
Social Computation influences system properties (X)
It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, that system properties and functions emerge.
X = Findability, X = Utility, X = Navigability, X = Relevance
System Properties of Social-Computational Systems
• Findability: the ease with which a document can be found by a user
• Utility: the degree to which a system maximizes usefulness of its functions for users
• Navigability: the ease with which a user can navigate from A to B
• Relevance: the extent to which offered information is considered relevant
• Privacy: the extent to which private information is kept private
• Profit: the extent to which functions can be monetized
• … influenced by social computation processes
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
Example: X = Connectivity (of the web graph)
Questions:
• What is X like? → bow-tie architecture of the web [Broder et al 2000]
• What causes X?
Example: X = Connectivity (of the web graph)
Questions:
• What is X like? → bow-tie architecture of the web [Broder et al 2000]
• What causes X? → Social mechanisms, such as preferential attachment [Barabasi 1999]
• How can we improve X? → an open issue
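Preferential attachment, named above as one social mechanism behind web connectivity, can be illustrated with a small growth simulation. This is a minimal sketch under stated assumptions, not the model used in the cited work: each new node adds exactly one link, and the function names and the node count of 1000 are chosen here for illustration only.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a graph node by node; each new node links to one existing node
    picked with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # start from a single edge
    endpoints = [0, 1]      # each node appears once per incident edge
    for new in range(2, n_nodes):
        # Sampling uniformly from the endpoint list is degree-proportional.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

def degrees(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

edges = preferential_attachment(1000)
deg = degrees(edges)
print("highest degree:", max(deg.values()))
print("nodes with degree 1:", sum(1 for d in deg.values() if d == 1))
```

Early nodes tend to accumulate many links while most late nodes keep a single link, which is the "rich get richer" effect the mechanism is known for.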
Social Computational Systems: What type of questions are we asking?
e.g. X = Connectivity of the web graph
• Description and Classification:
  – What is X like?
  – What are its properties?
  – How can it be categorized?
  – How can we measure it?
• Descriptive Process:
  – How does X work?
  – What is the process by which X happens?
  – How does X evolve?
• Descriptive Comparative:
  – How does X differ from Y?
• Relationship:
  – Are X and Y related?
  – Do occurrences of X correlate with occurrences of Y?
• Causality:
  – Does X cause Y?
  – Does X prevent Y?
  – What causes X?
  – What effect does X have on Y?
• Causality - Comparative:
  – Does X cause more Y than does Z?
  – Is X better at preventing Y than is Z?
  – Does X cause more Y than does Z under one condition but not others?
• Design:
  – What is an effective way to achieve X?
  – How can we improve X?
cf. [Easterbrook et al. 2007]: Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007
Attempting a Definition: Social-Computational Systems
…refer to systems in which essential system properties and functions (“X”) are influenced by the behavior of users.
Thus, certain system properties and functions are not engineered by a single person, but they are emergent, i.e. the result of aggregating information from a large group of users.
In this sense, certain system properties and functions of social-computational systems are beyond the direct control of system designers.
New approaches for designing and shaping social-computational systems are needed.
The Dual Nature of Web-Science
Science: What is X like?
Engineering: Improve X? Prevent Y? (typically beyond control)
social computation = social behavior + algorithmic computation
→ through aggregation → emergent social-computational system properties and functions
Social Computational Systems: What type of questions are we asking?
Today's talk: X1 = Navigability, X2 = Semantics of User-Generated Content
• Description and Classification:
  – What is X like?
  – What are its properties?
  – How can it be categorized?
  – How can we measure it?
• Descriptive Process:
  – How does X work?
  – What is the process by which X happens?
  – How does X evolve?
• Descriptive Comparative:
  – How does X differ from Y?
• Relationship:
  – Are X and Y related?
  – Do occurrences of X correlate with occurrences of Y?
• Causality:
  – Does X cause Y?
  – Does X prevent Y?
  – What causes X?
  – What effect does X have on Y?
• Causality - Comparative:
  – Does X cause more Y than does Z?
  – Is X better at preventing Y than is Z?
  – Does X cause more Y than does Z under one condition but not others?
• Design:
  – What is an effective way to achieve X?
  – How can we improve X?
cf. [Easterbrook et al. 2007]: Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
X1 = Navigability
Question: How can we Measure and Improve Navigability in Social Tagging Systems?
Tag clouds as an instrument for navigation
Tag Clouds are Supposed to be Efficient Tools for Navigating Tagging Systems
The Navigability Assumption:
• An implicit assumption among designers of social tagging systems that tag clouds are specifically useful to support navigation.
• This has hardly been tested or critically reflected upon in the past.
Navigating tagging systems via tag clouds:
1) The system presents a tag cloud to the user.
2) The user selects a tag from the tag cloud.
3) The system presents a list of resources tagged with the selected tag.
4) The user selects a resource from the list of resources.
5) The system transfers the user to the selected resource, and the process potentially starts anew.
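The five navigation steps can be sketched as a simple loop over a tag-resource mapping. This is an illustrative sketch only: the random user choices, the function name `navigate` and the toy data are assumptions made here, not part of the talk.

```python
import random

def navigate(resource_tags, start, steps, seed=0):
    """Simulate tag-cloud navigation starting from one resource.
    resource_tags maps each resource to the set of tags assigned to it."""
    rng = random.Random(seed)
    # Invert the mapping: which resources carry each tag?
    tag_resources = {}
    for resource, tags in resource_tags.items():
        for tag in tags:
            tag_resources.setdefault(tag, []).append(resource)
    current, trail = start, [start]
    for _ in range(steps):
        cloud = sorted(resource_tags[current])   # 1) system presents a tag cloud
        if not cloud:
            break                                # no tags, nowhere to go
        tag = rng.choice(cloud)                  # 2) user selects a tag
        listing = sorted(tag_resources[tag])     # 3) system lists tagged resources
        current = rng.choice(listing)            # 4) user selects a resource
        trail.append(current)                    # 5) transfer, then start anew
    return trail

# Hypothetical toy tagging system with three resources.
toy = {
    "r1": {"python", "web"},
    "r2": {"web", "design"},
    "r3": {"design", "art"},
}
trail = navigate(toy, "r1", steps=5)
print(trail)
```

Every hop in the trail connects two resources that share at least one tag, which is why the tag-resource network determines where a user can get to.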
Navigability of Social Tagging Systems
Question: How do
(i) the size of tag clouds and
(ii) the number of resources / tag
influence the navigability (X1) of social tagging systems?
Established systems, many users
New system, few users
Defining Navigability
A network is navigable iff:
There is a path between all or almost all pairs of nodes in the network.
Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
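The two conditions can be checked mechanically on a small graph. The sketch below rests on several assumptions that are not in the talk: "giant component" is taken to mean at least 90% of all nodes, the average shortest path length inside that component stands in for the effective diameter, the logarithm is base 2, and the three 9-node graphs are hypothetical test cases.

```python
import math
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_component(adj):
    """Node set of the largest connected component of an undirected graph."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best

def is_navigable(adj, giant_fraction=0.9):
    """Giant component exists AND average shortest path <= log2(n)."""
    n = len(adj)
    giant = largest_component(adj)
    if n == 0 or len(giant) < giant_fraction * n:
        return False
    total = pairs = 0
    for u in giant:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                total += d
                pairs += 1
    if pairs == 0:
        return True  # a single node is trivially navigable
    return total / pairs <= math.log2(n)

def undirected(edges, nodes):
    """Build an adjacency mapping from an undirected edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

nodes = range(9)
star = undirected([(0, i) for i in range(1, 9)], nodes)    # one hub, 8 leaves
path = undirected([(i, i + 1) for i in range(8)], nodes)   # a 9-node chain
split = undirected([(0, 1), (1, 2), (3, 4), (4, 5), (6, 7), (7, 8)], nodes)
print(is_navigable(star), is_navigable(path), is_navigable(split))
```

In these toy cases the star passes both conditions, the chain has a giant component but an average shortest path above log2(9), and the split graph has no giant component at all.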
Navigability: Examples
Example 1:
Not navigable: No giant component
Example 2:
Not navigable: giant component, BUT avg. shortest path > log2(9)
Navigability: Examples
Example 3:
Navigable: Giant component AND avg. shortest path ≤ 2 < log2(9)
Is this efficiently navigable?
There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?
Efficiently navigable
A network is efficiently navigable iff:
There is an algorithm that can find a short path with only local knowledge (with branching factor k), and the delivery time of the algorithm is bounded polynomially by log_k(n).
Example 4: [Figure: a network with nodes A, B, C]
Efficiently navigable, if the algorithm knows it needs to go through A, B, C
J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
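What "local knowledge only" means can be illustrated with greedy routing: at every step the algorithm inspects only the neighbors of the current node and moves to the one that looks closest to the target. The sketch below is an assumption-laden illustration, not the algorithm evaluated in the talk: the 16-node ring with one shortcut, the ring distance used as background knowledge, and all function names are hypothetical.

```python
def ring_distance(a, b, n):
    """Distance between positions a and b on a ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)

def greedy_route(neighbors, source, target, distance):
    """Route using local knowledge only: always step to the neighbor that is
    closest to the target; give up if no neighbor improves on the current node."""
    path = [source]
    current = source
    while current != target:
        if not neighbors[current]:
            return None
        best = min(neighbors[current], key=lambda v: distance(v, target))
        if distance(best, target) >= distance(current, target):
            return None  # stuck in a local minimum
        path.append(best)
        current = best
    return path

n = 16
# Each node knows its two ring neighbors ...
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
# ... plus one long-range shortcut between nodes 0 and 8.
neighbors[0].append(8)
neighbors[8].append(0)

def dist(a, b):
    return ring_distance(a, b, n)

print(greedy_route(neighbors, 0, 9, dist))
print(greedy_route(neighbors, 0, 4, dist))
```

The route to node 9 takes the shortcut because node 8 looks closest to the target, while the route to node 4 walks along the ring. The algorithm never sees the whole graph, only the neighbor lists it visits.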
User Interface constraints
Tag Cloud Size n
n: number of tags shown per tag cloud (topN most common algorithm)
Pagination of resources / tag
k: number of resources shown per page (reverse chronological ordering)
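Both constraints are easy to state in code. The following sketch assumes a hypothetical post format of `(timestamp, resource, tags)` and function names chosen here for illustration; it shows a top-n tag cloud and a reverse-chronological result page of size k.

```python
from collections import Counter

def tag_cloud(posts, n):
    """Top-n most common tags across the given posts."""
    counts = Counter(tag for _, _, tags in posts for tag in tags)
    return [tag for tag, _ in counts.most_common(n)]

def resource_page(posts, tag, k, page=0):
    """Resources carrying `tag`, newest first, k per page."""
    tagged = sorted(
        ((ts, res) for ts, res, tags in posts if tag in tags),
        reverse=True,
    )
    return [res for _, res in tagged[page * k:(page + 1) * k]]

# Hypothetical posts: (timestamp, resource, tags)
posts = [
    (1, "r1", {"web", "python"}),
    (2, "r2", {"web"}),
    (3, "r3", {"web", "design"}),
    (4, "r4", {"web", "python", "design"}),
    (5, "r5", {"python"}),
]
print(tag_cloud(posts, 2))
print(resource_page(posts, "web", 2))
```

With k = 2, only the two newest resources tagged "web" appear on the first page; older ones are reachable from that tag only by paging further.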
How UI constraints affect Navigability
Tag Cloud Size: Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).
Pagination: BUT limiting the out-degree of high-frequency tags k (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys navigability of prevalent approaches to tag clouds.
Findings
1. For certain specific, but popular, tag cloud scenarios, the so-called Navigability Assumption does not hold.
2. While we could confirm that tag-resource networks have efficient navigational properties in theory, we found that popular user interface decisions significantly impair navigability.
These results make a theoretical and an empirical argument against existing approaches to tag cloud construction.
How can we improve the navigability of social tagging systems?
Recovering Navigability in Social Tagging Systems
Instead of reverse-chronological ordering of resources, we apply a random ordering.
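Why the ordering matters can be seen in a deliberately extreme toy case. The sketch below assumes five hypothetical tags that are all assigned to the same 100 resources and a page size of k = 10; the function name and data are illustrative and not taken from the talk or the underlying study.

```python
import random

def linked_resources(tag_to_resources, k, order, seed=0):
    """Resources that still receive a link from at least one tag page
    when every tag page shows only k resources."""
    rng = random.Random(seed)
    linked = set()
    for resources in tag_to_resources.values():
        if order == "reverse_chronological":
            page = resources[-k:]   # lists are stored oldest first
        elif order == "random":
            page = rng.sample(resources, min(k, len(resources)))
        else:
            raise ValueError(f"unknown order: {order}")
        linked.update(page)
    return linked

resources = list(range(100))                          # 0 = oldest, 99 = newest
tags = {f"tag{i}": resources for i in range(5)}       # every tag on every resource
chrono = linked_resources(tags, 10, "reverse_chronological")
rand = linked_resources(tags, 10, "random")
print(len(chrono), len(rand))
```

Under reverse-chronological ordering every tag page shows the same ten newest resources, so only those ten stay linked. With random ordering each tag page shows a different sample, so the set of linked resources can only be the same size or larger.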
Efficient Navigability in Social Tagging Systems
Instead of random ordering, we use hierarchical background knowledge for ranking paginated resources [Kleinberg 2001].
J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.
Social Computational Systems: Implications
• Navigability in social tagging systems is an emergent system property
• Some of our initial intuitions about navigability (tag clouds) are wrong
• The UI represents an opportunity to influence emergent system properties
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
X2 = Semantics
Question: How can we Measure and Influence Emergent Semantics in Social Tagging Systems?
Emergent Semantic Structures
[Lerman et al 2010]
Pragmatics influence emergent properties
Motivations for Tagging
• Future Retrieval
• Contribution and Sharing
• Attracting Attention (Flickr)
• Play and Competition (ESP Game)
• Self Presentation
• Opinion Expression
• Task Organization (“toread”)
• Social Signalling (“for:scott”)
• Money (Amazon Mechanical Turk)
• Categorization / Description
Kinds of Tags
• Content-based
• Context-based
• Attribute Tags
• Ownership Tags
• Subjective Tags
• Organizational Tags
• Purpose Tags
• Factual Tags
• Personal Tags
• Self-referential tags
• Tag Bundles
This suggests that emergent semantics are influenced by the underlying motivation for tagging (cf., for example, [Heckner 2009])
Gupta et al. 2010
Why Do Users Tag?
One (of many) answers: To categorize or to describe resources

                       Categorizer (C)    Describer (D)
Goal                   later browsing     later search
Change of vocabulary   costly             cheap
Size of vocabulary     limited            open
Tags                   subjective         objective

[Figure: example tag clouds]
Semantic Assumption: Categorizers produce more precise emergent semantics than Describers.
M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.
Measures for Tagging Pragmatics vs. Tag Semantics
Categorizer/Describer:
• Size of tag vocabulary
• Tags per resource
• Tags per post
• Orphaned tags
Semantics: [Cattuto et al 2008]
• Co-occurrence count
• Cosine similarity (TagCont)
• FolkRank [Hotho et al 2006]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
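The four pragmatic measures can be computed from a single user's posts. The slide only names them, so the exact definitions below are assumptions made for illustration: tags per resource is taken as vocabulary size divided by the number of resources, and a tag counts as orphaned if it was used at most `orphan_max_uses` times. The cited paper may define them differently.

```python
from collections import Counter

def pragmatic_measures(posts, orphan_max_uses=1):
    """posts: list of (resource, set_of_tags) for ONE user.
    Returns simple indicators of categorizer vs. describer behavior."""
    if not posts:
        raise ValueError("need at least one post")
    uses = Counter(tag for _, tags in posts for tag in tags)
    if not uses:
        raise ValueError("need at least one tag assignment")
    resources = {res for res, _ in posts}
    vocab = len(uses)
    orphans = sum(1 for c in uses.values() if c <= orphan_max_uses)
    return {
        "vocabulary_size": vocab,
        "tags_per_resource": vocab / len(resources),
        "tags_per_post": sum(uses.values()) / len(posts),
        "orphan_ratio": orphans / vocab,
    }

# Hypothetical users: a small, reused vocabulary vs. many one-off tags.
categorizer = [("r1", {"work"}), ("r2", {"work"}), ("r3", {"fun"}), ("r4", {"fun"})]
describer = [("r1", {"python", "code", "tutorial"}), ("r2", {"python", "web", "django"})]
print(pragmatic_measures(categorizer))
print(pragmatic_measures(describer))
```

In this toy data the describer has the larger vocabulary, more tags per post and a high share of orphaned tags, while the categorizer reuses two tags across all resources.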
Experimental Setup
As a dataset, we used
• a crawl from Delicious (University of Kassel)
• from November 2006 (containing 667,128 users)
• the 10,000 most common tags, minimum of 100 resources / user
For semantic grounding, we used
• WordNet as a knowledge base (cf. [Cattuto et al. 2008])
• Jiang-Conrath as a measure of similarity
  – combines the taxonomic path length between two nodes in WordNet with an information-theoretic similarity measure [Jiang and Conrath 1997]
• A WordNet library as an implementation
  – by [Pedersen et al 2004]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
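The Jiang-Conrath measure is usually written as a distance over information content. The notation below is introduced here and does not appear on the slides: p(c) is the corpus probability of concept c, and lcs(c_1, c_2) is the lowest common subsumer of the two concepts in the WordNet taxonomy.

```latex
\begin{align}
  IC(c) &= -\log p(c) \\
  d_{JC}(c_1, c_2) &= IC(c_1) + IC(c_2) - 2\, IC\bigl(\mathrm{lcs}(c_1, c_2)\bigr)
\end{align}
```

A small distance means the two concepts share a specific (high information content) common ancestor. Implementations commonly report the inverse of this distance as a similarity score.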
Results
Describers outperform categorizers on precision of emergent tag semantics
• Categorizers perform worse than random
• Describers perform better than random
[Figure: precision of emergent tag semantics for Describers and Categorizers compared with random users, on a scale from worse to better]
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
Social Computational Systems: Implications
• Semantics in social tagging systems is an emergent system property
• Some of our initial intuitions about semantics are wrong
  – describers outperform categorizers on a particular task
• User behavior influences emergent system properties
Agenda
1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future
Social-Computational Systems: Conclusions
1. Certain properties of social-computational systems (such as navigability or semantics) are emergent properties; they are beyond the direct influence of system designers
2. The user interface is an opportunity to influence these emergent properties
3. If user motivation or behavior changes over time, system properties may change.
It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, that system properties and functions emerge.
Web-Science: A Call to Action
As web scientists, we need to
• study and map the complex relationships between user behavior, user interfaces and emergent properties
• understand the potentials and limits of influencing emergent system properties
As web engineers, we need to
• shift perspective away from designing towards shaping social-computational systems
• reconcile user behaviors with desired system properties
End of Presentation
Thank you!
Markus Strohmaier
Graz University of Technology, Austria
in collaboration with: H.P. Grahsl, D. Helic, C. Körner, R. Kern, C. Trattner, D. Benz, A. Hotho, G. Stumme
Related Publications
• Intent and motivation in social media
M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.
• Social computation and emergent structures
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
D. Helic, C. Trattner, M. Strohmaier and K. Andrews, On the Navigability of Social Tagging Systems, The 2nd IEEE International Conference on Social Computing (SocialCom 2010), Minneapolis, Minnesota, USA, 2010.
• Knowledge acquisition from social media
C. Wagner, M. Strohmaier, The Wisdom in Tweetonomies: Acquiring Latent Conceptual Structures from Social Awareness Streams, Semantic Search 2010 Workshop (SemSearch2010), in conjunction with the 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.