Social Computation of Emergent Networks on User-Generated Content

Knowledge Management Institute. Social Computation of Emergent Networks on User-Generated Content. GI Workshop on “Web-Science” at Informatik 2010, the 40th Annual Meeting of the Gesellschaft für Informatik, Leipzig, Germany. Markus Strohmaier, Assistant Professor, Knowledge Management Institute, Graz University of Technology, Austria. e-mail: [email protected] web: http://www.kmi.tugraz.at/staff/markus

Upload: markus-strohmaier

Post on 08-May-2015


DESCRIPTION

invited talk given at the "Web-Science" workshop at Informatik 2010, Leipzig, Germany

TRANSCRIPT

Page 1: Social computation of emergent networks on user generated content

Knowledge Management Institute

Social Computation of Emergent Networks on User-Generated Content

GI Workshop on “Web-Science” at Informatik 2010, the 40th Annual Meeting of the Gesellschaft für Informatik

Leipzig, Germany

Markus Strohmaier
Assistant Professor
Knowledge Management Institute
Graz University of Technology, Austria
e-mail: [email protected]
web: http://www.kmi.tugraz.at/staff/markus

Markus Strohmaier 2010

Page 2: Social computation of emergent networks on user generated content

Social-Computational Systems
… is the title of a new National Science Foundation (NSF) Program.

the genesis of a new class of computational systems, which generate emergent behaviors that arise out of the complex and dynamic interactions among people and computers.

Source: National Science Foundation http://www.nsf.gov/pubs/2010/nsf10600/nsf10600.htm

3 observations:
• Rise of User-Generated Content
  – 5 out of the top 10 websites in the world have a focus on user-generated content (Alexa.com 2010)
• Rise of Online Social Networks
  – More than 500 million active Facebook users, 50% log on any given day (Facebook 2010)
• Integration of user data and system functionality
  – User data becomes an integral part of system functions

(Facebook 2010) https://www.facebook.com/press/info.php?statistics

Page 3: Social computation of emergent networks on user generated content

Social Computational Systems

Interaction between individuals and computational systems is mediated by the aggregate behavior of users.

Page 4: Social computation of emergent networks on user generated content

Social Computation influences system properties (X)

X = Utility, X = Findability, X = Navigability, X = Relevance

It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, that system properties and functions emerge.

Page 5: Social computation of emergent networks on user generated content

System Properties of Social-Computational Systems

• Findability: the ease with which a document can be found by a user
• Utility: the degree to which a system maximizes usefulness of its functions for users
• Navigability: the ease with which a user can navigate from A to B
• Relevance: the extent to which offered information is considered relevant
• Privacy: the extent to which private information is kept private
• Profit: the extent to which functions can be monetized
• … influenced by social computation processes

Page 6: Social computation of emergent networks on user generated content

Agenda

1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future


Page 8: Social computation of emergent networks on user generated content

Example: X = Connectivity (of the web graph)

Questions:
• What is X like?
• What causes X?

Bow-tie architecture of the web [Broder et al 2000]

Page 9: Social computation of emergent networks on user generated content

Example: X = Connectivity (of the web graph)

Questions:
• What is X like?
  – Bow-tie architecture of the web [Broder et al 2000]
• What causes X?
  – Social mechanisms, such as preferential attachment [Barabasi 1999]
• How can we improve X?
  – An open issue
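Preferential attachment can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not the full model from the cited work: each new node links to exactly one existing node, chosen with probability proportional to that node's current degree. The function name, the single-link rule, and the seed are illustrative choices.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a graph where each new node links to one existing node,
    chosen with probability proportional to that node's degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]  # start from a single edge between nodes 0 and 1
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


edges = preferential_attachment(1000)
degree = Counter(node for edge in edges for node in edge)
# The highest-degree nodes are typically early arrivals ("rich get richer").
print(degree.most_common(3))
```

Nobody designs the resulting degree distribution directly; it emerges from many individual linking decisions, which is the sense in which connectivity is treated here as a socially caused property.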

Page 10: Social computation of emergent networks on user generated content

Social Computational Systems: What type of questions are we asking?

e.g. X = Connectivity of the web graph

• Description and Classification:
  – What is X like?
  – What are its properties?
  – How can it be categorized?
  – How can we measure it?
• Descriptive Process:
  – How does X work?
  – What is the process by which X happens?
  – How does X evolve?
• Descriptive Comparative:
  – How does X differ from Y?
• Relationship:
  – Are X and Y related?
  – Do occurrences of X correlate with occurrences of Y?
• Causality:
  – Does X cause Y?
  – Does X prevent Y?
  – What causes X?
  – What effect does X have on Y?
• Causality - Comparative:
  – Does X cause more Y than does Z?
  – Is X better at preventing Y than is Z?
  – Does X cause more Y than does Z under one condition but not others?
• Design:
  – What is an effective way to achieve X?
  – How can we improve X?

cf. [Easterbrook et al. 2007]

Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007

Page 11: Social computation of emergent networks on user generated content

Attempting a Definition: Social-Computational Systems

… refer to systems in which essential system properties and functions (“X”) are influenced by the behavior of users.

Thus, certain system properties and functions are not engineered by a single person, but they are emergent, i.e. the result of aggregating information from a large group of users.

In this sense, certain system properties and functions of social-computational systems are beyond the direct control of system designers.

New approaches for designing and shaping social-computational systems are needed.

Page 12: Social computation of emergent networks on user generated content

The Dual Nature of Web-Science

[Diagram: Science ("What is X like?") and Engineering ("Improve X? Prevent Y?"); emergent social-computational system properties and functions arise through aggregation and are typically beyond control.]

social computation = social behavior + algorithmic computation

Page 13: Social computation of emergent networks on user generated content

Social Computational Systems: What type of questions are we asking?

Today's talk:
X1 = Navigability
X2 = Semantics of User-Generated Content

• Description and Classification:
  – What is X like?
  – What are its properties?
  – How can it be categorized?
  – How can we measure it?
• Descriptive Process:
  – How does X work?
  – What is the process by which X happens?
  – How does X evolve?
• Descriptive Comparative:
  – How does X differ from Y?
• Relationship:
  – Are X and Y related?
  – Do occurrences of X correlate with occurrences of Y?
• Causality:
  – Does X cause Y?
  – Does X prevent Y?
  – What causes X?
  – What effect does X have on Y?
• Causality - Comparative:
  – Does X cause more Y than does Z?
  – Is X better at preventing Y than is Z?
  – Does X cause more Y than does Z under one condition but not others?
• Design:
  – What is an effective way to achieve X?
  – How can we improve X?

cf. [Easterbrook et al. 2007]

Steve Easterbrook, Janice Singer, Margaret-Anne Storey, Daniela Damian, "Selecting Empirical Methods for Software Engineering Research", Guide to Advanced Empirical Software Engineering, 2007

Page 14: Social computation of emergent networks on user generated content

Agenda

1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future

Page 15: Social computation of emergent networks on user generated content

X1 = Navigability

Question: How can we Measure and Improve Navigability in Social Tagging Systems?

Tag clouds as an instrument for navigation

Page 16: Social computation of emergent networks on user generated content

Tag Clouds are Supposed to be Efficient Tools for Navigating Tagging Systems

The Navigability Assumption:
• An implicit assumption among designers of social tagging systems that tag clouds are specifically useful to support navigation.
• This has hardly been tested or critically reflected upon in the past.

Navigating tagging systems via tag clouds:
1) The system presents a tag cloud to the user.
2) The user selects a tag from the tag cloud.
3) The system presents a list of resources tagged with the selected tag.
4) The user selects a resource from the list of resources.
5) The system transfers the user to the selected resource, and the process potentially starts anew.

Page 17: Social computation of emergent networks on user generated content

Navigability of Social Tagging Systems

Question: How do
(i) the size of tag clouds and
(ii) the number of resources per tag
influence the navigability (X1) of social tagging systems?

[Figure: example tag clouds from established systems with many users and from a new system with few users]

Page 18: Social computation of emergent networks on user generated content

Defining Navigability

A network is navigable iff:
There is a path between all or almost all pairs of nodes in the network.

Formally:
1. There exists a giant component
2. The effective diameter is low (bounded by log n)

J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
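The two conditions can be checked mechanically on a small graph. The sketch below is one possible reading, not the procedure used in the cited work: it assumes that a "giant component" covers at least 90% of all nodes, that the effective diameter is the distance within which 90% of connected pairs lie, and that the logarithm is base 2. All three are assumptions, and the function names are illustrative.

```python
import math
from collections import deque


def undirected(edges, nodes):
    """Build an adjacency dict {node: set_of_neighbours} from an edge list."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_navigable(adj, giant_share=0.9, quantile=0.9):
    """Check both conditions; the thresholds are assumptions of this sketch."""
    n = len(adj)
    # Condition 1: the largest connected component covers most nodes.
    seen, largest = set(), set()
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            if len(component) > len(largest):
                largest = component
    if len(largest) < giant_share * n:
        return False
    # Condition 2: effective diameter of that component is at most log2(n).
    lengths = sorted(
        d for s in largest for t, d in bfs_distances(adj, s).items() if t != s
    )
    if not lengths:
        return True
    effective_diameter = lengths[math.ceil(quantile * len(lengths)) - 1]
    return effective_diameter <= math.log2(n)


nodes = range(9)
star = undirected([(0, i) for i in range(1, 9)], nodes)
chain = undirected([(i, i + 1) for i in range(8)], nodes)
print(is_navigable(star))   # True: every pair is at most 2 hops apart
print(is_navigable(chain))  # False: connected, but paths are too long
```

The all-pairs breadth-first search is quadratic in the number of nodes, so a sketch like this is only suited to small graphs; larger networks would need sampling.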

Page 19: Social computation of emergent networks on user generated content

Navigability: Examples

Example 1:
Not navigable: No giant component

Example 2:
Not navigable: giant component, BUT avg. shortest path > log2(9)

Page 20: Social computation of emergent networks on user generated content

Navigability: Examples

Example 3:
Navigable: Giant component AND avg. shortest path ≤ 2 < log2(9)

Is this efficiently navigable? There are short paths between all nodes, but can an agent or algorithm find them with local knowledge only?

Page 21: Social computation of emergent networks on user generated content

Efficiently navigable

A network is efficiently navigable iff:
There is an algorithm that can find a short path with only local knowledge (with branching factor k), and the delivery time of the algorithm is bounded polynomially by log_k(n).

Example 4:
[Figure: example network with nodes A, B, C]
Efficiently navigable, if the algorithm knows it needs to go through A, B, C

J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000. Also appears as Cornell Computer Science Technical Report 99-1776 (October 1999)
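Searching "with only local knowledge" can be sketched as greedy routing: at every step the current node forwards to whichever of its own neighbours looks closest to the target. The example below is an illustration under assumed conditions, not the construction analysed in the cited paper. Nodes sit on a ring, the ring distance stands in for the background knowledge an agent has, and the single shortcut link is hypothetical.

```python
def ring_distance(a, b, n):
    """Number of steps between positions a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)


def greedy_route(neighbours, source, target, n):
    """Forward to the neighbour closest to the target, looking only at the
    current node's own links. Returns the path, or None if no neighbour is
    closer to the target than the current node."""
    path = [source]
    current = source
    while current != target:
        nxt = min(neighbours[current], key=lambda v: ring_distance(v, target, n))
        if ring_distance(nxt, target, n) >= ring_distance(current, target, n):
            return None  # stuck: greedy search cannot make progress
        current = nxt
        path.append(current)
    return path


n = 16
# Every node knows its two ring neighbours.
neighbours = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
print(greedy_route(neighbours, 0, 9, n))  # walks around the ring: 7 hops

# One long-range shortcut between nodes 0 and 8 (hypothetical).
neighbours[0].add(8)
neighbours[8].add(0)
print(greedy_route(neighbours, 0, 9, n))  # [0, 8, 9]
```

The shortcut only helps because the distance estimate lets the algorithm recognise it as useful. A short path that the agent has no way of identifying from local information does not make the network efficiently navigable.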

Page 22: Social computation of emergent networks on user generated content

User Interface constraints

Tag Cloud Size n
n: number of tags shown per tag cloud
(topN most common algorithm)

Pagination of resources / tag
k: number of resources shown per page
(reverse chronological ordering)
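Both constraints are simple to state in code. The sketch below assumes a hypothetical data model in which each post is a dict with a resource identifier, a timestamp, and a set of tags; the field names and function names are illustrative and not taken from any particular tagging system.

```python
from collections import Counter


def tag_cloud(posts, n):
    """The n most frequently used tags (a 'topN most common' tag cloud)."""
    counts = Counter(tag for post in posts for tag in post["tags"])
    return [tag for tag, _ in counts.most_common(n)]


def resource_page(posts, tag, k, page=0):
    """Resources carrying `tag`, newest first, k per page."""
    tagged = sorted(
        (p for p in posts if tag in p["tags"]),
        key=lambda p: p["time"],
        reverse=True,
    )
    return [p["resource"] for p in tagged[page * k:(page + 1) * k]]


posts = [
    {"resource": "r1", "time": 1, "tags": {"web", "python"}},
    {"resource": "r2", "time": 2, "tags": {"web"}},
    {"resource": "r3", "time": 3, "tags": {"web", "science"}},
    {"resource": "r4", "time": 4, "tags": {"python"}},
]
print(tag_cloud(posts, 2))                    # ['web', 'python']
print(resource_page(posts, "web", 2))         # ['r3', 'r2']
print(resource_page(posts, "web", 2, page=1)) # ['r1']
```

With k = 2, the oldest "web" resource r1 only appears on the second page, so a user who looks at first pages only can reach it solely through its other tag.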

Page 23: Social computation of emergent networks on user generated content

How UI constraints affect Navigability

Tag Cloud Size:
Limiting the tag cloud size n to practically feasible sizes (e.g. 5, 10, or more) does not influence navigability (this is not very surprising).

Pagination:
BUT: Limiting the out-degree of high frequency tags k (e.g. through pagination with resources sorted in reverse-chronological order) leaves the network vulnerable to fragmentation. This destroys navigability of prevalent approaches to tag clouds.
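A toy calculation shows why capping the out-degree of tags fragments the network. This sketch uses a deliberately simplified measure that is an assumption of the example, not the analysis behind the slide: it counts the share of resources that appear on the first page (size k, newest first) of at least one of their tags. A resource on no first page cannot be reached by clicking through tag clouds. The data is synthetic.

```python
from collections import defaultdict


def linked_share(posts, k):
    """Share of resources shown on the first page (size k) of at least one
    of their tags when pages are sorted newest first."""
    by_tag = defaultdict(list)
    for resource, time, tags in posts:
        for tag in tags:
            by_tag[tag].append((time, resource))
    linked = set()
    for entries in by_tag.values():
        newest_first = sorted(entries, reverse=True)
        linked.update(resource for _, resource in newest_first[:k])
    resources = {resource for resource, _, _ in posts}
    return len(linked) / len(resources)


# Synthetic data: 20 resources all tagged "web"; every fifth is also "rare".
posts = [
    (f"r{i}", i, {"web", "rare"} if i % 5 == 0 else {"web"})
    for i in range(20)
]
for k in (2, 5, 20):
    print(k, linked_share(posts, k))
# 2 0.2
# 5 0.4
# 20 1.0
```

The high-frequency tag "web" is the one that suffers: with k = 5 it links to a quarter of its resources, and older items stay reachable only if a less popular tag happens to list them.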

Page 24: Social computation of emergent networks on user generated content

Findings

1. For certain specific, but popular, tag cloud scenarios, the so-called Navigability Assumption does not hold.
2. While we could confirm that tag-resource networks have efficient navigational properties in theory, we found that popular user interface decisions significantly impair navigability.

These results make a theoretical and an empirical argument against existing approaches to tag cloud construction.

How can we improve the navigability of social tagging systems?

Page 25: Social computation of emergent networks on user generated content

Recovering Navigability in Social Tagging Systems

Instead of reverse-chronological ordering of resources, we apply a random ordering.
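The difference between the two orderings can be sketched for a single tag. The function below is a hypothetical helper, not code from the talk: without a random generator it returns the k newest resources, and with one it returns k resources drawn uniformly at random. The data and the number of simulated page views are arbitrary.

```python
import random


def first_page(resources, k, rng=None):
    """Return the k resources shown for a tag.

    resources: list of (timestamp, resource_id) pairs.
    rng=None gives reverse-chronological order; passing a random.Random
    instance gives a uniformly random selection instead."""
    if rng is None:
        ordered = sorted(resources, reverse=True)
        return [r for _, r in ordered[:k]]
    return [r for _, r in rng.sample(resources, min(k, len(resources)))]


resources = [(t, f"r{t}") for t in range(100)]
print(first_page(resources, 3))  # always ['r99', 'r98', 'r97']

rng = random.Random(42)
shown = set()
for _ in range(200):  # 200 simulated page views of the same tag
    shown.update(first_page(resources, 3, rng))
print(len(shown))     # distinct resources that were shown at least once
```

Under reverse-chronological ordering the same three resources are linked on every view, while random ordering gives every resource a chance of being linked, so older resources are no longer cut off.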

Page 26: Social computation of emergent networks on user generated content

Efficient Navigability in Social Tagging Systems

Instead of random ordering, we use hierarchical background knowledge for ranking paginated resources [Kleinberg 2001].

J. M. Kleinberg, “Small-world phenomena and the dynamics of information,” in Advances in Neural Information Processing Systems (NIPS), 14. MIT Press, 2001, p. 2001.

Page 27: Social computation of emergent networks on user generated content

Social Computational Systems: Implications

• Navigability in social tagging systems is an emergent system property
• Some of our initial intuitions about navigability (tag clouds) are wrong
• The UI represents an opportunity to influence emergent system properties

Page 28: Social computation of emergent networks on user generated content

Agenda

1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future

Page 29: Social computation of emergent networks on user generated content

X2 = Semantics

Question: How can we Measure and Influence Emergent Semantics in Social Tagging Systems?

Page 30: Social computation of emergent networks on user generated content

Emergent Semantic Structures

[Figure] Lerman et al 2010

Page 31: Social computation of emergent networks on user generated content

Pragmatics influence emergent properties

Motivations for Tagging
• Future Retrieval
• Contribution and Sharing
• Attracting Attention (Flickr)
• Play and Competition (ESP Game)
• Self Presentation
• Opinion Expression
• Task Organization (“toread”)
• Social Signalling (“for:scott”)
• Money (Amazon Mechanical Turk)
• Categorization / Description

Kinds of Tags
• Content-based
• Context-based
• Attribute Tags
• Ownership Tags
• Subjective Tags
• Organizational Tags
• Purpose Tags
• Factual Tags
• Personal Tags
• Self-referential Tags
• Tag Bundles

This suggests that … emergent semantics are influenced by the underlying motivation for tagging (cf. for example, [Heckner 2009])

Gupta et al. 2010

Page 32: Social computation of emergent networks on user generated content

Why Do Users Tag?

One (of many) answers: To categorize or to describe resources

                        Categorizer (C)    Describer (D)
Goal                    later browsing     later search
Change of vocabulary    costly             cheap
Size of vocabulary      limited            open
Tags                    subjective         objective

[Figure: example tag clouds]

Semantic Assumption: Categorizers produce more precise emergent semantics than Describers.

M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.

Page 33: Social computation of emergent networks on user generated content

Measures for Tagging Pragmatics vs. Tag Semantics

Categorizer/Describer:
• Size of tag vocabulary
• Tags per resource
• Tags per post
• Orphaned tags

Semantics: [Cattuto et al 2008]
• Co-occurrence count
• Cosine similarity (TagCont)
• FolkRank [Hotho et al 2006]

C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
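A few of these measures are simple enough to sketch. The code below covers two pragmatic measures (vocabulary size and tags per post) and one semantic measure (cosine similarity over tag co-occurrence vectors). It assumes posts are given as (resource, set_of_tags) pairs and uses plain per-post co-occurrence counts; the exact definitions in the cited papers may differ, and all function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def vocabulary_size(posts):
    """Number of distinct tags a user has applied."""
    return len({tag for _, tags in posts for tag in tags})


def tags_per_post(posts):
    """Average number of tags per post."""
    return sum(len(tags) for _, tags in posts) / len(posts)


def cooccurrence_vectors(posts):
    """For each tag, count how often every other tag shares a post with it."""
    vectors = defaultdict(Counter)
    for _, tags in posts:
        for a in tags:
            for b in tags:
                if a != b:
                    vectors[a][b] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


posts = [
    ("r1", {"python", "code", "web"}),
    ("r2", {"java", "code", "web"}),
    ("r3", {"travel", "photo"}),
]
vectors = cooccurrence_vectors(posts)
print(vocabulary_size(posts))                              # 6
print(round(cosine(vectors["python"], vectors["java"]), 3))    # 1.0
print(round(cosine(vectors["python"], vectors["travel"]), 3))  # 0.0
```

In this toy data "python" and "java" never appear in the same post, yet they score as similar because they appear in the same tag contexts ("code", "web"). That is what a context-based similarity captures and a raw co-occurrence count does not.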

Page 34: Social computation of emergent networks on user generated content

Experimental Setup

As a dataset, we used
• a crawl from Delicious (University of Kassel)
• from November 2006 (containing 667,128 users)
• 10,000 most common tags, minimum of 100 resources / user

For semantic grounding, we used
• WordNet as a knowledge base (cf. [Cattuto et al. 2008])
• Jiang-Conrath as a measure of similarity
  – combines the taxonomic path length between two nodes in WordNet with an information-theoretic similarity measure [Jiang and Conrath 1997]
• A WordNet library as an implementation
  – by [Pedersen et al 2004]

C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
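For reference, the Jiang-Conrath measure is commonly written as a distance between two concepts in terms of their information content. The notation below is the usual textbook form rather than anything shown on the slide: p(c) is the probability of encountering concept c in a reference corpus, and lcs(c1, c2) is the lowest common subsumer of the two concepts in the WordNet taxonomy.

```latex
\mathrm{IC}(c) = -\log p(c), \qquad
d_{\mathrm{JC}}(c_1, c_2) = \mathrm{IC}(c_1) + \mathrm{IC}(c_2)
  - 2\,\mathrm{IC}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)
```

A smaller distance means the two concepts are semantically closer. Implementations often report the inverse of this distance as a similarity score, though whether that convention applies to the setup described here is an assumption.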

Page 35: Social computation of emergent networks on user generated content

Results

Describers outperform categorizers on precision of emergent tag semantics

• Categorizers perform worse than random
• Describers perform better than random

[Figure: precision of emergent tag semantics for Describers and Categorizers compared with random users, on a scale from worse to better]

C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.

Page 36: Social computation of emergent networks on user generated content

Social Computational Systems: Implications

• Semantics in social tagging systems is an emergent system property
• Some of our initial intuitions about semantics are wrong
  – describers outperform categorizers on a particular task
• User behavior influences emergent system properties

Page 37: Social computation of emergent networks on user generated content

Agenda

1. Social-Computational Systems
2. Navigability of Social-Computational Systems
3. Semantics in Social-Computational Systems
4. Social-Computational Systems & the Future

Page 38: Social computation of emergent networks on user generated content

Social-Computational Systems: Conclusions

1. Certain properties of social computational systems (such as navigability or semantics) are emergent properties; they are beyond the direct influence of system designers.
2. The user interface is an opportunity to influence these emergent properties.
3. If user motivation or behavior changes over time, system properties may change.

It is through the process of social computation, i.e. the combination of social behavior and algorithmic computation, that system properties and functions emerge.

Page 39: Social computation of emergent networks on user generated content

Web-Science: A Call to Action

As web scientists, we need to
• study and map the complex relationships between user behavior, user interfaces and emergent properties
• understand the potentials and limits of influencing emergent system properties

As web engineers, we need to
• shift perspective away from designing towards shaping social-computational systems
• reconcile user behaviors with desired system properties

Page 40: Social computation of emergent networks on user generated content

End of Presentation

Thank you!

Markus Strohmaier
Graz University of Technology, Austria

in collaboration with: H.P. Grahsl, D. Helic, C. Körner, R. Kern, C. Trattner, D. Benz, A. Hotho, G. Stumme

Page 41: Social computation of emergent networks on user generated content

Related Publications

• Intent and motivation in social media
M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users' Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010.

• Social computation and emergent structures
C. Körner, D. Benz, A. Hotho, M. Strohmaier, G. Stumme, Stop Thinking, Start Tagging: Tag Semantics Emerge From Collaborative Verbosity, 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.
D. Helic, C. Trattner, M. Strohmaier and K. Andrews, On the Navigability of Social Tagging Systems, The 2nd IEEE International Conference on Social Computing (SocialCom 2010), Minneapolis, Minnesota, USA, 2010.

• Knowledge acquisition from social media
C. Wagner, M. Strohmaier, The Wisdom in Tweetonomies: Acquiring Latent Conceptual Structures from Social Awareness Streams, Semantic Search 2010 Workshop (SemSearch2010), in conjunction with the 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010.