
Preventing Filter Bubbles and Underprovision in Online Communities with Social Curation Algorithms: Data-driven approaches to measuring “bias”

Jahna Otterbacher, Open University Cyprus, Nicosia, Cyprus
Libby Hemphill, Illinois Institute of Technology, Chicago, USA


DESCRIPTION

In ongoing work, we are experimenting with data-driven approaches to support communication and socialization processes in online communities. Specifically, our work addresses the problems that often arise from the use of automated methods for curating participants' contributions. Simple binary constructs such as “helpful,” “like,” or “thumbs up/down” have become dominant mechanisms for organizing user-contributed content. Readers' feedback is aggregated and used to create information displays that rank content by attributes such as what is “most helpful” or “most liked.” The result is a curated collection of participants’ contributions. Given the tendency of people to access items in the order of presentation and to satisfice rather than fully satisfy their information needs, such curation algorithms largely determine what information participants are exposed to (i.e., they create a filter bubble).

We are undertaking a systematic investigation of communities that employ such mechanisms in order to better understand how curation algorithms affect users. Existing research suggests that social voting mechanisms have unintended consequences: there is little turnover in what is “most helpful,” even when new, high-quality content is added, and some kinds of content are consistently hidden because they receive few votes. We are studying the effects of those consequences: How do user perceptions and behavior change based on the information shown (or hidden)? How does the information shown (or hidden) influence what information users contribute? Do curation algorithms display homogeneous information that alienates underrepresented users?

The many possible combinations of features of users, content, and information displays present a complex problem for automated curation. The research problem is fundamentally socio-technical, and our project employs a multi-method approach focused on four characteristics of the algorithms and their users: (1) contributor characteristics (e.g., gender, reputation), (2) content characteristics (e.g., writing style, keywords), (3) the perceived value of curated content (e.g., “helpful” votes received), and (4) the presentation algorithm(s) (e.g., reverse helpfulness rank). We are conducting automated analyses of the content and its presentation and are planning a survey of users. Our study includes multiple communities in three domains (health, entertainment, and news). We hope to identify the conditions under which contributions and/or contributors with certain properties are systematically ranked lower (or higher) than others, and how the information displayed affects user behavior.
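To make the curation mechanism concrete, below is a minimal Python sketch of a “most helpful first” display built from aggregated binary votes. It is an illustration under simplifying assumptions, not the algorithm of any community in our study; the `Contribution` class and the scoring rule are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Contribution:
    """A user-contributed item with aggregated binary feedback."""
    text: str
    helpful_votes: int = 0
    total_votes: int = 0

    @property
    def helpfulness(self) -> float:
        # Share of votes marking the item "helpful"; unvoted items score -1.0
        # so they sink below everything that has received any feedback.
        if self.total_votes == 0:
            return -1.0
        return self.helpful_votes / self.total_votes

def most_helpful_first(items, k=10):
    """The curated display: the k items with the highest helpfulness share."""
    return sorted(items, key=lambda c: c.helpfulness, reverse=True)[:k]
```

Note the feedback loop such a rule creates: readers mostly see, and therefore vote on, the items already at the top, so early leaders keep accumulating votes while unvoted items stay invisible. This is one plausible mechanism behind the low-turnover and hidden-content effects described above.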

TRANSCRIPT

Page 1

Preventing Filter Bubbles and Underprovision in Online Communities with Social Curation Algorithms:

Data-driven approaches to measuring “bias”

Jahna Otterbacher, Open University Cyprus, Nicosia, Cyprus

Libby Hemphill, Illinois Institute of Technology, Chicago, USA

Aarhus University, 3 October 2013

Page 2

Social Curation Algorithms in Online Communities

• Low barriers to entry
• Users contribute to a collection of shared content
• Users judge the value of content via binary voting
• Aggregated votes used in information display(s)


Page 3

[Image slide]

Page 4

[Image slide]

Page 5

Bias

• Content with particular properties systematically ranked higher/lower than others (see the measurement sketch below)
• Information display gives users a particular take on “what others think”
• Prominently displayed content is what users see and read
• Users often do not change default settings
• They place trust in information displays
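One simple way to operationalize this notion of bias is to compare the average display rank of content that has a given property against content that does not. The sketch below is illustrative; `has_property` stands in for any hypothetical property test (e.g., contributor gender disclosed, narrative writing style).

```python
def mean_rank_gap(ranked_items, has_property):
    """Average display rank (1 = top) of items WITH a property minus the
    average rank of items WITHOUT it. A positive gap means the property
    is, on average, pushed further down the display."""
    with_prop, without_prop = [], []
    for rank, item in enumerate(ranked_items, start=1):
        (with_prop if has_property(item) else without_prop).append(rank)
    if not with_prop or not without_prop:
        return None  # property absent or universal; nothing to compare
    return (sum(with_prop) / len(with_prop)
            - sum(without_prop) / len(without_prop))
```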

Page 6

Gender bias at IMDb

Page 7

Editing bias at Amazon, IMDb and Yelp

Page 8

Underprovision problem

• When social curation is used, “too many people rely on others to contribute without doing so themselves.” [Gilbert, 2013]
• Study of Reddit:
  • Most communities suffer from some degree of free riding
  • At Reddit, users’ contributions being buried led to disincentives for contributions
  • “…it’s such an incredible resource when the comments are flowing, but if your post gets buried for whatever reason, it’s painfully anti-climactic.”

Page 9

Our perspective

• Bias is inevitable and is not necessarily bad
• Presence of bias could be revealed to users
• Research questions:
  • What types of biases may occur?
  • Under what circumstances?
  • How can we study bias across systems?

Page 10

Proposed framework

• Find diverse examples of systems
• Taxonomy of biases
• Participation rates and participant roles
• Examine correlations between system/participant characteristics and observed biases
• Generate ideas of how to respond

Page 11

[Image slide]

Page 12

Bias taxonomy (see the schema sketch below)

• Contributor characteristics
  • Demographics
  • Level, type of activities
  • Information disclosure
• Contribution characteristics
  • Writing style (e.g., narrative/reporting)
  • Content (e.g., uniqueness/conformity)
  • Metadata (e.g., time posted)
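One way such a taxonomy could be made operational across systems is as a shared feature schema, so the same analysis code runs on every community. The field choices below are illustrative assumptions, not a finalized coding scheme.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ContributorFeatures:
    gender: Optional[str]      # demographics (None if not disclosed)
    activity_level: int        # level/type of activities, e.g., prior posts
    discloses_profile: bool    # information disclosure

@dataclass
class ContributionFeatures:
    writing_style: str         # e.g., "narrative" vs. "reporting"
    uniqueness: float          # uniqueness/conformity of content, in [0, 1]
    posted_at: datetime        # metadata: time posted
```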

Page 13

Participation rates & roles

Page 14

Correlations

• How are system and participant characteristics correlated with the biases that we observe? (see the sketch below)
• Are more information displays necessarily better?
• Which default display leads to more/less diversity with respect to a given characteristic of content?
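These questions could be instrumented roughly as follows: a rank correlation between an item feature and its display position, plus a crude top-k diversity score for comparing default displays. The sketch uses SciPy's `spearmanr`; the `feature` and `label` extractor functions are hypothetical.

```python
from scipy.stats import spearmanr

def rank_feature_correlation(ranked_items, feature):
    """Spearman correlation between display rank (1 = top) and a numeric
    item feature. A strong, significant correlation suggests the display
    systematically favors or buries that feature."""
    ranks = list(range(1, len(ranked_items) + 1))
    values = [feature(item) for item in ranked_items]
    return spearmanr(ranks, values)  # (correlation, p-value)

def top_k_diversity(ranked_items, label, k=10):
    """Fraction of distinct label values among the top k items: a simple
    proxy for how homogeneous a default display is."""
    top = ranked_items[:k]
    return len({label(item) for item in top}) / len(top) if top else 0.0
```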

Page 15

Final thoughts

• Can we exploit bias in order to:
  • Entice users to participate in all activities?
  • Convince users to question default information displays?

Page 16

Thank you! [email protected]
