
A Collaborative Faceted Categorization System – User Interactions

Kurt Maly; Harris Wu; Mohammad Zubair; Contact: maly@cs.odu.edu

ELPUB 2010, Helsinki, Finland


Outline

• Introduction
  – What is the problem we are addressing?
  – What is the approach we are taking?

• Facet schema evolution – user interactions
  – Schema enrichment
  – Anomaly detection
  – Visual schema presentation rearrangement
  – User feedback on classifications

• Conclusions
  – Future improvements


Introduction - Problem

• Problem
  – Navigating a large, growing collection of digital objects, particularly non-textual collections such as images and photographs

• Possible Approaches
  – Categorize and classify the collection manually using human experts
    • Centralized, expensive, single perspective
    • Static, rigid structure; does not evolve
  – Use social tagging systems such as flickr.com
    • Low precision and recall, lack of structure in tags, ambiguity and noise in tags


Introduction - Approach

• We built a system that improves access to a large, growing collection by helping users build a faceted classification collaboratively

– Challenge: continuously classify new objects, modify the facet schema, and reclassify existing objects into the modified facet schema

– Collaborative Approach: Enable users to collaboratively build a schema with facets and categories, and to classify documents into this schema

• Needs automated system support to create critical mass and make it easier for users to collaborate


The system front page


Facet and Category Enrichment

• Statistical co-occurrence model
  – Subsumption
    • parent-child relationship between x and y if all documents tagged with y are also tagged with x
    • for an existing tag word t
      – identifies all documents with tag t
      – if these documents have a common category c, the rule of subsumption implies that t is a possible subcategory of c

• Example

  Category              Suggested sub-category
  American Civil War    military life
  China                 boxer rebellion
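The subsumption rule above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual implementation: the function name `suggest_subcategories`, the dictionary-based inputs, the sample image IDs, and the `min_docs` support threshold are all assumptions introduced here.

```python
def suggest_subcategories(doc_tags, doc_categories, min_docs=2):
    """Suggest (category, tag) pairs where the tag may be a subcategory.

    doc_tags:       dict mapping document id -> set of tags (hypothetical input shape)
    doc_categories: dict mapping document id -> set of categories
    min_docs:       assumed minimum number of documents carrying a tag
                    before a suggestion is made (not specified in the slides)
    """
    # Collect, for every tag t, the documents tagged with t.
    docs_by_tag = {}
    for doc, tags in doc_tags.items():
        for tag in tags:
            docs_by_tag.setdefault(tag, set()).add(doc)

    suggestions = []
    for tag, docs in sorted(docs_by_tag.items()):
        if len(docs) < min_docs:
            continue
        # Categories shared by ALL documents that carry this tag.
        common = set.intersection(*(doc_categories.get(d, set()) for d in docs))
        for category in sorted(common):
            if category != tag:
                suggestions.append((category, tag))
    return suggestions


# Hypothetical sample data modelled on the example table above.
doc_tags = {
    "img1": {"military life"},
    "img2": {"military life", "camp"},
    "img3": {"boxer rebellion"},
    "img4": {"boxer rebellion"},
}
doc_categories = {
    "img1": {"American Civil War"},
    "img2": {"American Civil War", "Portraits"},
    "img3": {"China"},
    "img4": {"China"},
}

for category, tag in suggest_subcategories(doc_tags, doc_categories):
    print(f"{category}: suggested sub-category '{tag}'")
```

A support threshold such as `min_docs` is one plausible guard against suggestions that rest on a single document; whether the system applies one is not stated in the slides.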


Schema Cleansing

• Problem:
  – categories are created under the wrong facet
  – child categories might represent a broader concept than the parent category

• Solution:
  – Use WordNet’s hierarchical relationships among words to detect anomalies


Schema Cleansing

Hierarchy in WordNet (hyponymy: known as “is a” relationship)

dog, domestic dog, Canis familiaris
  => canine, canid
    => carnivore
      => placental, placental mammal, eutherian, eutherian mammal
        => mammal
          => vertebrate, craniate
            => chordate
              => animal, animate being, beast, brute, creature, fauna
                => ...

Anomaly detection algorithm

Category     Parent category    Grandparent category    Problem
President    Holiday            Politics                more closely related to grandparent than to parent
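One way to picture the anomaly check is with a small hand-written "is a" map standing in for WordNet. The sketch below reuses the dog chain shown above plus a hypothetical cat/feline branch, and assumes that "more closely related" means a shorter path through the hierarchy. The slides do not say which relatedness measure the system uses, and the names `HYPERNYM`, `path_distance`, `is_anomalous`, and `child_is_broader` are illustrative only.

```python
# Toy "is a" map: word -> its direct broader term.
# The dog chain follows the slide; the cat/feline branch is added for illustration.
HYPERNYM = {
    "dog": "canine",
    "cat": "feline",
    "canine": "carnivore",
    "feline": "carnivore",
    "carnivore": "placental",
    "placental": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
}


def hypernym_chain(word):
    """Return [word, broader term, broader term of that, ...]."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_distance(a, b):
    """Number of 'is a' edges between a and b via their nearest shared
    ancestor, or None if the two words share no ancestor in the toy map."""
    depth_in_b = {w: i for i, w in enumerate(hypernym_chain(b))}
    for i, w in enumerate(hypernym_chain(a)):
        if w in depth_in_b:
            return i + depth_in_b[w]
    return None


def is_anomalous(category, parent, grandparent):
    """Flag a category that sits closer to its grandparent than to its parent."""
    d_parent = path_distance(category, parent)
    d_grand = path_distance(category, grandparent)
    if d_grand is None:
        return False   # nothing to compare against
    if d_parent is None:
        return True    # related to the grandparent but not to the parent
    return d_grand < d_parent


def child_is_broader(child, parent):
    """Flag a child category that is a broader concept than its parent."""
    return child in hypernym_chain(parent)[1:]


# "canine" filed under "feline", which is filed under "carnivore":
print(is_anomalous("canine", "feline", "carnivore"))
# "dog" filed under "canine", which is filed under "carnivore":
print(is_anomalous("dog", "canine", "carnivore"))
# "mammal" filed as a child of "dog":
print(child_is_broader("mammal", "dog"))
```

In a real deployment the toy map would be replaced by lookups against WordNet, and a different relatedness measure could be substituted without changing the comparison logic.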


Ordering of Schema Display

• Problem:
  – significant number of categories are created under a given facet (or another category)
  – large number of facets are created

• Solution:
  – limit number of child categories/facets displayed
  – configure administratively
  – order the display by a popularity measure


Ordering of Schema Display

• Popularity (P) measure
  – favours the biggest, most used, and fastest growing facets and categories

P = 0.5*f(PN*PC) + 0.5*PR
• f – normalizing factor
• PN – total number of items in a category
• PR – growth rate of a category: number of new (recent) items for a unit of time
• PC – number of clicks on the category link in the browsing menu over a period of time
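The popularity measure and the display limit can be combined in a short sketch. It reads f as a multiplicative normalizing factor applied to PN*PC, which is one interpretation of the formula above; the class `CategoryStats`, the functions `popularity` and `split_for_display`, the sample numbers, and the value of f are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CategoryStats:
    name: str
    items: int      # PN: total number of items in the category
    clicks: int     # PC: clicks on the category link over a period of time
    growth: float   # PR: new (recent) items per unit of time


def popularity(stats, f):
    """P = 0.5*f*(PN*PC) + 0.5*PR, with f assumed to be a scaling factor."""
    return 0.5 * f * (stats.items * stats.clicks) + 0.5 * stats.growth


def split_for_display(all_stats, f, limit):
    """Order categories by popularity and split them into the ones shown
    directly and the ones that would sit behind a "more..." link."""
    ranked = sorted(all_stats, key=lambda s: popularity(s, f), reverse=True)
    return ranked[:limit], ranked[limit:]


# Hypothetical statistics; limit stands in for the administratively set value.
categories = [
    CategoryStats("China", items=40, clicks=10, growth=1.0),
    CategoryStats("Civil War", items=200, clicks=50, growth=4.0),
    CategoryStats("Portraits", items=100, clicks=30, growth=6.0),
]
shown, hidden = split_for_display(categories, f=0.001, limit=2)
print("shown: ", [c.name for c in shown])
print("hidden:", [c.name for c in hidden])
```

With these sample values, "Civil War" and "Portraits" rank highest and "China" moves behind the "more..." link. How f is actually chosen is not stated in the slides, so the constant here only keeps the two terms on a comparable scale.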


Expanding category display using the “more…” link


Limiting category display using the “more…” link


Quality Assessment through User Feedback

• “thumb-up” and “thumb-down” buttons are available for every association
  – users vote up or down on the association between an image and a category based on how relevant and accurate they think it is

• The value of this explicit feedback determines when a classification can be deleted or, conversely, when it becomes “hard”, i.e., it is confirmed
  – each vote updates the confidence value of an association, increasing or decreasing it by 0.05 depending on whether the user believes the classification is correct
  – confidence value reaches 1.00 -> association is hardened
  – confidence value falls below a threshold -> association is deleted
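The voting rule above amounts to a small state update, sketched below. The 0.05 step and the 1.00 hardening point come from the slide; the deletion threshold is not given there, so `delete_below` is a caller-supplied parameter, and the function name `apply_vote` and the status labels are assumptions. `Decimal` is used so that repeated 0.05 steps land exactly on 1.00.

```python
from decimal import Decimal

STEP = Decimal("0.05")        # change per vote, from the slide
HARDEN_AT = Decimal("1.00")   # confidence at which an association is confirmed


def apply_vote(confidence, thumbs_up, delete_below):
    """Apply one thumb-up or thumb-down vote to an association.

    Returns (new_confidence, status), where status is one of
    "hardened", "deleted", or "soft" (labels chosen for this sketch).
    delete_below is the deletion threshold; its value is an assumption
    because the slides do not state it.
    """
    if thumbs_up:
        new_confidence = min(confidence + STEP, HARDEN_AT)
    else:
        new_confidence = confidence - STEP

    if new_confidence >= HARDEN_AT:
        return new_confidence, "hardened"
    if new_confidence < delete_below:
        return new_confidence, "deleted"
    return new_confidence, "soft"


# Hypothetical threshold of 0.30 for illustration.
threshold = Decimal("0.30")
print(apply_vote(Decimal("0.95"), thumbs_up=True, delete_below=threshold))
print(apply_vote(Decimal("0.30"), thumbs_up=False, delete_below=threshold))
print(apply_vote(Decimal("0.50"), thumbs_up=True, delete_below=threshold))
```

Whether a hardened association can still be voted down is not covered in the slides, so the sketch leaves that decision to the caller.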


Feedback on category associations


Conclusions

• Schema enrichment, cleansing and ordering are effective tools to remedy problems introduced by collaborative schema evolution

• Future improvements include recording actual administrator actions for training purposes