user taglines: alternative presentations of expertise and interest in social media

34
ASE Social-Informatics 2012, Dec 16 Hemant Purohit 1 Alex Dow, Omar Alonso, Lei Duan, Kevin Haas 1 Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State , USA Microsoft Silicon Valley, USA “How do we know: Why are you influential?”

Upload: knoesis-center-wright-state-university

Post on 07-May-2015

1.208 views

Category:

Documents


1 download

DESCRIPTION

Web applications are increasingly showing recommended users from social media along with some descriptions, an attempt to show relevancy - why they are being shown. For example, Twitter search for a topical keyword shows expert twitterers on the side for 'whom to follow'. Google+ and Facebook also recommend users to follow or add to friend circle. Popular Internet newspaper- The Huffington Post shows Twitter influencers/ experts on the side of an article for authoritative relevant tweets. The state of the art shows user profile bios as summary for Twitter experts, but it has issues with length constraint imposed by user interface (UI) design, missing bio and sometimes funny profile bio. Alternatively, applications can use human generated user summary, but it will not scale. Therefore, we study the problem of automatic generation of informative expertise summary or taglines for Twitter experts in space constraint imposed by UI design. We propose three methods for expertise summary generation- Occupation-Pattern based, Link-Triangulation based and User-Classification based, with use of knowledge-enhanced computing approaches. We also propose methods for final summary selection for users with multiple candidates of generated summaries. We evaluate the proposed approaches by user-study using a number of experiments. Our results show promising quality of 92.8% good summaries with majority agreement in the best case and 70% with majority agreement in the worst case. Our approaches also outperform the state of the art up to 88%. This study has implications in the area of expert profiling, user presentation and application design for engaging user experience. Find paper here: http://knoesis.org/library/resource.php?id=1793

TRANSCRIPT

Page 1: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

ASE Social-Informatics 2012, Dec 16

Hemant Purohit1 Alex Dow, Omar Alonso, Lei Duan, Kevin Haas

1Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State , USA

Microsoft Silicon Valley, USA

“How do we know: Why are you influential?”

Page 2: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Outline Motivation

Problem Settings

Proposed Approach

Experiments & Results

Future Work

2 User Tagline, Social-Informatics 2012

Page 3: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Motivation

3 User Tagline, Social-Informatics 2012

?

HuffingtonPost

• Increasing use of recommended influencers and experts from social media.

• No relevance cues about who they are and why they are experts

• Need presentation of their expertise or interest areas

• Help user engagement with the content

Page 4: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Motivation

4

• I can see why …

User Tagline, Social-Informatics 2012

Senior Media Reporter for The Huffington Post

Better engagement with content!

User Data Mining

Analysis & Aggregation

Tagline

Page 5: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Problem Statement Scope:

Twitter users with English language profile descriptions and tweets

Specific problem:

Given a set of N experts E= {ei | i = 1, 2, . . . , N} in a social media community C of K (K > N) users, generate a short summary di within predefined length for each of the expert ei in the set E.

User Tagline, Social-Informatics 2012 5

Page 6: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Challenges How to extract information from user’s Twitter profile bio

that well presents the author’s expertise?

Constraint space of UI design when displaying it.

User Tagline, Social-Informatics 2012 6

Meteorologist at TWC in Atlanta . Hike, sail, snowboard, travel, take pictures and glue stuff.

Page 7: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Challenges How to generate tagline when user’s bio doesn’t

contain sufficient information?

Some profile bios are humorous.

Sometimes profile bio is simply missing

User Tagline, Social-Informatics 2012 7

Page 8: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Challenges In difference with web search document

summarization or snippet generation

Very short text segment vs. full page of web documents

People centric vs. document centric

More constraints on space due to UI design

Lack of or missing information

User Tagline, Social-Informatics 2012 8

Page 9: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Proposed Approach

Data Sampling

Candidate Taglines

Generation

Final Tagline Selection

9 User Tagline, Social-Informatics 2012

Page 10: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

What Data to Exploit?

Meformer

Self Descriptive

e.g., Profile Bio, Content of the user posts

Informer

Others describe the person

e.g., Wikipedia, IMDB, Freebase etc.

10 User Tagline, Social-Informatics 2012

Page 11: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Candidate Taglines Generation

11 User Tagline, Social-Informatics 2012

1. Occupation-Patterns

2. Link-Triangulation

3. User Classification

Page 12: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

1. Occupation-Pattern Approach

12

west coast editor for @thenextweb. geek. world changer. social good creator. community lover. #BlameDrewsCancer'er. #ihaveadream. yep, that drew.

Occupation well-represents user’s expertise areas.

Ufc Octagon Girl,Ultimate Insider Host. Model, Actress, Bud Light Spokesmodel and Fitness Nerd. I'm Not that cool, Lover not a fighter.

president and editor-in-chief of the huffington post media group. mother. sister. flat shoe advocate. sleep evangelist.

User Tagline, Social-Informatics 2012

Page 13: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Extracting Occupation Patterns Occupation titles collection

from Wikipedia and

US Dept. of Labor Statistics Reports

Given a user’s profile bio Spot the occupation titles in the N-grams of the bio

Create candidate tagline by joining N-grams in the order of their appearance in profile bio (Intuition: User’s

own described order of what he is expert for!)

Create final candidate set by those falling in the UI space constraint threshold.

User Tagline, Social-Informatics 2012 13

Page 14: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

What if there is no occupational pattern?

Can we exploit informer data?

Expert users usually have Wikipedia pages where people have written about them

Connect Twitter expert users with their Wikipedia pages.

Use Wikipedia informative summary as expert users tagline

14 User Tagline, Social-Informatics 2012

Page 15: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

User Wikipedia page’s Info-box

Twitter User

Personal Website: www.rachaelray.com

2. Link-Triangulation Approach

15 User Tagline, Social-Informatics 2012

User Identity

Resolution by cross

referenced links

Page 16: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Extracting User Description After resolving user identity corresponding to a

Knowledge-base (KB) page, exploit content of KB

Wikipedia Infobox ‘occupation’ property e.g., for celebrity TV host ‘Rachael Ray’:

Television Personality, businesswoman, author, celebrity chef

Apply restriction by UI space constraint

User Tagline, Social-Informatics 2012 16

Page 17: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Link-Triangulation Examples USER BIO WIKIPEDIA BASED TAGLINE

Tommy Chong

The legend Tommy Chong from the iconic comedy duo Cheech and Chong. Currently on the 'Get it legal' North

American tour. actor, comedian, musician

Drake Bell Thanks for following me, guys! actor, comedian, producer, musician,

singer-songwriter, director, voice artist

Steve Harvey

Proud husband, father, comedian, philanthropist and author. God has blessed me with more than I could have asked for. New Daytime TV Show premiering September

4 actor, comedian, radio and television

personality, author

bob saget

I like long walks on the beach...and then to be dragged through the sand by an off-road vehicle, and then hurled

off a catapult. actor, comedian, television host,

director, screenwriter

17 User Tagline, Social-Informatics 2012

Page 18: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

What happens if … No occupational-patterns can be extracted

No links or can’t resolve links to knowledge base

Can we classify users to categories based on their

Popularity

Activity

Content diffusion strength?

Solution: Use the classes as default taglines if nothing else

User Tagline, Social-Informatics 2012 18

Page 19: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

3. User Classes Approach

19

USER ACTIVITY

USER POPULARITY

DIFFUSION STRENGTH

Casual Celebrity

Information Hub

Passive Celebrity

Active Celebrity

Authority

Casual Twitterer Conversationalist

Disengaged Celebrity

User Tagline, Social-Informatics 2012

Page 20: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

User Classification Metrics Popularity, Activity and Diffusion Strength can be

modeled in different ways

Our formulation, for given dataset period: Popularity- #user’s mentions

Activity- #statuses

Diffusion Strength- #retweets of user tweets

50% percentile on each metrics (max normalized) as classifier to create various classes in the 3-D model

User Tagline, Social-Informatics 2012 20

Page 21: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

User Classes Examples TWITTER SCREEN-NAME

DISPLAY-NAME BIO PROFILE LINK USER-CLASS TAGLINE

Sullivan26 Steve

Sullivan EMPTY Unmatched with Wiki

Steve Sullivan is Casual Twitterer.

iamsrk

SHAH RUKH KHAN EMPTY Unmatched with Wiki

SRK is Active Celebrity on Twitter.

21 User Tagline, Social-Informatics 2012

Page 22: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Proposed Approach

Data Sampling

Candidate Taglines

Generation

Final Tagline Selection

22 User Tagline, Social-Informatics 2012

Page 23: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Principles for Quality Assessment of a Tagline Readability

‘host of Man Vs Wild & Born Survivor, family man and host of Man Vs Wild’ vs. ‘chief scout (the scout association), adventurer, explorer, author, motivational speaking, television presenter’

Specificity ‘Actor, Director’ vs. ‘Actor in Forest Gump’

Interestingness ‘CEO of Amazon, Co-founder of ABC Corp., Actor, Writer’

23 User Tagline, Social-Informatics 2012

Page 24: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Readability

Observations

Poor scores for fragmented sentences

Not suitable for measuring taglines

Skipped for the experimental stats

24

Flesch Readability Ease Scoring [0-100] - Higher the better

CANDIDATE TAGLINE-1 SCORE CANDIDATE TAGLINE-2 SCORE

singer from Norway 62.7 singer, actor, songwriter, composer, pianist 0

songwriter from Glasgow 34.5 singer, songwriter, musician 0

host of Man Vs Wild & Born Survivor, family man and host of Man Vs Wild 84.4

chief scout (the scout association), adventurer, explorer, author, motivational speaking, television presenter 0

User Tagline, Social-Informatics 2012

Page 25: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Specificity & Interestingness

• Difficult to formulate specificity and interestingness of a tagline because of subjectivity

• Modified Tf-idf approach: • query=user, relevant-docs-to-user= candidate taglines set

• Document set as all the candidate taglines of all the users

• Normalize score for a document (tagline) by its character length, w.r.t. maximum UI space constraint T (to boost the space utilization)

• Rank candidate taglines scores for a user and then select final tagline

User Tagline, Social-Informatics 2012 25

Page 26: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Specificity & Interestingness

26

Examples:

TAGLINE-1 SCORE-1 TAGLINE-2 SCORE-2 FINAL-TAGLINE

singer from Norway 0.791859 singer, actor, songwriter,

composer, pianist 1.221946 singer, actor, songwriter,

composer, pianist

songwriter from Glasgow 1.03031 singer, songwriter, musician 0.496351 songwriter from Glasgow

Ultimate Insider Host, Model, Bud Light Spokesmodel 3.179482 model, ufc octagon girl 1.429542

Ultimate Insider Host, Model, Bud Light

Spokesmodel

host of Man Vs Wild & Born Survivor 2.109237

chief scout (the scout association), adventurer,

explorer, author 4.834862

chief scout (the scout association), adventurer,

explorer, author Multi Grammy Award-winning

singer 1.423924 singer, actress, comedian,

author, producer 0.897228 Multi Grammy Award-

winning singer

User Tagline, Social-Informatics 2012

Page 27: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Evaluation Baseline: Full Twitter profile bio as a tagline

User study

Questions for user study of candidate tagline generation

How good is the generated summary of expertise presentation?

How good the original bio is by itself?

27 User Tagline, Social-Informatics 2012

Page 28: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Evaluation (contd.) Metrics for evaluation

Majority agreement on good candidate taglines: Percentage of total samples, where at least two third of judges give scores as either 1 (good) or 2 (very good) for the question Q1

Inter-rater agreement (Fleiss Kappa)

User study for final tagline selection

Q. Which one is the best summary to represent expertise description between two given candidate summaries?

Metric: % of agreeable summaries between human and automatic judgment

User Tagline, Social-Informatics 2012 28

Page 29: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Experimental Data Sets • Crawled Twitter datasets from different time periods: tweets and users

• Expert* Twitter Users set: Top 30% ranked users based on Klout score in the dataset collected

*Note: that any Expert Finding Algorithm or service can be used, our approach is independent on that.

User Tagline, Social-Informatics 2012 29

Table I. STATISTICS OF EXPERT USERS IN THREE DATA SETS

Page 30: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

User Study Results For candidate tagline generation:

*for Link-Triangulation approach we did not have a good baseline than the user profile bio

For final tagline selection:

Majority Agreement for final tagline choice between automatic and human judge selection: 65.15%

User Tagline, Social-Informatics 2012 30

Majority Agreement for good taglines

Baseline Phase-1 datasets

Phase-2 datasets

Occupation-Pattern approach 30% 74.5% 70.31%

Link-Triangulation approach *30% 91.6% 92.82%

Page 31: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Conclusion The contribution of this work

Proposed the practical approach to represent user expertise on social media as taglines given constraints of UI design.

Addressed important information extraction from user profile bio by applying occupational-pattern approach.

Addressed quality and coverage issue of tagline generation by link-triangulation approach leveraging knowledge base such as Wikipedia.

Explored user classification approach for generating taglines for users without profile bios.

Experiment results show significant improvement over baseline of the state-of-art.

User Tagline, Social-Informatics 2012 31

Page 32: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Future Work Exploit user’s web-page

Title and Keywords in the Header metadata

Semantic approach in creating expert summary

Focusing on semantic association of the tagline parts with respect to domain specific application and thus, improve extraction and ordering of parts for expertise

Topical classification of users

Leverage web search results & snippets as knowledge source for users as queries

32 User Tagline, Social-Informatics 2012

Page 33: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Acknowledgement

We thank our colleagues at Microsoft and Kno.e.sis for fruitful feedback and discussions

33 User Tagline, Social-Informatics 2012

Page 34: User Taglines: Alternative Presentations of Expertise and Interest in Social Media

Questions and thanks .. Thanks for hearing our presentation

For later questions?

[email protected]

[email protected]

Reference: Hemant Purohit, Alex Dow, Omar Alonso, Lei Duan, Kevin Haas. User Taglines:

Alternative Presentations of Expertise and Interest in Social Media. First ASE International Conference on Social Informatics (Social-Informatics 2012).

User Tagline, Social-Informatics 2012 34