user taglines: alternative presentations of expertise and interest in social media
DESCRIPTION
Web applications are increasingly showing recommended users from social media along with some descriptions, an attempt to show relevancy - why they are being shown. For example, Twitter search for a topical keyword shows expert twitterers on the side for 'whom to follow'. Google+ and Facebook also recommend users to follow or add to friend circle. Popular Internet newspaper- The Huffington Post shows Twitter influencers/ experts on the side of an article for authoritative relevant tweets. The state of the art shows user profile bios as summary for Twitter experts, but it has issues with length constraint imposed by user interface (UI) design, missing bio and sometimes funny profile bio. Alternatively, applications can use human generated user summary, but it will not scale. Therefore, we study the problem of automatic generation of informative expertise summary or taglines for Twitter experts in space constraint imposed by UI design. We propose three methods for expertise summary generation- Occupation-Pattern based, Link-Triangulation based and User-Classification based, with use of knowledge-enhanced computing approaches. We also propose methods for final summary selection for users with multiple candidates of generated summaries. We evaluate the proposed approaches by user-study using a number of experiments. Our results show promising quality of 92.8% good summaries with majority agreement in the best case and 70% with majority agreement in the worst case. Our approaches also outperform the state of the art up to 88%. This study has implications in the area of expert profiling, user presentation and application design for engaging user experience. Find paper here: http://knoesis.org/library/resource.php?id=1793TRANSCRIPT
ASE Social-Informatics 2012, Dec 16
Hemant Purohit1 Alex Dow, Omar Alonso, Lei Duan, Kevin Haas
1Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State , USA
Microsoft Silicon Valley, USA
“How do we know: Why are you influential?”
Outline Motivation
Problem Settings
Proposed Approach
Experiments & Results
Future Work
2 User Tagline, Social-Informatics 2012
Motivation
3 User Tagline, Social-Informatics 2012
?
HuffingtonPost
• Increasing use of recommended influencers and experts from social media.
• No relevance cues about who they are and why they are experts
• Need presentation of their expertise or interest areas
• Help user engagement with the content
Motivation
4
• I can see why …
User Tagline, Social-Informatics 2012
Senior Media Reporter for The Huffington Post
Better engagement with content!
User Data Mining
Analysis & Aggregation
Tagline
Problem Statement Scope:
Twitter users with English language profile descriptions and tweets
Specific problem:
Given a set of N experts E= {ei | i = 1, 2, . . . , N} in a social media community C of K (K > N) users, generate a short summary di within predefined length for each of the expert ei in the set E.
User Tagline, Social-Informatics 2012 5
Challenges How to extract information from user’s Twitter profile bio
that well presents the author’s expertise?
Constraint space of UI design when displaying it.
User Tagline, Social-Informatics 2012 6
Meteorologist at TWC in Atlanta . Hike, sail, snowboard, travel, take pictures and glue stuff.
Challenges How to generate tagline when user’s bio doesn’t
contain sufficient information?
Some profile bios are humorous.
Sometimes profile bio is simply missing
User Tagline, Social-Informatics 2012 7
Challenges In difference with web search document
summarization or snippet generation
Very short text segment vs. full page of web documents
People centric vs. document centric
More constraints on space due to UI design
Lack of or missing information
User Tagline, Social-Informatics 2012 8
Proposed Approach
Data Sampling
Candidate Taglines
Generation
Final Tagline Selection
9 User Tagline, Social-Informatics 2012
What Data to Exploit?
Meformer
Self Descriptive
e.g., Profile Bio, Content of the user posts
Informer
Others describe the person
e.g., Wikipedia, IMDB, Freebase etc.
10 User Tagline, Social-Informatics 2012
Candidate Taglines Generation
11 User Tagline, Social-Informatics 2012
1. Occupation-Patterns
2. Link-Triangulation
3. User Classification
1. Occupation-Pattern Approach
12
west coast editor for @thenextweb. geek. world changer. social good creator. community lover. #BlameDrewsCancer'er. #ihaveadream. yep, that drew.
Occupation well-represents user’s expertise areas.
Ufc Octagon Girl,Ultimate Insider Host. Model, Actress, Bud Light Spokesmodel and Fitness Nerd. I'm Not that cool, Lover not a fighter.
president and editor-in-chief of the huffington post media group. mother. sister. flat shoe advocate. sleep evangelist.
User Tagline, Social-Informatics 2012
Extracting Occupation Patterns Occupation titles collection
from Wikipedia and
US Dept. of Labor Statistics Reports
Given a user’s profile bio Spot the occupation titles in the N-grams of the bio
Create candidate tagline by joining N-grams in the order of their appearance in profile bio (Intuition: User’s
own described order of what he is expert for!)
Create final candidate set by those falling in the UI space constraint threshold.
User Tagline, Social-Informatics 2012 13
What if there is no occupational pattern?
Can we exploit informer data?
Expert users usually have Wikipedia pages where people have written about them
Connect Twitter expert users with their Wikipedia pages.
Use Wikipedia informative summary as expert users tagline
14 User Tagline, Social-Informatics 2012
User Wikipedia page’s Info-box
Twitter User
Personal Website: www.rachaelray.com
2. Link-Triangulation Approach
15 User Tagline, Social-Informatics 2012
User Identity
Resolution by cross
referenced links
Extracting User Description After resolving user identity corresponding to a
Knowledge-base (KB) page, exploit content of KB
Wikipedia Infobox ‘occupation’ property e.g., for celebrity TV host ‘Rachael Ray’:
Television Personality, businesswoman, author, celebrity chef
Apply restriction by UI space constraint
User Tagline, Social-Informatics 2012 16
Link-Triangulation Examples USER BIO WIKIPEDIA BASED TAGLINE
Tommy Chong
The legend Tommy Chong from the iconic comedy duo Cheech and Chong. Currently on the 'Get it legal' North
American tour. actor, comedian, musician
Drake Bell Thanks for following me, guys! actor, comedian, producer, musician,
singer-songwriter, director, voice artist
Steve Harvey
Proud husband, father, comedian, philanthropist and author. God has blessed me with more than I could have asked for. New Daytime TV Show premiering September
4 actor, comedian, radio and television
personality, author
bob saget
I like long walks on the beach...and then to be dragged through the sand by an off-road vehicle, and then hurled
off a catapult. actor, comedian, television host,
director, screenwriter
17 User Tagline, Social-Informatics 2012
What happens if … No occupational-patterns can be extracted
No links or can’t resolve links to knowledge base
Can we classify users to categories based on their
Popularity
Activity
Content diffusion strength?
Solution: Use the classes as default taglines if nothing else
User Tagline, Social-Informatics 2012 18
3. User Classes Approach
19
USER ACTIVITY
USER POPULARITY
DIFFUSION STRENGTH
Casual Celebrity
Information Hub
Passive Celebrity
Active Celebrity
Authority
Casual Twitterer Conversationalist
Disengaged Celebrity
User Tagline, Social-Informatics 2012
User Classification Metrics Popularity, Activity and Diffusion Strength can be
modeled in different ways
Our formulation, for given dataset period: Popularity- #user’s mentions
Activity- #statuses
Diffusion Strength- #retweets of user tweets
50% percentile on each metrics (max normalized) as classifier to create various classes in the 3-D model
User Tagline, Social-Informatics 2012 20
User Classes Examples TWITTER SCREEN-NAME
DISPLAY-NAME BIO PROFILE LINK USER-CLASS TAGLINE
Sullivan26 Steve
Sullivan EMPTY Unmatched with Wiki
Steve Sullivan is Casual Twitterer.
iamsrk
SHAH RUKH KHAN EMPTY Unmatched with Wiki
SRK is Active Celebrity on Twitter.
21 User Tagline, Social-Informatics 2012
Proposed Approach
Data Sampling
Candidate Taglines
Generation
Final Tagline Selection
22 User Tagline, Social-Informatics 2012
Principles for Quality Assessment of a Tagline Readability
‘host of Man Vs Wild & Born Survivor, family man and host of Man Vs Wild’ vs. ‘chief scout (the scout association), adventurer, explorer, author, motivational speaking, television presenter’
Specificity ‘Actor, Director’ vs. ‘Actor in Forest Gump’
Interestingness ‘CEO of Amazon, Co-founder of ABC Corp., Actor, Writer’
23 User Tagline, Social-Informatics 2012
Readability
Observations
Poor scores for fragmented sentences
Not suitable for measuring taglines
Skipped for the experimental stats
24
Flesch Readability Ease Scoring [0-100] - Higher the better
CANDIDATE TAGLINE-1 SCORE CANDIDATE TAGLINE-2 SCORE
singer from Norway 62.7 singer, actor, songwriter, composer, pianist 0
songwriter from Glasgow 34.5 singer, songwriter, musician 0
host of Man Vs Wild & Born Survivor, family man and host of Man Vs Wild 84.4
chief scout (the scout association), adventurer, explorer, author, motivational speaking, television presenter 0
User Tagline, Social-Informatics 2012
Specificity & Interestingness
• Difficult to formulate specificity and interestingness of a tagline because of subjectivity
• Modified Tf-idf approach: • query=user, relevant-docs-to-user= candidate taglines set
• Document set as all the candidate taglines of all the users
• Normalize score for a document (tagline) by its character length, w.r.t. maximum UI space constraint T (to boost the space utilization)
• Rank candidate taglines scores for a user and then select final tagline
User Tagline, Social-Informatics 2012 25
Specificity & Interestingness
26
Examples:
TAGLINE-1 SCORE-1 TAGLINE-2 SCORE-2 FINAL-TAGLINE
singer from Norway 0.791859 singer, actor, songwriter,
composer, pianist 1.221946 singer, actor, songwriter,
composer, pianist
songwriter from Glasgow 1.03031 singer, songwriter, musician 0.496351 songwriter from Glasgow
Ultimate Insider Host, Model, Bud Light Spokesmodel 3.179482 model, ufc octagon girl 1.429542
Ultimate Insider Host, Model, Bud Light
Spokesmodel
host of Man Vs Wild & Born Survivor 2.109237
chief scout (the scout association), adventurer,
explorer, author 4.834862
chief scout (the scout association), adventurer,
explorer, author Multi Grammy Award-winning
singer 1.423924 singer, actress, comedian,
author, producer 0.897228 Multi Grammy Award-
winning singer
User Tagline, Social-Informatics 2012
Evaluation Baseline: Full Twitter profile bio as a tagline
User study
Questions for user study of candidate tagline generation
How good is the generated summary of expertise presentation?
How good the original bio is by itself?
27 User Tagline, Social-Informatics 2012
Evaluation (contd.) Metrics for evaluation
Majority agreement on good candidate taglines: Percentage of total samples, where at least two third of judges give scores as either 1 (good) or 2 (very good) for the question Q1
Inter-rater agreement (Fleiss Kappa)
User study for final tagline selection
Q. Which one is the best summary to represent expertise description between two given candidate summaries?
Metric: % of agreeable summaries between human and automatic judgment
User Tagline, Social-Informatics 2012 28
Experimental Data Sets • Crawled Twitter datasets from different time periods: tweets and users
• Expert* Twitter Users set: Top 30% ranked users based on Klout score in the dataset collected
*Note: that any Expert Finding Algorithm or service can be used, our approach is independent on that.
User Tagline, Social-Informatics 2012 29
Table I. STATISTICS OF EXPERT USERS IN THREE DATA SETS
User Study Results For candidate tagline generation:
*for Link-Triangulation approach we did not have a good baseline than the user profile bio
For final tagline selection:
Majority Agreement for final tagline choice between automatic and human judge selection: 65.15%
User Tagline, Social-Informatics 2012 30
Majority Agreement for good taglines
Baseline Phase-1 datasets
Phase-2 datasets
Occupation-Pattern approach 30% 74.5% 70.31%
Link-Triangulation approach *30% 91.6% 92.82%
Conclusion The contribution of this work
Proposed the practical approach to represent user expertise on social media as taglines given constraints of UI design.
Addressed important information extraction from user profile bio by applying occupational-pattern approach.
Addressed quality and coverage issue of tagline generation by link-triangulation approach leveraging knowledge base such as Wikipedia.
Explored user classification approach for generating taglines for users without profile bios.
Experiment results show significant improvement over baseline of the state-of-art.
User Tagline, Social-Informatics 2012 31
Future Work Exploit user’s web-page
Title and Keywords in the Header metadata
Semantic approach in creating expert summary
Focusing on semantic association of the tagline parts with respect to domain specific application and thus, improve extraction and ordering of parts for expertise
Topical classification of users
Leverage web search results & snippets as knowledge source for users as queries
32 User Tagline, Social-Informatics 2012
Acknowledgement
We thank our colleagues at Microsoft and Kno.e.sis for fruitful feedback and discussions
33 User Tagline, Social-Informatics 2012
Questions and thanks .. Thanks for hearing our presentation
For later questions?
Reference: Hemant Purohit, Alex Dow, Omar Alonso, Lei Duan, Kevin Haas. User Taglines:
Alternative Presentations of Expertise and Interest in Social Media. First ASE International Conference on Social Informatics (Social-Informatics 2012).
User Tagline, Social-Informatics 2012 34