Our Twitter Profiles, Our Selves: Predicting Personality with Twitter
Daniele Quercia, Michal Kosinski, David Stillwell, Jon Crowcroft
COMP4332 Wong Po Yan
Introduction
▪ Significant correlation between personality and real-world behavior
– Music taste
– Formation of social relations
▪ Predicting the personality of users in Twitter
Why Twitter?
▪ Previous study on Facebook
– The nature of online interactions does not significantly differ from that of real-world interactions
▪ A different platform
– Anyone can see anything posted by anybody unless users protect their updates
▪ Popular
Twitter Users
▪ Four types with five measures
– Listeners: follow many users
– Popular: are followed by many
– Highly-read: are often listed in others’ reading lists
– Influential: measured by two scores
▪ Klout score
– Whether a user’s tweets are clicked, replied to or retweeted
▪ TIME score
– TIME magazine ranking measure that combines one’s popularity on both Twitter and Facebook using the formula (2a + b) / 2, where a = number of Twitter followers, b = number of Facebook social contacts
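A minimal sketch of the TIME score formula as stated on this slide. The function name `time_score` and the sample counts are illustrative assumptions, not values from the study.

```python
def time_score(twitter_followers: int, facebook_contacts: int) -> float:
    """TIME-style influence score: (2a + b) / 2.

    a = number of Twitter followers, b = number of Facebook social contacts.
    """
    return (2 * twitter_followers + facebook_contacts) / 2


# Hypothetical user with 1000 Twitter followers and 500 Facebook contacts.
print(time_score(1000, 500))  # 1250.0
```

Twitter followers are weighted twice as heavily as Facebook contacts before the sum is halved.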
Personality
▪ The Big Five Personality Test
– An individual is associated with five scores that correspond to the five main personality traits
▪ Traits
– Openness
– Conscientiousness
– Extraversion
– Agreeableness
– Neuroticism
myPersonality
▪ Facebook users are able to take a variety of personality and ability tests
▪ Users can give consent to share their personality scores and profile information
– 40% give consent
– Only a few hundred of those have posted links to their Twitter accounts
▪ The Big Five Personality Test
Goal
Relationship between
▪ Personality Traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism)
▪ Two additional attributes (age, sex)
And
▪ Five user characteristics (followings, followers, listings, influence scores (Klout, TIME))
Data Collection
▪ Sample users: 335
– Have specified their Twitter accounts on their Facebook profile
– Have taken the Big Five Personality Test using myPersonality on Facebook
– Have shared the results and profiles on Twitter
▪ Data
– Number of followers
– Number of users followed
– Number of times that the user has been listed in others’ reading lists
Data Processing: Logarithm
▪ Number of followed users
▪ Number of followers
▪ Listings
▪ Two influential scores (Klout, TIME)
▪ Age
Why?
▪ The corresponding distributions are not normal
▪ A logarithmic transformation compensates for the violation of normality
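A rough sketch of why the logarithm helps, using made-up follower counts. The `log(1 + x)` offset is an assumption added here so that zero counts stay defined; the slides do not say which offset, if any, was used. The `skewness` helper is likewise illustrative.

```python
import math


def log_transform(values):
    """Apply log(1 + x) to each value (the +1 offset is an assumption)."""
    return [math.log1p(v) for v in values]


def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical heavy-tailed follower counts, not data from the study.
followers = [3, 10, 25, 40, 120, 900, 15000]
logged = log_transform(followers)

print(skewness(followers))  # strongly right-skewed
print(skewness(logged))     # noticeably closer to symmetric
```

The transformed values are less skewed, which brings them closer to the normality that correlation significance tests typically assume.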
Pearson Product Moment Correlation
▪ A measure of the linear relationship between two random variables
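▪ Formula: r = cov(X, Y) / (σ_X σ_Y), the covariance of X and Y divided by the product of their standard deviations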
▪ Range: [-1, 1]
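A minimal sketch of the Pearson product-moment correlation in plain Python. The function name `pearson_r` and the sample inputs are illustrative; the slides do not show how the correlations were computed in practice.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors cancel between numerator and denominator, so sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Perfectly linear toy data gives r close to +1 or -1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

Values near 0 indicate little linear relationship, so coefficients such as 0.13 or 0.15 describe weak positive associations.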
Results
Listener & Popular
▪ Extraversion
– 0.13 for Listener
– 0.15 for Popular
– Extroverts
▪ Neuroticism
– -0.17 for Listener
– -0.19 for Popular
– Emotionally stable
▪ Age
– 0.28 for Listener
– 0.37 for Popular
– Tend to be older
Listeners and Popular users are extroverted and emotionally stable. They tend to be older.
Results
Highly-read
▪ Openness: 0.17
Highly-read users tend to be imaginative, spontaneous and adventurous.
Results
Influential
▪ Klout
– Extraversion: 0.15
– Neuroticism: -0.03
▪ TIME
– Conscientiousness: 0.18
– Extraversion: 0.25
– Neuroticism: -0.20
– Age: 0.39
Influential users tend to be extroverted, emotionally stable, ambitious and resourceful. They are very likely to be older.
Model for Prediction
▪ Regression analysis
▪ 10-fold cross-validation using M5’ Rules
– M5’ is based closely on M5
– M5 (model tree) combines a conventional decision tree with the possibility of linear regression functions at the leaves
– M5’ is the enhanced algorithm that improves the handling of missing values and enumerated attributes
▪ Root Mean Square Error
– Measures the difference between predicted values and observed values
– On the score scale [1, 5], maximum RMSE = 0.88
– Error is low, so predictions are accurate
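A rough sketch of k-fold cross-validation scored by RMSE. It does not implement M5’: a one-feature least-squares line stands in for the model tree purely for illustration. The names `rmse`, `fit_line` and `k_fold_rmse`, the toy data, and the choice to average RMSE across folds are all assumptions.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(xs, ys):
    """Least-squares slope and intercept (a stand-in model, not M5')."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def k_fold_rmse(xs, ys, k=10):
    """Average held-out RMSE over k folds; assumes len(xs) >= k."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        held_out = sorted(test_idx)
        predictions = [slope * xs[i] + intercept for i in held_out]
        fold_errors.append(rmse(predictions, [ys[i] for i in held_out]))
    return sum(fold_errors) / k


# Toy data: a hypothetical log-follower feature against a trait score.
feature = [float(i) for i in range(20)]
score = [0.5 * x + 2.0 for x in feature]
print(k_fold_rmse(feature, score, k=10))  # near zero for exactly linear data
```

Each sample is held out exactly once, so the error estimate comes only from data the model did not see during fitting.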
Conclusion
▪ All user types are emotionally stable
▪ Most of the users are extroverts, except Highly-read people
▪ Listener, Popular and Influential people tend to be older
▪ Influential people tend to be ambitious, but seem to be not very agreeable
▪ Highly-read people tend to be adventurous and imaginative
These inferences have long been supported informally by intuition but have been difficult to make precise.
Suggestions
▪ Marketing
– Marketing strategy is closely related to consumer personality
– E.g. select ads to which the user is likely to be most receptive
▪ User Interface Design
– Match the “look and feel” of a social media site to personality traits
▪ Recommender Systems
– Product recommendation
– E.g. recommend music to users, given the well-established relationship between personality and music taste