advanced analytics for social media research
Post on 14-Sep-2014
DESCRIPTION
Advanced Analytics for Social Media Research: Examples from the automotive industry (January 2013 Webinar)
TRANSCRIPT
Please tweet! #RNWebinars @LoveStats
Advanced Analytics for Social Media Research:
Examples from the automotive industry
January 2013
Social media listening data by researchers, for researchers
Standard Social Media Research Uses
1. Track brand mentions
2. Identify positive and negative brand attributes
3. Identify sources of negativity
4. Monitor an ad campaign
5. Measure category norms
Advanced Social Media Research Uses
1. Correlations – How does gender correlate with brand choice? Which brands and features are preferred by men and by women?
2. Regression – Which features best predict purchase of specific brands? How do combinations of variables work together to predict an overarching variable?
3. Factor analysis – How do brands or features cluster together as being similar in consumers’ minds? What clusters “appear”? What is the best “package”?
Data + Category Experts = Insights
• Data: Expert methodologists collecting, cleaning, coding, and calibrating data specific to your research objectives
• Category Experts: Industry analysts using category and normative expertise to analyze and interpret data
• Insights: Relevant, valid, and reliable conclusions, insights, and recommendations
Research Method
Datasets
1. Branded: Random sample of verbatims mentioning a brand name (e.g., GMC, Honda, Lexus). To measure correlations. N>250 000
2. Branded purchasing: Random sample of verbatims mentioning a brand and purchase. To predict purchase. N>100 000
3. Branded pairs: Random sample of verbatims mentioning at least TWO brand names. To run brand factor analysis. N>100 000
Data Collection Criteria
• Consumer focus
• Dealership messaging removed
• Viral games and jokes removed
Collect, Clean, Categorize, Calibrate
• Collect: Scour the internet for thousands of messages related to the brand
• Clean: Clean out spam and non-relevant chatter (e.g., fun engagement conversations on Facebook)
• Categorize: Categorize verbatims into relevant content areas, e.g., pricing, recommendations, commercials, celebrities
• Calibrate: Calibrate the sentiment into 5-point Likert scale buckets specific to the brand and category
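The clean, categorize, and calibrate steps can be sketched in a few lines of Python. This is only an illustration: the keyword lists, the spam markers, the raw sentiment range of -1 to 1, and the cut points are all assumptions made for the example, not the deck's actual rules.

```python
# Hypothetical keyword lists for a few of the content areas named above.
CATEGORY_KEYWORDS = {
    "pricing": ["price", "cost", "afford"],
    "recommendations": ["recommend", "should i get"],
    "commercials": ["commercial", "ad campaign"],
}

# Hypothetical spam markers; a real cleaning step would be far more thorough.
SPAM_MARKERS = ["free delivery", "click here"]


def is_clean(verbatim):
    """Clean: reject verbatims that contain any assumed spam marker."""
    text = verbatim.lower()
    return not any(marker in text for marker in SPAM_MARKERS)


def categorize(verbatim):
    """Categorize: return every content area whose keywords appear."""
    text = verbatim.lower()
    return sorted(
        area for area, words in CATEGORY_KEYWORDS.items()
        if any(word in text for word in words)
    )


def calibrate(raw_score):
    """Calibrate: map a raw sentiment score in [-1, 1] to a 1-5 bucket.

    The cut points are illustrative assumptions, not a real calibration.
    """
    cut_points = [-0.6, -0.2, 0.2, 0.6]
    return 1 + sum(raw_score > cut for cut in cut_points)
```

In practice the cut points would be tuned per brand and category, since the deck describes the buckets as specific to both.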
What is a correlation?
A statistical process for identifying how two variables relate to each other.
• E.g., there exists a positive correlation between education and price paid for vehicles
– Expensive cars tend to be owned by people with higher education
– Budget cars tend to be owned by people with lower education
• A correlation does not mean one variable causes the other. Sending an uneducated person to school will not cause them to buy an expensive car, nor vice versa. The more likely scenario is that higher education leads to higher income, which enables one to purchase a more expensive vehicle, if desired.
[Figure: example correlations of R=0.3, R=0.15, and R=0.0]
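For readers who want to see the calculation, here is a minimal sketch of a Pearson correlation in plain Python. The education and price figures are made up purely to illustrate a positive correlation; they are not data from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustration: years of education vs. vehicle price (in $1000s).
education_years = [10, 12, 12, 14, 16, 16, 18, 20]
vehicle_price = [15, 22, 14, 30, 21, 35, 28, 40]

r = pearson_r(education_years, vehicle_price)
```

A positive r here says only that the two made-up variables rise together; as noted above, it says nothing about which one, if either, causes the other.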
Correlations: Women’s Brand Preferences
Women are more likely than men to speak positively about midsize vehicles and base level SUVs.
Lexus (r=0.34) Nissan Pathfinder (r=0.34)
Nissan Maxima (r=0.31) Peugeot (r=0.28) BMW X5 (r=0.27)
Chevrolet Impala (r=0.25) Mitsubishi Eclipse (r=0.25)
e.g., about 12% of the variance in positive opinions about Lexus can be attributed to gender (r=0.34, r²≈0.116)
Analysis: Gender must be specified (n=56 000), Brand non-mention treated as pair-wise missing, Minimum sample size per brand n>=30
Hi there! If you’re wondering what the little analysis box on the bottom means, your statistical analyst will be able to explain it to you. It’s just a quick technical summary of the steps taken to prepare these correlations.
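As a rough sketch of what the analysis box describes, the Python below correlates gender with brand sentiment using pairwise deletion (verbatims that do not mention the brand are skipped for that brand only) and a minimum sample size of 30. The record layout, the field names, and the 0/1 gender coding are assumptions made for the example.

```python
import math

MIN_N = 30  # minimum sample size per brand, as stated in the analysis box


def pearson_r(xs, ys):
    """Pearson correlation; returns None when either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def gender_brand_correlation(records, brand):
    """Correlate gender with sentiment for one brand.

    Each record is a dict with 'is_female' (1 or 0) and a 'sentiment' dict
    mapping brand -> 1-5 score (hypothetical layout). Records that do not
    mention the brand are skipped for this brand only (pairwise deletion).
    """
    pairs = [
        (rec["is_female"], rec["sentiment"][brand])
        for rec in records
        if brand in rec["sentiment"]
    ]
    if len(pairs) < MIN_N:
        return None  # too few mentions to report a correlation
    genders, scores = zip(*pairs)
    return pearson_r(genders, scores)


def variance_explained(r):
    """Share of variance associated with a correlation: r squared."""
    return r * r
```

The share of variance associated with a correlation is its square, so `variance_explained(0.34)` is 0.34 × 0.34 = 0.1156.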
Correlations: Men’s Brand Preferences
Men are more likely to speak positively about sporty cars and adventure trucks.
Jeep Safari (r=0.32) GMC Yukon (r=0.22) Ford Fiesta (r=0.17)
Mazda Miata (r=0.11) Toyota Tacoma (r=0.10) Ford Mustang (r=0.10)
e.g., about 10% of the variance in positive opinions about Jeep Safari can be attributed to gender (r=0.32, r²≈0.102)
Analysis: Gender must be specified (n=56 000), Brand non-mention treated as pair-wise missing, Minimum sample size per brand n>=30
Correlations: Women’s Feature Preferences
Stereotypes abound as women chat more positively about easy driving (e.g., suspension) and appearance (e.g., dashboard) features.
Grill (r = 0.38) Suspension (r = 0.36) Dashboard (r = 0.35)
Interior (r = 0.33) Steering (r = 0.32)
(High correlation with automatic transmission but sample size was only 17)
Analysis: Gender specified (n=56 000), Feature non-mention treated as pair-wise missing, Minimum sample size per feature n>=30
Correlations: Men’s Feature Preferences
Stereotypes continue as men chat positively about blasting their tunes (i.e. radio) and speeding (i.e. accelerator).
Car Radio (r=0.38) Accelerator (r=0.11) Headlight (r=0.10)
(High correlation with manual transmission but sample size was only 25)
Analysis: Gender specified (n=56 000), Feature non-mention treated as pair-wise missing, Minimum sample size per feature n>=30
What is Regression?
A statistical method for estimating relationships among variables, used to determine whether and by how much a change in the value of one variable affects the value of another variable.
Purchase = 2 X Variable A + 1 X Variable B + 0.5 X Variable C
Can we determine which variables influence purchase opinions?
• Is it a simple or complex relationship with few or many variables?
• Do these relationships differ based on the brand?
We can then focus our marketing attention in these areas with the appropriate level of importance
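To make the idea concrete, here is a minimal sketch of an ordinary least squares fit in plain Python. The data are synthetic: the purchase scores are generated from the illustrative weights 2, 1, and 0.5 with no noise, so the fit should simply recover those weights. Real social media data would be noisy, and this is not the stepwise procedure used in the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(rows, targets):
    """Fit targets ~ intercept + weights . row via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic opinions on three variables (A, B, C), scored 1-5.
opinions = [(1, 2, 3), (2, 1, 5), (3, 4, 1), (4, 3, 2), (5, 5, 4), (2, 4, 4)]
# Targets generated from the illustrative weights, with no noise.
purchase = [2 * a + 1 * b + 0.5 * c for a, b, c in opinions]

intercept, w_a, w_b, w_c = fit_least_squares(opinions, purchase)
```

The size of each fitted weight is what indicates the "appropriate level of importance": here Variable A moves the purchase score four times as much as Variable C.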
Explaining Past Purchase
People who have purchased a vehicle focus on quality (e.g., servicing, errors), personality characteristics (e.g., honesty, pride), and features (e.g., color, size, fuel economy)
• Variables to account for 30% of variance: 17
• Variables to account for total variance (40%): 118
• Variables excluded from total: 200
• Key Variables: Color, Servicing, Errors, Functionality, Size, Recommend, Engine, Intelligence, Honesty, Pride, Fast, Fuel Economy, Ease, Doors, Wheels
Positive Purchase Opinion = Servicing X 0.12 + Recommend X 0.11 + Honesty X 0.08 + Fuel Economy X 0.08 + ...
Analysis: n>36 000, Exploratory stepwise, Feature non-mention recoded as neutral opinion, Subsample required mention of past purchase
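The recoding step in the analysis box can be sketched as follows. The sketch assumes opinions sit on a 1-5 scale with 3 as neutral, and it applies only the four coefficients shown on the slide. The full model has more terms, and the slide does not state an intercept, so the result is a partial score for illustration, not a prediction.

```python
NEUTRAL = 3  # midpoint of a 5-point scale; assumed coding of "neutral"

# The four coefficients shown on the slide. The slide's equation continues
# with further terms that are not listed, so this gives only a partial score.
SLIDE_COEFFICIENTS = {
    "Servicing": 0.12,
    "Recommend": 0.11,
    "Honesty": 0.08,
    "Fuel Economy": 0.08,
}


def recode_non_mentions(opinions, variables):
    """Fill every variable the author did not mention with a neutral score."""
    return {name: opinions.get(name, NEUTRAL) for name in variables}


def partial_score(opinions):
    """Weighted sum over the listed terms only (no intercept, no other terms)."""
    filled = recode_non_mentions(opinions, SLIDE_COEFFICIENTS)
    return sum(SLIDE_COEFFICIENTS[name] * filled[name] for name in SLIDE_COEFFICIENTS)
```

Recoding non-mentions as neutral keeps every verbatim in the regression, which differs from the correlation slides, where non-mentions were treated as pair-wise missing.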
Explaining Purchases of Jeep
People who have purchased a Jeep talk more positively about their vehicle being highly functional, requiring few repairs, and being sexy in appearance.
• Number of variables: 23
• % of Variance accounted for: 30%
• Positive Variables: Truck types, Functionality, Intelligence, Doors, Error, Size, Engine, Servicing, Tires, Repairs, Exciting, Wheels, Sexy, Transmission, Different
Positive Purchase Opinion = Types X 0.13 + Doors X 0.11 + Engine X 0.10 + Sexy X 0.07 + ...
Analysis: n>4600, Exploratory stepwise, Feature non-mention treated as neutral opinion, Subsample required mention of both purchase and Jeep brand
Explaining Women’s Purchases of Jeep
Women who have purchased a Jeep talk more positively about their vehicle in terms of pride, reliability (e.g., errors, servicing), and appearance (e.g., hubcaps, fashionable)
• Number of variables: 15
• % of Variance accounted for: 27%
• Key Variables: Pride, Error, Truck Types, Size, Honesty, Cleanliness, Servicing, Doors, Brakes, Warranty, Hubcaps, Fashionable, Intelligence
Positive Purchase Opinion = Pride X 0.19 + Error X 0.13 + Honesty X 0.10 + Fashion X 0.09 + ...
Analysis: n>460, Exploratory stepwise, Feature non-mention treated as neutral opinion, Subsample required mention of purchase, Jeep brand, and female author
What is Factor Analysis?
A statistical method for determining which variables, brand names, or product features are commonly associated with each other. The reader’s task is to determine why the statistics put those items together and to “name” the over-arching concept.
• X-small, Small, Medium, Large, X-large: What is Factor #1? Sizes
• Polyester, Velvet, Leather, Cotton, Nylon, Silk: What is Factor #2? Fabric
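The intuition can be shown with the correlation matrix that a factor analysis starts from. The sketch below uses made-up verbatims and four of the example items. It only computes correlations between mention indicators; it does not extract or rotate factors.

```python
import math

ITEMS = ["small", "large", "cotton", "silk"]

# Made-up verbatims: each is the set of items one author mentioned.
mentions = [
    {"small", "large"},
    {"small", "large"},
    {"small", "large", "cotton"},
    {"cotton", "silk"},
    {"cotton", "silk"},
    {"silk", "cotton", "large"},
]


def indicator_column(item):
    """1 if the verbatim mentions the item, else 0, for every verbatim."""
    return [1 if item in m else 0 for m in mentions]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Correlation between every pair of items across the verbatims.
correlations = {
    (a, b): pearson_r(indicator_column(a), indicator_column(b))
    for a in ITEMS for b in ITEMS
}
```

In this toy data, items from the same group (small and large, cotton and silk) correlate positively, while items from different groups (small and silk) correlate negatively. That pattern is what leads a factor analysis to place sizes in one factor and fabrics in another.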
Factor Analysis Data
To run a factor analysis, each piece of data must incorporate at least two brand (or feature) mentions
• “In a few years, I want a red or black Range Rover and a sports car. Maybe a BMW or Mercedes.”
• “I need to know if I should get the 2 door bmw or 4 door mazda 3. Help me guys!”
• “Toyota Land Cruiser is way better than jeep in every way. With that price, it had better be.”
• “Would you buy a Mercury Mountaineer with lower miles or a Lexus with higher miles? Thanks for your help.”
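A minimal sketch of this selection rule in Python is shown below. The brand list is a small hypothetical one built from the examples above; a real study would need a full dictionary of brands, models, and spelling variants.

```python
import re

# Hypothetical brand list; a real study would use a full brand dictionary.
BRANDS = ["Range Rover", "BMW", "Mercedes", "Mazda", "Toyota", "Jeep",
          "Mercury", "Lexus"]

# One case-insensitive pattern per brand, matched on word boundaries.
BRAND_PATTERNS = {
    brand: re.compile(r"\b" + re.escape(brand) + r"\b", re.IGNORECASE)
    for brand in BRANDS
}


def brands_mentioned(verbatim):
    """Return the set of known brands that appear in the verbatim."""
    return {b for b, pattern in BRAND_PATTERNS.items() if pattern.search(verbatim)}


def keep_for_factor_analysis(verbatims):
    """Keep only verbatims that mention at least two distinct brands."""
    return [v for v in verbatims if len(brands_mentioned(v)) >= 2]
```

Matching is case-insensitive because authors rarely capitalize brand names consistently, as the "bmw" and "jeep" examples show.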
How to Use Factor Analysis
• Identify the real competitive set, not what researchers or brand managers assume or assign
• Better understand consumer perceptions of your brand
• Discover new ways that consumers think about your brand
• Market against the most relevant competitors
Results: Automotive Brands
Consumers categorize vehicles by size, adventurousness, and luxuriousness.
• Luxury: Ferrari, Porsche, Audi R8, BMW M3, Ford Mustang
• Trucks: Chrysler, Jeep, Dodge, Cherokee, Explorer, Mustang
• Subcompact: Peugeot, Kia, VW Golf, Peugeot 206, VW Passat
• Midsize: Pontiac, Oldsmobile Cutlass, Buick, Taurus
• Fashionably Friendly: Toyota Yaris, Prius, Kia, Miata, Nissan Maxima
Your real competitors
How consumers categorize you
Analysis: n=75 000, Equimax rotation, Nonresponse recoded as neutral, Minimum sample size per brand n>=30, 11 factors based on scree plot
Results: Automotive Features
Consumers categorize features into many buckets, some focused on the interior or exterior appearance, while others are focused on specific systems, such as fuel or drive system.
• Safety: ABS, Traction control, Airbags, Tire Pressure
• Drive Systems: RWD, FWD, AWD, 4WD, Turbo, Horsepower
• Fuel System: Fuel supply, Fuel tank, Air intake, Spark plug
• Colors: Black, White, Red, Blue, Green, Pink, Yellow
• Exterior Appearance: Hubcaps, Chrome, Bumper, Grill, Headlight
• Interior Appearance: Dashboard, Beige, Pink, Mirrors, Cup holder
• Power: Engine, Horsepower, Turbo, Torque, Manual
• Fuel Economy: Hybrid, Electric cars, Coupe, Fuel economy
Analysis: n=100 000, Equimax rotation, Nonresponse coded as neutral, Minimum sample size per feature n>=30, 17 factors based on scree plot
What about conjoint?
Unfortunately, social media research is not ideal for running conjoint analyses. Surveys are much better suited to this need.
• Frequency of direct comparisons of one product feature in one social media sentiment: Extremely rare
• Ability to isolate two distinct opinions and apply the appropriate sentiment to each: Extremely difficult
“It pains me to see a price of $22k but if they offer $18k, I’ll take it.”
“I can’t afford $25k so I’m pumped for when the price comes down to $23k.”
Watchouts
Irrelevant data, spam, and viral jokes create false correlations between brands. If this data is not removed prior to the analysis, statistics will erroneously identify them as real associations.
• Irrelevant data
– Come test drive this 2010 Chevrolet Malibu LT. We also have the Impala, Toyota Camry, Honda Accord, Nissan Altima, and Ford Fusion.
• Spam
– free perscription volvo bieber gaga nike honda adidas free fedex saturday delivery toyota britney
• Viral Jokes
– Boyfriend: see that new, red mercedes benz parked beside our neighbour’s ferrari? Girlfriend: whoooa! its gorgeous! Boyfriend: yeah ... I bought you a toothbrush of that colour!!
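One simple way to catch some of this before the analysis is sketched below in Python. It uses two heuristics: drop verbatims that name many distinct brands (typical of dealer listings and keyword spam) and drop text that appears many times over (typical of viral jokes). The brand list and both thresholds are hypothetical, and this is not the cleaning method used in the study.

```python
import re
from collections import Counter

# Hypothetical brand list and thresholds, chosen only for illustration.
BRANDS = ["Chevrolet", "Toyota", "Honda", "Nissan", "Ford", "Volvo",
          "Mercedes", "Ferrari"]
MAX_BRANDS = 2   # more distinct brands than this looks like a listing
MAX_REPEATS = 2  # more identical copies than this looks viral


def normalize(verbatim):
    """Lowercase and drop punctuation so near-identical copies match."""
    return " ".join(re.findall(r"[a-z0-9]+", verbatim.lower()))


def count_brands(verbatim):
    """Count how many distinct known brands appear as whole words."""
    words = set(normalize(verbatim).split())
    return sum(1 for brand in BRANDS if brand.lower() in words)


def remove_suspect_verbatims(verbatims):
    """Drop brand-stuffed listings and heavily repeated (viral) text."""
    copies = Counter(normalize(v) for v in verbatims)
    return [
        v for v in verbatims
        if count_brands(v) <= MAX_BRANDS and copies[normalize(v)] <= MAX_REPEATS
    ]
```

The brand threshold is a trade-off: genuine comparisons can also name three or more brands, so a low cut-off removes some real consumer opinions along with the listings.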
Thank you
www.conversition.com