© 2013 IBM Corporation
IBM Advanced Analytics Platform for M&E
Demand Forecasting: Predicting Movie Box Office
Current industry trends have raised the stakes for content companies to know and cater to our audiences
IBM Confidential
Online and social tools enable audiences to collaborate and influence a broader audience to drive consumption and revenue of content.
The era of ubiquitous multi-channel distribution to smart devices not only enables on-demand consumption but also provides a platform for new types of interactive content experiences.
With a proliferation of choices, consumers are in control of the "what and how much" they engage with content.
The need to capture, understand, and engage in the conversation with your audience.
Understand consumption patterns in order to monetize cross-platform behavior, and increase content engagement.
“Know your audience” to provide more differentiated & personalized content experiences.
$231 billion in revenue will be generated by the Connected Home by 2016, with provision of HD quality content and feature rich applications. - Connected Home Report
Consumer Power
A McKinsey report pegged the untapped business value of social technologies at $1.3 trillion
Digital Influence
Ubiquitous Distribution
Trend
Implication
50% of consumers watch video daily or weekly on digital devices; internet advertising revenues are growing. - IDC
Customer Insight Capability = Critical Enabler
IBM's customer insight solution is focused on delivering audience intelligence capabilities to enable the Media Enterprise business teams
Data Sources
IBM Advanced Analytics Platform for Customer Insight
Audience Profiling, Segmentation, & Targeting
Demand Forecasting
Marketing Campaign Effectiveness
Fan Engagement Scoring
Real-Time, Predictive, and Social Analytics
Linear Consumption
Nonlinear Consumption
1st party CRM
3rd Party CRM
Media
Marketing
Social Media
Today’s Discussion:
Through more accurate understanding of audience demand, business teams can start to determine if particular actions need to be taken
The Problem: How do media companies evaluate demand for their content or services?
Identify Measurable Target Outcomes/KPIs
Determine Audience Behavioral Proxies
Build Predictive Models/Demand Scoring
Integrate Predictions with Business Decisions
The Solution: IBM Demand Forecasting
Real World Use Cases: Getting Early Actionable Indicators
Predicting Movie Opening Weekend Box Office: How do I know when to dial-up my marketing?
Forecasting Retail Demand for Packaged Media: How much should I sell-in to retailers to optimize sales?
Predicting Content Service Churn: When should I take action to prevent subscriber loss?
Demand Scoring for Content Archives: What content should I digitize and clear for licensing?
TV rating
Today’s Discussion:
Movie marketers' most critical KPI is opening weekend box office (OWBO), but they have yet to find an approach that correlates audience behavior with box office outcome
A Nielsen causation study found that Tweets drive higher broadcast TV ratings for 48% of shows
A recent Google study found that “70% of the variation in box office performance can be explained with movie-related search volume seven days prior to release date”
Websites like Fizziology provide live social media tracking, using Tweets to highlight movie box office success
21,000 Tweets vs. 2,000,000+ Tweets
Several websites provide traditional panel-based box office tracking, including: Hollywood Stock Exchange, Box Office Mojo, Rope of Silicon and Box
Will we hit our OWBO target? Do we need to dial up or change our marketing effort?
Movie Marketing Timeline:
Time markers: 12 weeks out, 8 weeks out, 4 weeks out, 2 weeks out, Opening Weekend, Post-opening weekend
Activities: Teaser Trailers, Online Buzz; Re-Messaging Campaign; Theatrical Cross Promotion; TV & Digital Marketing Campaign Start; PR, Talk Shows, & Final Push for TV/Digital Campaign; OWBO $$$ Results
Film tracking impacts ~$900M of “remaining” marketing spend for 2012’s top 100 movies
IBM engaged with a major movie studio to build a box office prediction model based on online audience behaviors
Evaluate models for accuracy
Train models based on data from 200+ movies
Collect data & determine predictive power
Online presence
• Twitter Volume
• Twitter Sentiment
• FB Likes, New Likes
• FB PTAT
• Rotten Tomatoes
• Press Volume
Movie Characteristics
• # of Theatres
• Movie Size
• Genre
• Studio
• Seasonality
• Rating
Week 1 Model
Week 4 Model
Week 8 Model
IBM Predictive Analytics
Is there a predictive relationship between social data & weekend box office?
Which variables seem to be the strongest predictors of weekend box office?
How accurately are we able to forecast box office? What types of movie have higher/lower forecast accuracy?
How can we improve our forecast accuracy?
There are relationships between social signals and box office sales; in particular, Twitter volume and negative sentiment seem to have a strong correlation with actual weekend box office results
Weekend Box Office Performance vs. Twitter Variables
[Chart: Indexed Twitter Volume, Indexed Box Office Performance, and Indexed Twitter Negative Sentiment, by Month]
We achieved high levels of model fit and forecast accuracy up to 8 weeks out, when marketing campaigns can still be changed
Model vs. Forecast Accuracy over Release Period
                             Week 8 Model   Week 4 Model   Week 1 Model
Average % Prediction Error   +/-25.8%       +/-25.4%       +/-25.7%
Average $ Prediction Error   $5.2M          $4.9M          $5.3M
% Overpredicted Results      60%            60%            52%
[Timeline: Week 8 Model, Week 4 Model, Week 1 Model, Opening Weekend]
Week 1 Model Results
Model Metrics Summary
Model Accuracy 88.4%
Forecast Accuracy 73%
Average % Prediction Error +/-25.7%
Average $ Prediction Error +/-$5.3M
% Overpredicted Results 52%
[Chart: Predicted Opening vs. Actual Opening. Axes: Model Predicted Box Office, Opening Weekend Box Office. Reference lines: Ideal Prediction, 30% Error Margin]
[Chart: Breakdown of % Prediction Error. Axis: Number of Predictions. Regions: Underpredicted, Overpredicted, 30% Error Margin]
[Chart: Relative Variable Significance]
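The summary metrics on this slide can be computed from paired predicted and actual opening-weekend grosses. The following is a minimal sketch, not the studio model's scoring code. It assumes definitions the slide does not spell out: % error is (predicted - actual) / actual, a prediction counts as accurate when it falls within the 30% error margin, and "overpredicted" means predicted > actual. The function name and the sample figures are hypothetical.

```python
def summarize_forecasts(predicted, actual, margin=0.30):
    """Summarize forecast quality for paired predicted/actual grosses ($M)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    # Signed relative error per movie: positive means overpredicted.
    pct_errors = [(p - a) / a for p, a in zip(predicted, actual)]
    dollar_errors = [abs(p - a) for p, a in zip(predicted, actual)]
    n = len(pct_errors)
    return {
        "avg_pct_error": sum(abs(e) for e in pct_errors) / n,
        "avg_dollar_error": sum(dollar_errors) / n,
        "share_overpredicted": sum(e > 0 for e in pct_errors) / n,
        # Assumed meaning of "forecast accuracy": share within the margin.
        "share_within_margin": sum(abs(e) <= margin for e in pct_errors) / n,
    }


# Hypothetical figures in $M, for illustration only
predicted = [55.0, 12.0, 30.0, 8.0]
actual = [50.0, 10.0, 40.0, 16.0]
summary = summarize_forecasts(predicted, actual)
```

With these made-up inputs, three of the four predictions fall inside the 30% margin and two are overpredictions.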
Benchmarking Prediction Error: Traditional Tracking vs. IBM Model
Each source column shows $ Error (M) / % Error.
New Release           Actual Opening ($M)  Major US Studio   BoxOffice.com     LA Times          IBM
Fast and Furious 6    $97.0                -$32.0 / -33%     +$10.0 / +10%     +$3.0 / +3%       +$10.8 / +11%
Hangover Part III     $53.0                -$8.0 / -15%      +$16.0 / +30%     +$15.0 / +28%     +$4.8 / +9%
After Earth           $27.0                +$7.5 / +28%      +$9.0 / +33%      +$6.0 / +22%      +$2.7 / +10%
Now You See Me        $29.0                -$11.0 / -38%     -$6.0 / -21%      -$12.0 / -41%     -$0.2 / -1%
The Internship        $18.0                -$3.0 / -17%      +$3.0 / +17%      -$3.0 / -17%      +$0.1 / +1%
The Purge             $34.0                -$19.0 / -56%     -$18.0 / -53%     -$9.0 / -26%      +$2.1 / +6%
Man of Steel          $116.6               -$16.6 / -14%     -$1.6 / -1%       -$21.6 / -19%     -$3.6 / -3%
Monsters University   $82.4                +$4.6 / +6%       -$4.4 / -5%       -$2.4 / -3%       -$23.6 / -29%
World War Z           $66.0                -$13.5 / -20%     -$21.0 / -32%     -$11.0 / -17%     -$6.5 / -10%
The Great Gatsby      $50.1                N/A               -$5.1 / -10%      -$8.1 / -16%      +$3.0 / +6%
Our approach resulted in the highest prediction accuracy vs. current industry benchmarks
Case in point: The IBM model gave the most accurate prediction compared to various industry tracking sources for 7 out of 10 recent releases (summer 2013)
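The "7 out of 10" figure can be checked directly from the % Error columns of the benchmarking table. The sketch below transcribes those values and picks, per release, the source with the smallest absolute % error. The variable and function names are illustrative, and "most accurate" is assumed to mean smallest absolute % error.

```python
# Signed % errors transcribed from the benchmarking table; None marks N/A.
SOURCES = ["Major US Studio", "BoxOffice.com", "LA Times", "IBM"]
PCT_ERRORS = {
    "Fast and Furious 6": [-33, 10, 3, 11],
    "Hangover Part III": [-15, 30, 28, 9],
    "After Earth": [28, 33, 22, 10],
    "Now You See Me": [-38, -21, -41, -1],
    "The Internship": [-17, 17, -17, 1],
    "The Purge": [-56, -53, -26, 6],
    "Man of Steel": [-14, -1, -19, -3],
    "Monsters University": [6, -5, -3, -29],
    "World War Z": [-20, -32, -17, -10],
    "The Great Gatsby": [None, -10, -16, 6],
}


def most_accurate(errors):
    """Return the source with the smallest absolute % error, skipping N/A."""
    scored = [(abs(e), s) for e, s in zip(errors, SOURCES) if e is not None]
    return min(scored)[1]


winners = {movie: most_accurate(errs) for movie, errs in PCT_ERRORS.items()}
ibm_wins = sum(1 for w in winners.values() if w == "IBM")
print(ibm_wins, "of", len(winners))  # 7 of 10
```

The three releases where another source was closer are Fast and Furious 6 and Monsters University (LA Times) and Man of Steel (BoxOffice.com).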
Most Accurate Prediction
Action and animated films are the most accurately predicted film genres
Movie Genre by Prediction Error
Summary Stats by Movie Genre
Genre             % Accurate Predictions   % Average Prediction Error   $ Average Prediction Error
Action            81%                      20%                          6.8M
Animated          76%                      24%                          5.1M
Comedy            71%                      27%                          4.3M
Drama/Romance     68%                      34%                          5.1M
Thriller/Horror   59%                      27%                          3.5M
Movie Genre distribution by Movie Size
Our model predicted XL and L movies very accurately.
Analysis of genre distribution by movie size revealed that XL and L movies have a high aggregate proportion of action plus animated movie releases, the two best predicted genres.
[Chart labels (#): 12, 21, 59, 90]
[Chart labels (#): 63, 21, 42, 34, 22]
Action and animated releases have the lowest % error
Drama/Romance genre has the highest proportion of results with 50+% prediction error
Fall and summer release films are more accurately predicted compared to other seasons and holiday releases
Release Period by Prediction Error
Summary Stats by Movie Release Period
Release Period   % Accurate Predictions   % Average Prediction Error   $ Average Prediction Error
Fall             90%                      16%                          4.3M
Summer           83%                      22%                          6.9M
Holiday          71%                      31%                          3.5M
Spring           68%                      27%                          5.5M
Winter           61%                      36%                          4.4M
Late Summer      53%                      29%                          3.9M
Release Period Distribution by Movie Size
Our model predicted XL and L movies very accurately.
Analysis of release period distribution by movie size revealed that XL and L movies have a high aggregate proportion of summer plus fall movie releases, the two best predicted movie release periods.
[Chart labels (#): 21, 48, 17, 63, 18, 15]
[Chart labels (#): 12, 21, 59, 90]
No fall releases had 50+% prediction error
Fall and summer releases have the lowest % error
L and XL films are very accurately predicted, whereas S and M films are very inaccurately predicted
Movie Size by Prediction Error
[Chart labels (#): 12, 21, 59, 90]
Summary Stats by Movie Size
Movie Size   % Accurate Predictions   % Average Prediction Error   $ Average Prediction Error
XL           100%                     14%                          18.3M
L            95%                      10%                          5.8M
M            76%                      21%                          5.4M
S            62%                      34%                          3.4M
Since XL and L films are larger in revenue, the observed higher $ prediction error still translates to a lower % error.
Some S and M size movies had 50+% prediction errors
20 Worst Predicted Movies
Movie                                  Size   % Error
ZOOKEEPER                              M      52
RESIDENT EVIL: RETRIBUTION             M      53
LUCKY ONE                              M      54
WARM BODIES                            M      60
WAR HORSE                              S      67
DEAD MAN DOWN                          S      67
THE LAST STAND                         S      75
WHAT TO EXPECT WHEN YOU'RE EXPECTING   S      76
THE LAST EXORCISM PART II              S      76
MISSION IMPOSSIBLE: GHOST PROTOCOL     S      -77
A THOUSAND WORDS                       S      86
MAN ON A LEDGE                         S      86
PREMIUM RUSH                           S      87
PLAYING FOR KEEPS                      S      91
SAFE HAVEN                             M      95
BULLET TO THE HEAD                     S      109
BEAUTIFUL CREATURES                    S      112
MOVIE 43                               S      140
MONSTERS INC 3D                        S      -161
KP3D                                   S      219
The worst 20 predictions all had 50+% prediction error and were only S or M size movies
Since predictive modeling is an iterative process, our next step is to improve forecast accuracy
Hypothesis: We hypothesized that adding YouTube variable data could improve prediction accuracy.
Case in point: We added YouTube trailer data for a subset of 74 movies. The trailer data is the number of views of the top-viewed trailer for each movie, as found by a search on YouTube. Adding this variable increased the share of accurate predictions by 13 percentage points.
Week 1 Results without Trailer Data: Forecast Accuracy 72%
Week 1 Results with Trailer Data: Forecast Accuracy 85%
[Charts: Number of Predictions by prediction error, with the 30% Error Margin marked, without and with trailer data]
Our technical approach is to extract/integrate movie audience behaviors then build a predictive model to represent a target outcome
Data types: Unstructured, Structured
Data inputs:
Facebook: Time, Total Likes, New Likes, PTAT (30 days)
Twitter: Volume, Sentiment (30 days)
Press Volume (30 days)
Rotten Tomatoes Score
# of Theaters
Movie Size
Genre
Studio
Release Period
Holiday Weekend
Rating
Pipeline components: Data Warehouse, Data Query, SPSS (PTA Model), Data Visualization (UI Portal, Display Widgets)
1. Load & cleanse data into tables for analysis
2. SPSS Auto Data Prep identifies the most important variables and transforms them to improve model accuracy
3. SPSS Auto Classifier builds the Ensemble Model for Per Theater Average Prediction, composed of the average of the top 3 most accurate predictive algorithms, resulting in improved accuracy overall
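The ensemble idea in step 3, averaging the top 3 most accurate algorithms, can be illustrated outside SPSS. This is a sketch of the general technique only, not SPSS Modeler's implementation. The model names, validation errors, and per-theater-average (PTA) predictions are hypothetical, and "most accurate" is assumed to mean lowest error on held-out data.

```python
def top_k_average(predictions, validation_error, k=3):
    """Average the predictions of the k models with the lowest validation error.

    predictions: dict of model name -> list of predicted values
    validation_error: dict of model name -> error on held-out data
    """
    if k < 1 or k > len(predictions):
        raise ValueError("k must be between 1 and the number of models")
    # Rank models from most to least accurate and keep the best k.
    best = sorted(predictions, key=lambda name: validation_error[name])[:k]
    # Walk the selected models' predictions movie by movie and average them.
    columns = zip(*(predictions[name] for name in best))
    return best, [sum(col) / k for col in columns]


# Hypothetical PTA predictions in $ for two movies
predictions = {
    "linear_regression": [9000.0, 4000.0],
    "decision_tree": [12000.0, 5500.0],
    "neural_net": [10500.0, 4600.0],
    "knn": [15000.0, 7000.0],
}
validation_error = {
    "linear_regression": 0.24,
    "decision_tree": 0.27,
    "neural_net": 0.22,
    "knn": 0.35,
}
chosen, ensemble_pta = top_k_average(predictions, validation_error)
```

Here the least accurate candidate (the hypothetical "knn") is left out, and each movie's ensemble prediction is the mean of the remaining three.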
Intent to watch extracted from social buzz does not equate to positive sentiment
Intent to watch a movie is extracted from Tweets like the following:
“Really debating to skip this class to watch this movie #Argo”
[Charts: Extracted Intent to Watch for Life of Pi. Axis: Weeks before Opening Weekend]
From the graph, we can see that the trend of intent to watch is not the same as the trend of positive sentiment
Tracking the %Audience Intent by week for different movies could enable better prediction of movie relative performance
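A rules-based pass is one simple way to flag intent to watch in a tweet, separately from any sentiment score. The sketch below is illustrative only: the phrase patterns are hypothetical and are not the rules used in the engagement, which the slides do not list.

```python
import re

# Hypothetical phrases that signal an intention to see a movie.
INTENT_PATTERN = re.compile(
    r"\b(?:wants? to|going to|gonna|can't wait to|plan(?:ning)? to|"
    r"debating .{0,40}? to) (?:watch|see)\b",
    re.IGNORECASE,
)
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_intent(tweet):
    """Return (has_intent, hashtags) for a single tweet."""
    has_intent = INTENT_PATTERN.search(tweet) is not None
    return has_intent, HASHTAG_PATTERN.findall(tweet)


tweet = "Really debating to skip this class to watch this movie #Argo"
print(extract_intent(tweet))  # (True, ['Argo'])
```

The example tweet from the slide is flagged as intent even though it carries no clearly positive wording, which is the distinction the slide draws between intent and sentiment.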
We see that a movie’s net sentiment polarity is correlated to its profitability
[Chart: Sentiment Polarity vs. Net Movie Profits. Axes: Estimated Net Profit ($M); Polarity of Net Sentiment (normalized), ranging from "Only negative sentiment" to "Only positive sentiment"]
Key: Bubble Color = movie genre (Drama, Comedy, Romantic Comedy, Thriller, Animated, Family/Romance, Family, Action/Drama, Action)
Key: Bubble Size = Production Budget ($0 to 10M, $10+ to 35M, $35+ to 60M, $60+ to 100M, $100+ to 200M, $200+M)
Formula: Net Sentiment Polarity = Normalized(Positive Tweet Volume - Negative Tweet Volume)
Formula: Net Profit = Gross Revenue - Production Budget - Marketing Budget (est. as ½ production budget)
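The two formulas on this slide translate directly into code. In the sketch below, the normalization is an assumption: the slide only says "Normalized", so the difference is divided by total tweet volume, which gives -1 for only negative sentiment and +1 for only positive sentiment. Function names and sample numbers are hypothetical.

```python
def net_sentiment_polarity(positive_tweets, negative_tweets):
    """(positive - negative) divided by total volume, in the range [-1, 1].

    Assumption: dividing by the total is one common normalization; the
    slide does not say which one was used.
    """
    total = positive_tweets + negative_tweets
    if total == 0:
        return 0.0
    return (positive_tweets - negative_tweets) / total


def estimated_net_profit(gross_revenue, production_budget):
    """Gross revenue minus production budget minus marketing budget,
    with marketing estimated as half the production budget (per the slide)."""
    marketing_budget = 0.5 * production_budget
    return gross_revenue - production_budget - marketing_budget


# Hypothetical movie: 8,000 positive and 2,000 negative tweets,
# $150M gross revenue on a $60M production budget
polarity = net_sentiment_polarity(8000, 2000)  # 0.6
profit = estimated_net_profit(150.0, 60.0)     # 60.0 ($M)
```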
Mapping differences in sentiment across geographical regions can enable location-specific marketing campaigns
[Maps: Negative Sentiment and Positive Sentiment by region for Argo and Life of Pi. Scale: # of Tweets (<10, 10-25, 25-50, 50-100, 100+)]
Target Areas: With geo-targeting we can identify areas that may have less of a fan base, shown by either less positive sentiment or more negative sentiment.
Target Area: Life of Pi received significantly more negative tweets in Mid-US and New-England
Target Area: Life of Pi received significantly less positive sentiment in the Southeast and Maine
Our technical approach was to extract sentiment & intent as well as build audience segment attributes from millions of Twitter postings
Data sources: Social Media, CRM data
Data types: Unstructured, Structured
Pipeline components: Streams Processing, Text Analytics, Rules Engine, Entity Analytics, Big Data Advanced Analytics Warehouse, Data Visualization (UI Portal, Display Widget)
Outputs: Individual Profiles, Intent, Sentiment, Micro-Segments
1. Extract intent to watch and sentiment from social data
2. Create audience micro-segments sliced by attribute data (intent, sentiment, CRM)
3. Apply context-based Entity analytics to match user profiles from varying data sources to create a single audience profile. Each instance of data associated with one user is assigned the same ID in the database to associate it to that profile.
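The profile-matching step, where every record belonging to one user receives the same ID, can be sketched with a small union-find. This is a simplification and an assumption: it links records only on an exact match of a normalized email or handle, whereas the slide describes context-based matching, which is not implemented here. The record fields and sample records are hypothetical.

```python
def assign_profile_ids(records):
    """Give records that share a normalized email or handle the same profile ID.

    records: list of dicts with optional "email" and "handle" keys.
    Returns a list of profile IDs aligned with the input order.
    """
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Keep the smaller index as the root so IDs are stable.
            parent[max(root_i, root_j)] = min(root_i, root_j)

    first_seen = {}  # (field, normalized value) -> index of first record
    for index, record in enumerate(records):
        for field in ("email", "handle"):
            value = record.get(field)
            if not value:
                continue
            key = (field, value.strip().lower().lstrip("@"))
            if key in first_seen:
                union(index, first_seen[key])
            else:
                first_seen[key] = index

    return [find(i) for i in range(len(records))]


# Hypothetical records from three sources
records = [
    {"source": "crm", "email": "Pat@example.com"},
    {"source": "twitter", "handle": "@pat_movies"},
    {"source": "loyalty", "email": "pat@example.com", "handle": "pat_movies"},
    {"source": "twitter", "handle": "@someone_else"},
]
profile_ids = assign_profile_ids(records)
print(profile_ids)  # [0, 0, 0, 3]
```

The third record bridges the CRM email and the Twitter handle, so the first three records collapse into one profile while the fourth stays separate.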
Where are you in the analytics journey?
[Chart axes: maturity, value]
Information Integration: Capture and consolidate disparate data about consumers across touch points for 1 version of the truth
Customer Insight: Analyze historical consumer purchase behavior, preferences, motivations and interactions
Predictive Modeling: Uncover hidden patterns and associations within consumer data to predict what they are likely to do next
Personalized Communication: Understand the optimal offer, time and channel that is best for each individual consumer
Real-Time Decisioning: Deliver customized interactions at the point of impact & consistent experiences across all channels
Deliver Smarter Customer Experiences
Big Data Videos: telling the analytics driven media story
From Audiences to Individuals: Delivering Smarter Customer Experiences
Enabling Marketers To Do More With Less Using Data Driven Ad Targeting
How Audience Measurement Is Changing The Model For Marketers & Advertisers
Thank you!
Connect with me:
@graemeknows
IBMBigDataHub.com
AnalyzingMedia.com