Humanizing the Machine - by Lukas Biewald
TRANSCRIPT
About Lukas
Lukas is the CEO at CrowdFlower.
Following his graduation from Stanford University with a B.S. in Mathematics and an M.S. in Computer Science, Lukas led the Search Relevance Team for Yahoo! Japan. He then worked as a senior data scientist on the Ranking and Management Team at Powerset, Inc., acquired by Microsoft in 2008.
@l2k
https://www.linkedin.com/in/lbiewald
The Effect of Better Algorithms
[Chart: Classifier Error Rate on Real World Data, by algorithm: Naïve Bayes, Maximum Entropy, SVM]
Active Semi-Supervised Learning for Improving Word Alignment
(Vamshi ACL ’10)
The Effect of Better Features
[Chart: Classifier Error Rate, by feature set: Unigrams, Bigrams, Unigrams+Bigrams]
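The three feature sets named on this slide (unigrams, bigrams, and both combined) can be illustrated with a minimal sketch. It assumes a simple lowercase whitespace tokenizer and raw counts as feature values; the function names are hypothetical and are not taken from the talk, and the slide does not say how its features were actually extracted.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real pipeline
    # would also handle punctuation, casing rules, and so on.
    return text.lower().split()


def unigrams(tokens):
    # One feature per distinct word, valued by its count.
    return Counter(tokens)


def bigrams(tokens):
    # One feature per distinct pair of adjacent words.
    return Counter(zip(tokens, tokens[1:]))


def unigrams_plus_bigrams(tokens):
    # Union of both feature sets; keys are strings (unigrams)
    # and 2-tuples (bigrams), so they cannot collide.
    features = Counter(tokens)
    features.update(zip(tokens, tokens[1:]))
    return features


tokens = tokenize("not good not bad")
print(unigrams(tokens))
print(bigrams(tokens))
print(unigrams_plus_bigrams(tokens))
```

In this toy sentence the bigram features keep "not good" and "not bad" distinct, information that the unigram counts alone discard.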
The Effect of Cleaner Data
#RICHDATA
[Chart: Classifier Error Rate, by training data accuracy: 90% Accurate Data, 95% Accurate Data, 100% Accurate Data]
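One way to picture "90% accurate data" is a label set in which a fixed share of labels is wrong. The sketch below is an assumption for illustration only: it flips binary labels at random to reach a target accuracy, and the helper name `corrupt_labels` is hypothetical. The slide does not describe how its data sets were actually produced.

```python
import random


def corrupt_labels(labels, accuracy, seed=0):
    # Assumption: binary 0/1 labels. Flip exactly round(n * (1 - accuracy))
    # randomly chosen labels so the result matches the target accuracy.
    rng = random.Random(seed)
    n = len(labels)
    n_wrong = round(n * (1 - accuracy))
    wrong = set(rng.sample(range(n), n_wrong))
    return [1 - y if i in wrong else y for i, y in enumerate(labels)]


clean = [0, 1] * 50  # 100 illustrative labels, not real data
for accuracy in (0.90, 0.95, 1.00):
    noisy = corrupt_labels(clean, accuracy)
    errors = sum(a != b for a, b in zip(clean, noisy))
    print(accuracy, errors)
```

Training the same classifier on each noisy copy and scoring it against clean held-out labels would give a comparison of the kind the chart shows.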
Where Do Data Scientists Spend Their Time?
Source: CrowdFlower Data Science Report 2015
Use Case: Video Game Launch and Ongoing Sentiment Analysis
• Ongoing analysis of online conversation sentiment about client gaming product(s) and brand (across Twitter, Forums, Blogs, Facebook, etc.)
• Allows client to make decisions about product release dates, specifically feature releases
• Provides client with insights to prioritize issues, specifically surfaced around popular features that drive high volume of conversation and emotion within the gaming community
• Results in a satisfied/happy gaming community; turns detractors into promoters and improves
Use Case: Analyzing Purchase Intent (Auto Brand)
• Monitor consumer emotion across the purchase funnel by specific model:
o Awareness
o Consideration
o Intent
o Advocacy
• Analyze how marketing communications have shifted perceptions as they relate to the purchase funnel
• Identify which messages resonate with buyers and amplify positive purchase intent messages
Use Case: Journalist Outreach
• Analyzed journalist social content to understand which conversations they were receptive to, who influenced their feed, what publications they read, and how this related to specific topics of interest to our client’s brand
• Tailored the client's PR messaging to these journalists based on their content history