Intelligent Applications with Machine Learning Toolkits
Shawn Scully - VP of Customer Success & [email protected] @backwoodsbrains
Within 5 years, every innovative application will be intelligent.
Intelligent applications create tremendous value
…but take a lot of time & specialized skills to build.
Recommenders
Lead Scoring
Churn Prediction
Multi-channel Targeting
Auto-Summarization
Fraud Detection
Intrusion Detection
Demand Forecasting
Data Matching
Failure Prediction
Our mission is to accelerate innovators to create intelligent applications with agile machine learning.
Needs of an Agile ML Platform
Dato Predictive Services
GraphLab Create
rapid development
deploy as microservice
live serving, monitoring, & model management
iterate w/feedback
A toolkit view of the world
Algorithms vs. toolkits
Algorithms (e.g. SVD++ w/SGD vs. SVD)
• PhD students care a lot about these!
• many papers focused on “my curve is better than your curve”
• Not always the most practical…
Toolkits (e.g. Recommender: item similarity, SVD++, iALS, factorization machine, many more!)
• Grouped by a common task
• Focused on meaningful differences in data & problem
• Practical implementations
import graphlab as gl

# Load the ratings data and train a recommender.
data = gl.SFrame.read_csv('my_data.csv')
model = gl.recommender.create(
    data,
    user_id='user',
    item_id='movie',
    target='rating')
recommendations = model.recommend(k=5)

# Deploy the model as a live service.
cluster = gl.deploy.load('s3://path')
cluster.add('servicename', model)
Easily create a live machine learning service
Create a Recommender
5 lines of code
Toolkit w/auto selection
Deploy in minutes
Dato Machine Learning Toolkits
Applications
• recommender
• sentiment_analysis
• similarity_search
• churn_predictor
• data_matching
• lead_scoring
• clickthrough_predictor
Fundamentals
• regression
• classifier
• nearest_neighbors
• clustering
• deeplearning
• anomaly_detection
• pattern_mining
• text_analytics
• graph_analytics
Utilities
• model_parameter_search
• cross_validation
• evaluation
• comparison
• feature_engineering
https://dato.com/products/create/docs/graphlab.toolkits.html
50+ models including factorization machines, convolutional neural nets, label propagation, & topic models all in one framework!
Toolkit: Recommender
Examples of Recommenders
Recommend
Value:
• Increase user engagement
• Sell more/increase clickthrough
• Create better user experiences
Goal: Find or recommend similar or related items.
Recommend - Data + Toolkit
user_id  item_id  item_name
103      1        'Empire Strikes Back'
102      2        'Wrath of Khan'
104      3        'Sleepless in Seattle'
102      4        'Rambo'
104      5        'Chocolate'
103      6        'The Avengers'
102      1        'Empire Strikes Back'
104      1        'Empire Strikes Back'
103      4        'Rambo'
104      7        'When Harry Met Sally'
102      2        'Wrath of Khan'
104      8        'Up'
recommender: graphlab.recommender.create
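To make the data shape concrete, here is a minimal sketch of an item-similarity recommender in plain Python, run on the (user_id, item_id) pairs from the table above. It is not GraphLab Create's implementation: the Jaccard similarity, the scoring rule, and the helper names (build_item_users, jaccard, recommend) are assumptions chosen for illustration.

```python
from collections import defaultdict

# (user_id, item_id) interactions copied from the slide's table.
interactions = [
    (103, 1), (102, 2), (104, 3), (102, 4), (104, 5), (103, 6),
    (102, 1), (104, 1), (103, 4), (104, 7), (102, 2), (104, 8),
]

def build_item_users(pairs):
    """Map each item to the set of users who interacted with it."""
    item_users = defaultdict(set)
    for user, item in pairs:
        item_users[item].add(user)
    return item_users

def jaccard(a, b):
    """Jaccard similarity between two sets of users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, pairs, k=5):
    """Rank unseen items by their summed similarity to the user's items."""
    item_users = build_item_users(pairs)
    seen = {item for u, item in pairs if u == user}
    scores = {}
    for candidate in item_users:
        if candidate in seen:
            continue
        scores[candidate] = sum(
            jaccard(item_users[candidate], item_users[s]) for s in seen
        )
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]

top = recommend(103, interactions, k=3)
```

For user 103, the top suggestion is item 2 ('Wrath of Khan'), because it shares a user with two of the items that 103 has already seen.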
Toolkit: Sentiment Analysis & Product Sentiment
Examples of sentiment scoring & summarization
Sentiment Analysis & Product Sentiment
Value:
• Quantitative measures from unstructured text
• Eliminate the need to read everything
• Summarize on aspects you care about
Goal: Score sentiment of a sentence, document, or aspect.
Sentiment scoring - Data + Toolkit
sentiment_analysis: graphlab.sentiment_analysis.create, graphlab.product_sentiment.create
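As a rough illustration of scoring a sentence and then summarizing by aspect, the sketch below uses a tiny hand-written word list. This is an assumption made for the example, not how the toolkit works: a trained model would learn its weights from labeled text. The word lists, the sample reviews, and the function names are all hypothetical.

```python
import re

# Hypothetical, tiny word lists; a real model would learn weights from data.
POSITIVE = {"great", "good", "love", "excellent", "fast"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment_score(text):
    """Return a score in [-1, 1]: (positives - negatives) / matched words."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0

def aspect_sentiment(sentences, aspect):
    """Average the scores of the sentences that mention the aspect."""
    hits = [sentiment_score(s) for s in sentences if aspect in tokenize(s)]
    return sum(hits) / len(hits) if hits else None

# Hypothetical review sentences.
reviews = [
    "The battery is great and charging is fast.",
    "Terrible screen, but I love the battery.",
    "Shipping was slow.",
]
battery = aspect_sentiment(reviews, "battery")
```

The aspect score averages whole sentences, so a sentence that mixes praise for one aspect with criticism of another is scored as neutral. That is a known limitation of this simple approach.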
Toolkit: Similarity Search
Examples of image search & tagging
Image Search & Tagging
Value:
• create more intuitive user experiences
• learn interesting things like style
• reduce manual processes (like tagging)
Goal: Find visually similar images.
Image search - Data + Toolkit
similarity_search: graphlab.data_matching.similarity_search.create
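One common way to find visually similar images is to represent each image as a feature vector and rank the others by cosine similarity. The sketch below assumes such vectors already exist. The three-dimensional vectors, the image names, and the most_similar helper are hypothetical, and the slides do not say which features or distance the toolkit actually uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query_id, features, k=2):
    """Return the k items closest to the query, most similar first."""
    query = features[query_id]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in features.items()
        if other != query_id
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]

# Hypothetical feature vectors; in practice these might come from a
# pretrained neural network applied to each image.
image_features = {
    "red_dress": [0.9, 0.1, 0.0],
    "pink_dress": [0.8, 0.2, 0.1],
    "blue_jeans": [0.0, 0.2, 0.9],
    "navy_jeans": [0.1, 0.1, 0.8],
}
neighbors = most_similar("red_dress", image_features, k=2)
```

This brute-force scan compares the query against every stored vector, which is fine for a small catalog. Larger collections usually need an index to avoid the full scan.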
Toolkit: Churn Predictor
Churn Prediction
Value:
• Keep your customers
• Optimize marketing/customer success spend
• Identify issues with product or business
Goal: Identify users that are likely to stop doing something (e.g. paying for your service, using a product feature, etc.)
Confidential - GraphLab internal use only
Problem setup
Period 1: Features
Period 2: Target
Period 3: Hold out set
Goal: model that predicts if a user does not appear in Period 2
Evaluation: score for (app, user) pairs absent in Period 3
Machine learning model, Evaluation
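The period-based setup can be sketched in a few lines: users active in one period are labeled as churned if they do not appear in the next. The events, the period boundaries, and the function names below are hypothetical, and the sketch tracks users only, not (app, user) pairs.

```python
from datetime import date

def users_in_period(events, start, end):
    """Users with at least one event in the half-open range [start, end)."""
    return {user for user, day in events if start <= day < end}

def churn_labels(events, p1_start, p2_start, p2_end):
    """Label each Period 1 user: True if absent in Period 2 (churned)."""
    active_p1 = users_in_period(events, p1_start, p2_start)
    active_p2 = users_in_period(events, p2_start, p2_end)
    return {user: user not in active_p2 for user in sorted(active_p1)}

# Hypothetical (user, date) events.
events = [
    ("a", date(2015, 1, 5)), ("a", date(2015, 2, 10)),
    ("b", date(2015, 1, 20)),
    ("c", date(2015, 2, 3)),
]

# Period 1 is January 2015 and Period 2 is February 2015.
labels = churn_labels(
    events, date(2015, 1, 1), date(2015, 2, 1), date(2015, 3, 1)
)
```

Calling the same function with the boundaries shifted forward by one period would give labels for the hold-out evaluation. User c is not labeled here because c was not active in Period 1.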
Data Transformations
Aggregate to generate predictive features
Input (opens, ordered by time): app, user, time, etc.
Output (unique pairs): app, user, feature1, feature2
Features:
● time since last use
● time since first use
● # unique days user has used app
● # times user used app in last delta days
● Rolling aggregates
● etc
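A minimal sketch of this aggregation, assuming the raw log is a list of (app, user, day) rows: it collapses the log into one feature row per (app, user) pair and computes four of the listed features. The feature names, the as_of reference date, and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def aggregate_features(opens, as_of, delta_days=7):
    """Collapse (app, user, day) rows into one feature row per (app, user)."""
    grouped = defaultdict(list)
    for app, user, day in opens:
        if day <= as_of:  # ignore anything after the reference date
            grouped[(app, user)].append(day)

    window_start = as_of - timedelta(days=delta_days)
    features = {}
    for key, days in grouped.items():
        features[key] = {
            "days_since_last_use": (as_of - max(days)).days,
            "days_since_first_use": (as_of - min(days)).days,
            "unique_days_used": len(set(days)),
            "uses_in_last_delta_days": sum(d > window_start for d in days),
        }
    return features

# Hypothetical app-open events.
opens = [
    ("mail", "u1", date(2015, 3, 1)),
    ("mail", "u1", date(2015, 3, 1)),
    ("mail", "u1", date(2015, 3, 28)),
    ("maps", "u1", date(2015, 3, 10)),
]
rows = aggregate_features(opens, as_of=date(2015, 3, 31), delta_days=7)
```

Filtering on as_of keeps events from the target period out of the features, which matters when the label is defined by what happens after that date.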
Predict Churn - Data + Toolkit
user_id  event     datetimestamp
103      play      '01-01-15'
102      click     '02-05-15'
102      visit     '03-06-15'
102      visit     '03-09-15'
103      purchase  '03-21-15'
103      click     '03-22-15'
102      click     '03-23-15'
103      click     '04-02-15'
103      play      '04-01-15'
103      purchase  '05-02-15'
103      play      '05-01-15'
103      play      '05-15-15'
churn_predictor: graphlab.churn_predictor.create
Toolkit: Data Matching
Examples of data matching
record = {'SSN': None, 'Name': 'Smith, Will', 'DOB': '1973.01.02', 'Sex': 'Male', 'ZIP': 94701}
Data Matching
Value:
• Deduplicate contacts/records
• “360 view” of customer across multiple properties
• Improve data quality
Goal: Identify entities & appropriately link records.
Data matching - Data + Toolkit
data_matching: graphlab.deduplication.create, graphlab.record_linker.create
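To illustrate what linking records involves, the sketch below scores how well two records agree, skipping fields that are missing in either one. It uses difflib from the Python standard library for fuzzy name comparison. The scoring rule, the helper names, and both candidate records are hypothetical, and the DOB is stored as a string for simplicity. This is not the toolkit's algorithm.

```python
from difflib import SequenceMatcher

def normalize_name(name):
    """Lowercase the name and reorder 'Last, First' to 'first last'."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.lower().split())

def match_score(a, b, fields=("SSN", "DOB", "Sex", "ZIP")):
    """Average field agreement, skipping fields missing in either record."""
    name_ratio = SequenceMatcher(
        None, normalize_name(a["Name"]), normalize_name(b["Name"])
    ).ratio()
    scores = [name_ratio]
    for field in fields:
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value is neither agreement nor disagreement
        scores.append(1.0 if a[field] == b[field] else 0.0)
    return sum(scores) / len(scores)

record = {"SSN": None, "Name": "Smith, Will", "DOB": "1973.01.02",
          "Sex": "Male", "ZIP": 94701}

# Hypothetical candidate records to compare against.
candidate_same = {"SSN": "000-00-0000", "Name": "Will Smith",
                  "DOB": "1973.01.02", "Sex": "Male", "ZIP": 94701}
candidate_other = {"SSN": None, "Name": "Jane Doe", "DOB": "1980.05.06",
                   "Sex": "Female", "ZIP": 10001}

same_score = match_score(record, candidate_same)
other_score = match_score(record, candidate_other)
```

Pairs scoring above a chosen threshold would be treated as the same entity. The threshold and the equal weighting of fields are design choices that depend on the data.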
More than 50,000 developers are using Dato
Tools built for innovators
The Agile Machine Learning Platform
Agility to create machine learning services
GraphLab Create Application Toolkits:
• Auto-select the best algorithm
• Auto-prepare the data for ML
• Task-oriented methods
Data Layer for ML
• Manipulate all relevant data types
• Out-of-core design eliminates scale pains
Robust Enterprise-Grade Algorithms
• 50+ best-practice & novel algorithms
• Robust to real-world data
Dato Predictive Services
• Real-time Recommendations
• Online Ad Scoring & Serving
• Transactional Fraud Detection
Agility to deploy: microservices on AWS, on premises, or YARN
How will you make your enterprise intelligent?
Thanks!
get the software!: https://www.dato.com/download/
platform overview: https://dato.com/products/
talk about ML at your company: [email protected]
Toolkits:
overview: https://dato.com/products/create/docs/graphlab.toolkits.html
recommender: https://dato.com/products/create/docs/graphlab.toolkits.recommender.html
churn_predictor: https://dato.com/products/create/docs/graphlab.toolkits.churn_predictor.html
similarity_search: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html#similarity-search-model
sentiment_analysis: https://dato.com/products/create/docs/graphlab.toolkits.sentiment_analysis.html
data_matching: https://dato.com/products/create/docs/graphlab.toolkits.data_matching.html