Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16
TRANSCRIPT
![Page 1: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/1.jpg)
When Recommendation Systems Go Bad
Evan Estola
5/20/16
![Page 3: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/3.jpg)
![Page 4: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/4.jpg)
![Page 5: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/5.jpg)
![Page 6: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/6.jpg)
![Page 7: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/7.jpg)
We want a world full of real, local community.
Women’s Veterans Meetup, San Antonio, TX
![Page 8: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/8.jpg)
![Page 9: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/9.jpg)
![Page 10: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/10.jpg)
![Page 11: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/11.jpg)
![Page 12: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/12.jpg)
Recommendation Systems: Collaborative Filtering
![Page 13: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/13.jpg)
![Page 14: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/14.jpg)
Recommendation Systems: Rating Prediction
● Netflix prize
● How many stars would user X give movie Y
● Boring
![Page 15: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/15.jpg)
Recommendation Systems: Learning To Rank
● Active area of research
● Use ML model to solve a ranking problem
● Pointwise: Logistic Regression on binary label, use output for ranking
● Listwise: Optimize entire list
● Performance Metrics
○ Mean Average Precision
○ P@K
○ Discounted Cumulative Gain
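
The three metrics named on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function names are made up, relevance labels are assumed to be 0/1 in ranked order, Average Precision is normalized by the relevant items that appear in the list, and DCG uses the common log2(rank + 1) discount.

```python
import math

def precision_at_k(relevance, k):
    """P@K: fraction of the top-k ranked items that are relevant (labels are 0/1)."""
    return sum(relevance[:k]) / k

def average_precision(relevance):
    """Mean of P@k taken at every rank k that holds a relevant item."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """MAP: average precision averaged over several ranked lists (e.g. one per user)."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)

def dcg_at_k(gains, k):
    """Discounted Cumulative Gain: graded gains discounted by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))
```

For a ranked list with labels `[1, 0, 1, 0]`, `precision_at_k(..., 2)` is 0.5 and `average_precision(...)` is (1/1 + 2/3) / 2.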
![Page 16: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/16.jpg)
![Page 17: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/17.jpg)
Data Science impacts lives
● Ads you see
● Apps you download
● Friend’s Activity/Facebook feed
● News you’re exposed to
● If a product is available
● If you can get a ride
● Price you pay for things
● Admittance into college
● Job openings you find/get
● If you can get a loan
![Page 18: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/18.jpg)
![Page 19: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/19.jpg)
You just wanted a kitchen scale, now Amazon thinks you’re a drug dealer
![Page 20: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/20.jpg)
![Page 21: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/21.jpg)
![Page 22: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/22.jpg)
Ego
● Member/customer/user first
● Focus on building the best product, not on being the most clever data scientist
● Much harder to spin a positive user story than a story about how smart you are
![Page 23: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/23.jpg)
“Black-sounding” names 25% more likely to be served ad suggesting criminal record
![Page 24: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/24.jpg)
Ethics
We have accepted that Machine Learning can seem creepy. How do we prevent it from becoming immoral?
We have an ethical obligation to not teach machines to be prejudiced.
![Page 25: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/25.jpg)
Data Ethics
● Awareness
○ Tell your friends
○ Tell your coworkers
○ Tell your boss
● Identify groups that could be negatively impacted by your work
● Make a choice
● Take a stand
![Page 26: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/26.jpg)
![Page 27: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/27.jpg)
Interpretable Models
For simple problems, simple solutions are often worth a small concession in performance
Inspectable models make it easier to debug problems in data collection, feature engineering, etc.
Only include features that work the way you want
Don’t include feature interactions that you don’t want
![Page 28: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/28.jpg)
Logistic Regression
StraightDistanceFeature(-0.0311f),
ChapterZipScore(0.0250f),
RsvpCountFeature(0.0207f),
AgeUnmatchFeature(-1.5876f),
GenderUnmatchFeature(-3.0459f),
StateMatchFeature(0.4931f),
CountryMatchFeature(0.5735f),
FacebookFriendsFeature(1.9617f),
SecondDegreeFacebookFriendsFeature(0.1594f),
ApproxAgeUnmatchFeature(-0.2986f),
SensitiveUnmatchFeature(-0.1937f),
KeywordTopicScoreFeatureNoSuppressed(4.2432f),
TopicScoreBucketFeatureNoSuppressed(1.4469f,0.257f,10f),
TopicScoreBucketFeatureSuppressed(0.2595f,0.099f,10f),
ExtendedTopicsBucketFeatureNoSuppressed(1.6203f,1.091f,10f),
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.1702f,0.252f,0.641f),
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.4983f,0.641f,10f),
DoneChapterTopicsFeatureNoSuppressed(3.3367f)
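
One way to see why a model like this is inspectable: each feature contributes weight × value to the logit, so a single prediction can be broken down feature by feature. The sketch below is not Meetup's code. It reads the first number in each entry on the slide as a learned weight (an assumption), uses a handful of them, assumes a zero bias term, and makes up the feature values.

```python
import math

# Weights copied from the slide; treating them as logistic regression
# coefficients is an assumption, and the zero bias is hypothetical.
WEIGHTS = {
    "GenderUnmatchFeature": -3.0459,
    "AgeUnmatchFeature": -1.5876,
    "FacebookFriendsFeature": 1.9617,
    "StateMatchFeature": 0.4931,
    "CountryMatchFeature": 0.5735,
}

def score(features, weights=WEIGHTS, bias=0.0):
    """Return (probability, per-feature contributions) for one member/group pair."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    logit = bias + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-logit)), contributions

# Hypothetical feature values for one member/group pair.
example = {
    "FacebookFriendsFeature": 1.0,
    "StateMatchFeature": 1.0,
    "CountryMatchFeature": 1.0,
}
prob, parts = score(example)
for name, value in sorted(parts.items(), key=lambda item: -abs(item[1])):
    print(f"{name:28s} {value:+.4f}")
```

Sorting the contributions by magnitude shows at a glance which feature is driving a recommendation, which is the debugging benefit the previous slide describes.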
![Page 29: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/29.jpg)
Feature Engineering and Interactions
● Good Feature:
○ Join! You’re interested in Tech x Meetup is about Tech
● Good Feature:
○ Don’t join! Group is intended only for Women x You are a Man
● Bad Feature:
○ Don’t join! Group is mostly Men x You are a Woman
● Horrible Feature:
○ Don’t join! Meetup is about Tech x You are a Woman
Meetup is not interested in propagating gender stereotypes
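
With a linear model, interactions exist only if someone writes them down, so the good/bad distinction above can be enforced with an explicit allowlist. A minimal sketch, with hypothetical signal names and a hypothetical `ALLOWED_CROSSES` list:

```python
# Hypothetical allowlist: only crosses that were reviewed and approved.
ALLOWED_CROSSES = [
    ("member_interested_in_tech", "group_about_tech"),  # good feature
    ("group_women_only", "member_is_man"),              # good feature
]

def build_cross_features(raw):
    """Build only the allowlisted pairwise crosses from raw 0/1 signals."""
    features = {}
    for left, right in ALLOWED_CROSSES:
        features[f"{left}_x_{right}"] = raw.get(left, 0.0) * raw.get(right, 0.0)
    return features

# Hypothetical member/group pair: a woman interested in tech, a tech group.
raw = {
    "member_interested_in_tech": 1.0,
    "group_about_tech": 1.0,
    "member_is_woman": 1.0,
}
crosses = build_cross_features(raw)
```

Because "Meetup is about Tech x You are a Woman" is not in the list, no such feature reaches the model, whatever the training data would have rewarded.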
![Page 30: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/30.jpg)
Ensemble Models and Data Segregation
Ensemble Models: Combine outputs of several classifiers for increased accuracy
If you have features that are useful but you’re worried about interactions (and your model creates them automatically), use ensemble modeling to restrict those features to separate models.
![Page 31: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/31.jpg)
Ensemble Model, Data Segregation
● Data: Interests, Searches, Friends, Location → Model1 Prediction
● Data: Gender, Friends, Location → Model2 Prediction
● Data: Model1 Prediction, Model2 Prediction → Final Prediction
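
The diagram can be sketched as two sub-models that never see each other's inputs, plus a final model that sees only their outputs. Everything below is illustrative: the weights are invented, each sub-model is a hand-written logistic function rather than a trained classifier, and only a subset of the slide's inputs is used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# All weights below are hypothetical.
def model1(interest_match, friends_in_group):
    """Sees interests and friends, never gender."""
    return sigmoid(2.0 * interest_match + 1.0 * friends_in_group - 1.5)

def model2(gender_unmatch, friends_in_group):
    """Sees gender and friends, never interests."""
    return sigmoid(-1.0 * gender_unmatch + 1.0 * friends_in_group - 0.5)

def final_logit(p1, p2):
    """The final model is linear in the two sub-model predictions."""
    return 3.0 * p1 + 1.0 * p2 - 2.0

def final_prediction(interest_match, gender_unmatch, friends_in_group):
    p1 = model1(interest_match, friends_in_group)
    p2 = model2(gender_unmatch, friends_in_group)
    return sigmoid(final_logit(p1, p2))

def gender_effect(interest_match, friends_in_group):
    """Change in the final logit when gender_unmatch flips from 0 to 1."""
    p1 = model1(interest_match, friends_in_group)
    return (final_logit(p1, model2(1, friends_in_group))
            - final_logit(p1, model2(0, friends_in_group)))
```

In this sketch `gender_effect(0, 1)` and `gender_effect(1, 1)` are equal: on the logit scale, the gender signal shifts the score by the same amount whether or not the member's interests match, so the ensemble cannot learn a gender x interest interaction. This holds only because the final combiner is linear; a combiner that builds its own interactions would need the same scrutiny.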
![Page 32: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/32.jpg)
● Fake profiles, track ads
● Career coaching for “200k+” Executive jobs Ad
● Male group: 1852 impressions
● Female group: 318
![Page 33: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/33.jpg)
Diversity Controlled Testing
● CMU - AdFisher
○ Crawls ads with simulated user profiles
● Same technique can work to find bias in your own models!
○ Generate Test Data: randomize sensitive feature in real data set
○ Run Model
○ Evaluate for unacceptable biased treatment
● Must identify what features are sensitive and what outcomes are unwanted
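
The three steps (generate test data, run model, evaluate) can be sketched as below. This is not AdFisher, and none of it comes from the talk: the helper names, the two toy models, the `gender` and `tech_interest` fields, and the gap measure are all hypothetical.

```python
import random

def randomize_sensitive(rows, feature, values, seed=0):
    """Copy each row and overwrite the sensitive feature with a random value."""
    rng = random.Random(seed)
    return [{**row, feature: rng.choice(values)} for row in rows]

def mean_score_by_group(model, rows, feature):
    """Average model score for each value of the sensitive feature."""
    totals, counts = {}, {}
    for row in rows:
        key = row[feature]
        totals[key] = totals.get(key, 0.0) + model(row)
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}

def max_gap(scores_by_group):
    """Largest difference in average score between any two groups."""
    values = list(scores_by_group.values())
    return max(values) - min(values)

# Two hypothetical models: one keys on gender, one ignores it.
def biased_model(row):
    return 0.8 if row["gender"] == "M" else 0.2

def fair_model(row):
    return 0.5 + 0.1 * row["tech_interest"]

# Stand-in for a real data set; gender is then randomized for the test.
rows = [{"gender": "F", "tech_interest": i % 2} for i in range(200)]
test_rows = randomize_sensitive(rows, "gender", ["F", "M"])

biased_gap = max_gap(mean_score_by_group(biased_model, test_rows, "gender"))
fair_gap = max_gap(mean_score_by_group(fair_model, test_rows, "gender"))
```

Since gender was assigned at random, any large gap between groups can only come from the model's use of that feature. What counts as an unacceptable gap is a product decision, as the last bullet notes.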
![Page 34: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/34.jpg)
![Page 35: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/35.jpg)
Tay.ai
● Twitter bot
● “Garbage in, garbage out”
● Responsibility?
“In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”
![Page 36: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/36.jpg)
Diverse test data
● Outliers can matter
● The real world is messy
● Some people will mess with you
● Some people look/act different than you
● Defense
● Diversity
● Design
![Page 37: Evan Estola, Lead Machine Learning Engineer, Meetup at MLconf SEA - 5/20/16](https://reader035.vdocuments.us/reader035/viewer/2022062502/58a517611a28ab8e1c8b6c93/html5/thumbnails/37.jpg)
You know racist computers are a bad idea
Don’t let your company invent racist computers