the other 99% of a data science project
![Page 1: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/1.jpg)
THE OTHER 99% OF A DATA SCIENCE PROJECT
Open Data Science Conference
Santa Clara | November 4-6th, 2016
Eugene Mandel
@eugmandel
![Page 2: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/2.jpg)
∎ @eugmandel
∎ lead of data science at Directly
∎ formerly:
□ data science team at Jawbone
□ co-founder of Qualaroo, Jaxtr
ABOUT ME
![Page 3: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/3.jpg)
DATA SCIENCE NEEDS PRODUCT MANAGEMENT
success of a data science project has as much to do with product management as with data science
![Page 4: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/4.jpg)
2 KINDS OF DATA SCIENCE
A: ANALYZE
B: BUILD
![Page 5: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/5.jpg)
PAY FOR PARKING WITH YOUR PHONE
![Page 6: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/6.jpg)
DON’T YOU KNOW ME?!
![Page 7: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/7.jpg)
∎ “don’t you know me?!” -> “you get me!”
∎ get smarter with every interaction
∎ reduce search space
SMART PRODUCTS
![Page 8: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/8.jpg)
SMART PRODUCTS
BUT NOT THAT SMART...
![Page 9: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/9.jpg)
SMART PRODUCTS GO PROBABILISTIC
![Page 10: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/10.jpg)
THE OTHER 99%
algorithms
![Page 11: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/11.jpg)
PARKING APP
ON DEMAND CUSTOMER SUPPORT
![Page 12: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/12.jpg)
LOOKING FOR OPPORTUNITIES
![Page 13: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/13.jpg)
PROBLEM: choose support tickets that expert users can resolve
![Page 14: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/14.jpg)
LOOKING FOR OPPORTUNITIES
![Page 15: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/15.jpg)
CHOOSE RESOLVABLE TICKETS WITH MACHINE LEARNING
![Page 16: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/16.jpg)
GETTING THE DATA
![Page 17: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/17.jpg)
GETTING ALLIES
![Page 18: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/18.jpg)
GETTING THE DATA
![Page 19: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/19.jpg)
CLEAN YOUR DATA
∎ Automated bug reports
∎ Surveys
∎ Bounced emails
∎ Internal tickets
∎ Email metadata
∎ Email threads
∎ ...
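The items on this slide read as kinds of records that land in a support inbox without being genuine customer questions. A minimal sketch of filtering them out before training, assuming each ticket is a dict with hypothetical `source` and `subject` fields; the field names and the matching rules are illustrative assumptions, not something the talk specifies:

```python
# Hypothetical ticket schema and rules; real ones depend on the helpdesk export.
NOISE_SOURCES = {"bug_reporter", "survey", "internal"}
BOUNCE_MARKERS = (
    "undeliverable",
    "delivery status notification",
    "mail delivery failed",
)


def is_noise(ticket):
    """Return True for records that are not genuine customer questions."""
    if ticket.get("source") in NOISE_SOURCES:
        return True
    subject = ticket.get("subject", "").lower()
    return any(marker in subject for marker in BOUNCE_MARKERS)


def clean(tickets):
    """Keep only tickets that pass the noise filter."""
    return [t for t in tickets if not is_noise(t)]


raw_tickets = [
    {"id": 1, "source": "email", "subject": "How do I reset my password?"},
    {"id": 2, "source": "survey", "subject": "Your feedback"},
    {"id": 3, "source": "email", "subject": "Undeliverable: Re: your order"},
    {"id": 4, "source": "internal", "subject": "Escalation note"},
]
cleaned = clean(raw_tickets)
print([t["id"] for t in cleaned])
```

Keeping the rules in plain data structures makes it easy to review them with the support team, who usually know which record types are noise.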
![Page 20: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/20.jpg)
GUYS CLEAN A DATASET, GET RICH
![Page 21: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/21.jpg)
FEATURE ENGINEERING
![Page 22: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/22.jpg)
TRAINING - COLD START PROBLEM
all tickets
tickets seen by expert
![Page 23: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/23.jpg)
TRAINING - GET LABELS
“Is there a cat in this picture?”
“Is this support ticket resolvable?”
![Page 24: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/24.jpg)
TRAINING - GET LABELS
∎ label manually
∎ derive labels from user behavior
∎ derive labels from external sources
∎ mix
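One possible reading of "derive labels from user behavior" and "mix" is sketched below: what experts actually did with a ticket becomes the label, and a manual label wins when both exist. The fields `seen_by_expert`, `resolved_by_expert` and `rating`, the 1-5 rating scale, and the cut-off of 4 are all hypothetical assumptions made for illustration:

```python
def derive_label(ticket):
    """Map observed behavior to a training label.

    Returns 1 (resolvable), 0 (not resolvable), or None (no evidence yet).
    Field names and the rating cut-off are illustrative assumptions.
    """
    if not ticket.get("seen_by_expert", False):
        return None  # never shown to an expert, so behavior tells us nothing
    if ticket.get("resolved_by_expert") and ticket.get("rating", 0) >= 4:
        return 1
    return 0


def mix_labels(manual, derived):
    """Prefer a manual label when one exists, else use the derived one."""
    return manual if manual is not None else derived


example = {"seen_by_expert": True, "resolved_by_expert": True, "rating": 5}
print(mix_labels(None, derive_label(example)))
```

Returning `None` for tickets no expert has seen keeps the cold-start gap visible instead of silently counting those tickets as negatives.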
![Page 25: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/25.jpg)
My favorite data science algorithm is division.
Monica Rogati
Former VP of Data, Jawbone & LinkedIn data scientist
![Page 26: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/26.jpg)
∎ Tokenization
∎ Bag of words (BOW)
∎ Tf–idf
∎ Random Forest Classifier
MODEL
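To make the first three steps of this pipeline concrete, here is a rough standard-library sketch of tokenization, bag of words, and tf-idf. The tokenizer regex and the tf-idf variant (term count divided by document length, times `log(N / document frequency)`) are assumptions chosen for brevity; the talk does not say which variant was used. The classifier step is left out; in practice it would typically come from a library such as scikit-learn's `RandomForestClassifier`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag_of_words(text):
    """Count how often each token occurs in one document."""
    return Counter(tokenize(text))


def tf_idf(documents):
    """Return one {token: weight} dict per document.

    tf = count / document length, idf = log(N / document frequency).
    This is one of several common tf-idf variants.
    """
    bags = [bag_of_words(doc) for doc in documents]
    n_docs = len(bags)
    # Iterating a Counter yields its keys, so this counts documents per token.
    doc_freq = Counter(token for bag in bags for token in bag)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in bag.items()
        })
    return vectors


tickets = [
    "Cannot reset my password",
    "Password reset email never arrived",
    "Refund for duplicate charge",
]
vectors = tf_idf(tickets)
```

Tokens that appear in fewer tickets, such as "refund" here, get a larger weight than tokens shared across tickets, such as "password".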
![Page 27: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/27.jpg)
DEVELOPMENT
![Page 28: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/28.jpg)
PLAYING WELL WITH ENGINEERING
∎ gaining trust
∎ development process
![Page 29: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/29.jpg)
POINTS OF INTEGRATION
online or offline?
![Page 30: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/30.jpg)
DEVELOPMENT
integration - broad APIs
![Page 31: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/31.jpg)
“NAPKIN ARCHITECTURE”
![Page 32: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/32.jpg)
IS IT WORKING? evaluating data products
Image source: https://themouseandthewindmill.wordpress.com
![Page 33: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/33.jpg)
∎ accuracy
∎ precision/recall
∎ driven by business
EVALUATION METRICS
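The metrics named on this slide can be computed from the four confusion-matrix counts. The sketch below uses made-up labels (1 = resolvable, 0 = not resolvable) purely to show how accuracy can look acceptable while recall stays low on an imbalanced dataset, which is one reason the choice of metric ends up being a business decision:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Of the tickets predicted resolvable, the share that really were."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Of the truly resolvable tickets, the share the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Illustrative data only: 3 resolvable tickets out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

If a wrongly routed ticket is costly, precision matters more; if missed opportunities are costly, recall does.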
![Page 34: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/34.jpg)
IS IT WORKING? QA’ing data products
Image source: https://themouseandthewindmill.wordpress.com
![Page 35: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/35.jpg)
PLAYING WELL WITH DEVOPS
![Page 36: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/36.jpg)
BRIDGING TECH STACKS
![Page 37: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/37.jpg)
IN PRODUCTION
![Page 38: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/38.jpg)
THE KNOBS: HOW TO CONTROL THE PRODUCT
∎ on/off switch per customer
∎ prediction threshold
∎ exclusions
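The three knobs can be pictured as a small configuration object consulted before any model score is acted on. This is a sketch under assumptions: the `Knobs` class, its field names, the keyword-based form of exclusions, and the ticket fields `customer` and `text` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Knobs:
    """Hypothetical runtime controls for a ticket-routing model."""
    enabled_customers: set = field(default_factory=set)  # on/off switch per customer
    threshold: float = 0.8                               # prediction threshold
    excluded_keywords: tuple = ()                        # exclusions


def should_route_to_expert(ticket, score, knobs):
    """Apply the knobs in order: switch, exclusions, then threshold."""
    if ticket["customer"] not in knobs.enabled_customers:
        return False
    text = ticket["text"].lower()
    if any(word in text for word in knobs.excluded_keywords):
        return False
    return score >= knobs.threshold


knobs = Knobs(
    enabled_customers={"acme"},
    threshold=0.8,
    excluded_keywords=("legal", "chargeback"),
)
ticket = {"customer": "acme", "text": "How do I pair my device?"}
print(should_route_to_expert(ticket, 0.9, knobs))
```

Keeping the knobs outside the model means the product can be turned down or off for one customer without retraining or redeploying anything.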
![Page 39: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/39.jpg)
“... SMART…”
“... AI …”
“...MACHINE LEARNING…”
“...INTELLIGENT…”
NAMING THINGS
![Page 40: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/40.jpg)
UPDATING THE MODEL
∎ input data changes
∎ user behaviour changes
∎ dataset grows
![Page 41: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/41.jpg)
NEGATIVE SAMPLING
send small % of predicted negative as if they were positive
predicted positive
![Page 42: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/42.jpg)
NEGATIVE LABELING
send small % of predicted negative for manual labeling
predicted positive
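Negative sampling and negative labeling both divert a small share of predicted negatives so the model keeps receiving feedback on tickets it would otherwise never see again. A minimal sketch of a router that does both, assuming hypothetical destination names (`expert`, `manual_label`, `default`) and rates supplied by the caller:

```python
import random


def route(score, threshold, explore_rate, label_rate, rng):
    """Return 'expert', 'manual_label', or 'default' for one ticket.

    explore_rate: share of predicted negatives sent as if they were positive.
    label_rate:   share of predicted negatives sent for manual labeling.
    Destination names and parameters are illustrative assumptions.
    """
    if score >= threshold:
        return "expert"        # predicted positive
    draw = rng.random()        # uniform in [0, 1)
    if draw < explore_rate:
        return "expert"        # negative sampling
    if draw < explore_rate + label_rate:
        return "manual_label"  # negative labeling
    return "default"


rng = random.Random(0)
scores = [0.9, 0.2, 0.1, 0.4, 0.3]
decisions = [route(s, 0.5, 0.02, 0.03, rng) for s in scores]
print(decisions)
```

Passing in the random generator keeps the routing reproducible in tests while staying random in production.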
![Page 43: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/43.jpg)
∎ “Would you be able to resolve this ticket successfully?”
∎ “Would an expert user be able to resolve this ticket successfully?”
∎ “Would an expert user be able to resolve this ticket successfully without getting a negative rating?”
LABELING - HOW TO PHRASE THE QUESTION?
![Page 44: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/44.jpg)
∎ customers
∎ sales
∎ account managers
∎ marketing
∎ execs
MESSAGING
![Page 45: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/45.jpg)
CUSTOMER ENGAGEMENT PLAYBOOK
![Page 46: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/46.jpg)
DATA ETHICS
![Page 47: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/47.jpg)
INTERPRETABILITY
Image source: https://en.wikipedia.org/wiki/File:Blue_Poles_(Jackson_Pollock_painting).jpg
![Page 48: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/48.jpg)
THANKS!
Eugene Mandel
@eugmandel
![Page 49: The Other 99% of a Data Science Project](https://reader035.vdocuments.us/reader035/viewer/2022062523/5871770a1a28ab230b8b52df/html5/thumbnails/49.jpg)
∎ Presentation template by SlidesCarnival
∎ Images:
□ http://jedismedicine.blogspot.com/
□ Jawbone
□ Directly
□ Wikipedia
□ https://themouseandthewindmill.wordpress.com
□ http://www.imdb.com/
CREDITS