Data Science and its relevance in Product Management

- Amit Sharma
Director Architecture and Data Science
July 29, 2016
Copyright © 2015 ADP, LLC. Proprietary and Confidential.



About ADP

• Human Capital Management
• Small, Midsized, Large, Multinational markets
• Revenues: $10.9 billion in fiscal 2015
• ADP pays 24 million (1 in 6) workers in the U.S., and 12 million elsewhere
• FORTUNE 500®: Ranked 251 (2015)
• Forbes® Global 2000: Ranked 370 (2015)

Payroll, Tax, Time, HR, Talent, Benefits

Agenda

• Data Science applications in ADP
• Applying Data Science in Product Management
• Applying Product Management in Data Science
• Exercise

Workforce Analytics – Current State

Who uses Workforce Analytics

Evolution of ADP Data Cloud

Data Science in ADP

Classification, Predictive, Optimization, Text Mining

Recommenders, Social, User Behavior

Challenges

Classification

• Used for standardization across clients for Benchmarking
  – Job Titles, Department Names, Business Functions
  – Pay codes, Termination reasons
• Limitations of legacy text mining/clustering techniques
  – Variety of data is the challenge, due to client customizations
  – Short-string documents, highly abbreviated
• Lack of standards
  – Opportunity to define industry standards
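One way to cope with short, heavily abbreviated strings is to compare character n-grams instead of whole words. The sketch below is illustrative only: the standard titles, the sample inputs, and the function names are hypothetical, and it does not describe the classifier actually used at ADP.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a normalized string."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch == " ")
    padded = f" {' '.join(cleaned.split())} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def standardize(raw_title, standard_titles, n=3):
    """Map a raw client title to the most similar standard title."""
    raw = char_ngrams(raw_title, n)
    scored = [(jaccard(raw, char_ngrams(s, n)), s) for s in standard_titles]
    score, best = max(scored)
    return best, score


# Hypothetical standard taxonomy and client-specific variants.
STANDARD = ["Software Engineer", "Account Manager", "Payroll Specialist"]
for raw in ["Sr. Software Engr", "Acct Mgr", "Payroll Spec."]:
    print(raw, "->", standardize(raw, STANDARD))
```

Character n-grams tolerate dropped vowels and truncated words, which word-level clustering handles poorly. In practice a similarity threshold would be needed so that unmatched titles are flagged for review instead of being forced into a class.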

Predictive analytics

• Employee Turnover Probability
  – What is the risk of an Employee leaving in the next quarter?
• Workforce Capacity Planning
  – What is the risk of an Employee taking an unplanned leave?
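Turnover probability can be framed as binary classification. The following is a minimal sketch using logistic regression trained by gradient descent; the two features (scaled tenure and scaled months since last raise) and all data values are invented for illustration, and the slides do not say which model is used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Probability of the positive class; weights[0] is the bias."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return sigmoid(z)


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            grads[0] += error
            for j, value in enumerate(x):
                grads[j + 1] += error * value
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights


# Hypothetical rows: [tenure_years / 10, months_since_raise / 12]; 1 = left.
ROWS = [[0.1, 2.0], [0.2, 1.5], [0.3, 1.8], [0.8, 0.3], [0.9, 0.5], [0.6, 0.2]]
LABELS = [1, 1, 1, 0, 0, 0]

weights = train_logistic(ROWS, LABELS)
print("risk, short tenure and overdue raise:", predict_proba(weights, [0.1, 2.0]))
print("risk, long tenure and recent raise:", predict_proba(weights, [0.9, 0.3]))
```

The output is a probability per employee, which is what makes the "risk of leaving in the next quarter" question answerable as a ranked list.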

Optimization

• Merit Increase Guidelines in compensation management
• Linear Programming problem
• Objective
  – Best utilization of budget amount to reward employees
• Constraints
  – Respect budget
  – Stick to Compensation philosophy
  – Handle Compa-ratios
  – Minimum and Maximum increment limits
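The objective and constraints above can be sketched in code under simplifying assumptions. With one budget constraint and per-employee minimum and maximum limits, the linear program reduces to a continuous knapsack, for which a greedy fill by priority is optimal. The priority formula (rating divided by compa-ratio), the field names, and the figures are hypothetical; a real guideline with more constraints would need a general LP solver.

```python
def allocate_merit(employees, budget, min_pct=0.0, max_pct=0.06):
    """Return {name: increase_pct} maximizing priority-weighted reward dollars."""
    # Every employee first receives the minimum increase.
    alloc = {e["name"]: min_pct for e in employees}
    remaining = budget - sum(e["salary"] * min_pct for e in employees)
    if remaining < 0:
        raise ValueError("budget cannot cover the minimum increases")

    # Assumed priority: higher rating and lower compa-ratio rank first.
    ranked = sorted(
        employees, key=lambda e: e["rating"] / e["compa_ratio"], reverse=True
    )
    for e in ranked:
        if remaining <= 0:
            break
        extra_pct = min(max_pct - min_pct, remaining / e["salary"])
        alloc[e["name"]] += extra_pct
        remaining -= extra_pct * e["salary"]
    return alloc


# Hypothetical employees.
EMPLOYEES = [
    {"name": "A", "salary": 50000, "rating": 5, "compa_ratio": 0.9},
    {"name": "B", "salary": 80000, "rating": 3, "compa_ratio": 1.1},
    {"name": "C", "salary": 60000, "rating": 4, "compa_ratio": 1.0},
]

alloc = allocate_merit(EMPLOYEES, budget=6000, min_pct=0.01, max_pct=0.06)
for name, pct in alloc.items():
    print(name, f"{pct:.2%}")
```

Dividing the rating by the compa-ratio is only one way to encode a compensation philosophy; it favors strong performers who are paid below their range midpoint.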

Text Mining and Analytics

• Natural Language Processing
  – Semantic Search
• Product Management decisions
  – Analyze Customer feedback based on call logs (Voice-of-Customer)
  – Product Roadmap and MVP decisions
    • Which features should be picked up first
  – UX design
    • Frequent pattern mining to identify which fields are used together
  – Sentiment Analysis
• Clickstreams
  – Google Analytics for usage monitoring
  – Success of a new rollout
    • Is the feature being used correctly

Social Network Analysis

• Network chart based on communication proximity
• Leadership Radar
• Performance Management

Data science in Product Management

• Defining the product Roadmap
  – Call Log Analytics
• Designing the UX
  – Navigation (Applicant dropout Analysis)
  – Unused Fields
  – Default values for fields
  – Grouping fields and conditional logic
  – Template definition

Scanning through data across sources

UX Design

• Identify fields and scenarios
• Identify common values and co-occurrence patterns
• Redefine UI to leverage the patterns
  – Create Templates
  – Remove/Group/Default fields

Scenario #  Field1        Field2   Field3        Field1+Field2  Field2+Field3  Field1+Field3  Field1+2+3
Scenario 1  Empty         Default  Empty         10%            -              70%            -
Scenario 2  Filled (90%)  Default  Filled (90%)  80%            10%            10%            5%
Scenario 3  Empty         Default  Filled (90%)  10%            -              10%            -
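Fill rates and co-occurrence rates like those in the table can be derived by counting which fields are filled together in each submitted record. This is a minimal sketch; the record layout and values are hypothetical, and it treats a field as used whenever it is non-empty, which ignores the "Default" distinction shown in the table.

```python
from collections import Counter
from itertools import combinations


def field_usage(records):
    """Return (single-field fill rates, field-combination fill rates)."""
    singles, combos = Counter(), Counter()
    for record in records:
        filled = sorted(
            name for name, value in record.items() if value not in (None, "")
        )
        singles.update(filled)
        # Count every combination of two or more fields filled together.
        for size in range(2, len(filled) + 1):
            combos.update(combinations(filled, size))
    total = len(records)
    return (
        {field: count / total for field, count in singles.items()},
        {combo: count / total for combo, count in combos.items()},
    )


# Hypothetical form submissions.
RECORDS = [
    {"field1": "a", "field2": "x", "field3": ""},
    {"field1": "b", "field2": "x", "field3": "c"},
    {"field1": "", "field2": "x", "field3": "c"},
    {"field1": "d", "field2": "x", "field3": None},
]

single_rates, combo_rates = field_usage(RECORDS)
print(single_rates)
print(combo_rates)
```

Fields with a low fill rate are candidates for removal, fields that almost always carry the same value are candidates for a default, and combinations with a high joint rate are candidates for grouping or a template.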

Conversion Funnels – User Drop Rate

[Figure: conversion funnel across Page 1 to Page 5. Total number of users: 114. Users lost, as labeled in the figure: 27, 9, 18, 2, 2. Total users lost: 58.]

Possible Reasons

• Asking for registration
• Too many fields to fill in
• Possible usability issue
• Some options not enabled?
• Too many steps possibly
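The drop rate at each step follows from the totals on the slide: 114 users enter and 58 are lost in total (27 + 9 + 18 + 2 + 2). The sketch below assumes the losses occur in that order across the five pages; the transcript does not preserve which number belongs to which page, and the function name is hypothetical.

```python
def funnel_report(total_users, lost_per_step):
    """Return per-step entering count, loss, and drop rate, plus completions."""
    report, entering = [], total_users
    for step, lost in enumerate(lost_per_step, start=1):
        report.append(
            {
                "step": step,
                "entering": entering,
                "lost": lost,
                "drop_rate": lost / entering,
            }
        )
        entering -= lost
    return report, entering


# Order of losses across pages is an assumption.
report, completed = funnel_report(114, [27, 9, 18, 2, 2])
for row in report:
    print(
        f"Page {row['step']}: {row['entering']} entered, "
        f"{row['lost']} lost ({row['drop_rate']:.1%})"
    )
print("completed:", completed)
```

Computing the rate against the users entering each step, not against the original 114, keeps late steps with few remaining users comparable to early ones.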

Product Management in Data Science

• Correlation does not imply Causation
• Linear vs. Non-Linear Model
• Explicability and Actionability
• Predictive and Prescriptive

Typical challenges in Machine Learning

• Training Model
  – Train-test split
  – Building ground truth
• Not enough features
  – Data Exchange with Clients
  – Use external data such as BLS, Weather, Geo
• Crowd-sourcing for validation
  – Clients
    • Inbuilt auto-correction for learning models
  – Outsourced
    • Mechanical Turk
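A train-test split holds back part of the labeled data so the model is evaluated on records it has not seen. A minimal sketch follows; the function name, the 25% test fraction, and the fixed seed are illustrative choices.

```python
import random


def split_train_test(rows, labels, test_fraction=0.25, seed=42):
    """Shuffle indices reproducibly and split rows and labels in parallel."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (
        pick(train_idx, rows),
        pick(test_idx, rows),
        pick(train_idx, labels),
        pick(test_idx, labels),
    )


# Hypothetical data: eight records with binary labels.
ROWS = [[i, i * 2] for i in range(8)]
LABELS = [0, 1, 0, 1, 0, 1, 0, 1]

x_train, x_test, y_train, y_test = split_train_test(ROWS, LABELS)
print(len(x_train), "training rows,", len(x_test), "test rows")
```

Fixing the seed makes the split repeatable, which matters when comparing models against the same held-out records.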

Model Validation

• Explicability and Actionability
  – Regression
  – Rule-based
  – Deep Learning
• Model Accuracy
  – Coverage
  – Accuracy
  – Recall/Sensitivity
• Model Comparison
  – ROC Curves
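Accuracy, recall, and the area under the ROC curve can all be computed directly from labels and model scores. The sketch below uses hypothetical scores and a 0.5 decision threshold; the AUC is computed as the share of positive/negative pairs in which the positive record scores higher, which is equivalent to the area under the ROC curve.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def recall(y_true, y_pred):
    tp, _, _, fn = confusion(y_true, y_pred)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 1]
SCORES = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
Y_PRED = [1 if s >= 0.5 else 0 for s in SCORES]

print("accuracy:", accuracy(Y_TRUE, Y_PRED))
print("recall:", recall(Y_TRUE, Y_PRED))
print("ROC AUC:", roc_auc(Y_TRUE, SCORES))
```

Accuracy and recall depend on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is why ROC curves suit model comparison.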

Curse of Dimensionality

• Too many features or dimensions
• Number of data points required grows exponentially with the number of features
• Solution
  – More data points, the merrier
  – Regularization
  – Feature selection using entropy
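Feature selection using entropy is commonly done with information gain: the reduction in label entropy after splitting on a feature. This is a minimal sketch for categorical features; the feature names and values are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy within each value."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(
        len(group) / len(labels) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


# Hypothetical employee data.
LABELS = ["leave", "leave", "stay", "stay"]
FEATURES = {
    "overtime": ["yes", "yes", "no", "no"],
    "shift": ["day", "night", "day", "night"],
}

ranked = sorted(
    FEATURES, key=lambda name: information_gain(FEATURES[name], LABELS), reverse=True
)
for name in ranked:
    print(name, information_gain(FEATURES[name], LABELS))
```

Features with near-zero gain carry little information about the label and are the first candidates to drop when the feature count is too high for the available data.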

Getting started with Data Science

Tools

• Excel
• RStudio
• RWeka
• Tableau
• Qlik Sense

Learning online

• Coursera – Machine Learning
• Udacity
• Caltech Online
• Kaggle

Exercise

https://en.wikipedia.org/wiki/Association_football

Problem #1

How do you measure the goodness of such a model?

http://pgfplots.net/tikz/examples/regression-line/
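For a regression line, two common measures of goodness are the coefficient of determination (R²) and the root mean squared error (RMSE). The sketch below fits a least-squares line and computes both; the data points are invented, and this is one possible answer to the exercise, not necessarily the one intended by the presenter.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def goodness(xs, ys, slope, intercept):
    """Return (R squared, RMSE) of the line on the given points."""
    preds = [slope * x + intercept for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(ys))


# Hypothetical observations.
XS = [1, 2, 3, 4, 5]
YS = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(XS, YS)
r_squared, rmse = goodness(XS, YS, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("R^2:", r_squared, "RMSE:", rmse)
```

R² expresses the share of variance explained by the line, while RMSE reports the typical error in the units of the target, so the two are usually read together.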

Problem #2

https://commons.wikimedia.org/wiki/File:Decision_tree_model.png

How do you measure the goodness of such a model?

https://commons.wikimedia.org/wiki/File:Thank-you-word-cloud.jpg