
Upload: willis-moore

Post on 04-Jan-2016


Demystifying the Data Scientist

Dan McClary, Ph.D.
Big Data Product Management
Oracle


Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |


Data Scientists: By The Numbers

1. What’s a Data Scientist?

2. Do I need a Data Scientist?

3. How do I grow my own Data Scientist?


What’s a Data Scientist?


What’s Data Science

Buzzword or Essential Discipline?

• The buzz around “Data Science” is growing

• But isn’t it a bit like saying “chai tea?”

• What is a functional definition for data science?


A Working Definition

• Data Science seeks to
– Extract meaning from data
– Create “data products”
– Use all available data to tell a valuable story to non-practitioners

• So what makes a Data Scientist?


Anatomy of a Data Scientist

Statistical analysis

Scientific training

PhD in Computer Science? Statistics? Physics? Biology?!

Production-grade programmer in Java? Python? SQL?

Business sensibility

Visualization

IT Operations

Databases

Design Sensibility

Published researcher

BI Tools

Machine Learning

Pattern Recognition

Competitive Intelligence

Hadoop

Big Data

Excellent Communicator/Presenter

JavaScript


Anatomy of a Data Scientist

Does anyone like that even exist?


Anatomy of a Data Scientist: Revised

A person who has some degree of experience in each of:

• Business
– Value Proposition
– Goals
– Communicate Results

• Analytics
– Techniques
– Interpretation
– Model Requirements

• Data
– Integration
– Manipulation
– Quality Assurance


Do I Need a Data Scientist?


Do You Need A Data Scientist?

• Do you need an army of PhDs to solve machine learning problems?
– Probably not

• Could you find more value in the data you do and can collect?
– Undoubtedly

• Do you need people to find that value?
– Almost certainly


Fitting for Data Scientists

• Where?
– Kaggle.com – a community for Data Science
  • 100,000+ members
– KDnuggets – forum for Data Mining and Data Science

• Who do I hire?
– Some call themselves “data scientists,” but most call themselves
  • Mathematicians
  • Scientists
  • Researchers
  • Physicists

Who? Where? How Many?


Fitting for Data Scientists

• Most organizations will benefit from a few seasoned data scientists
– Help transition to a more data-driven business
– Direct efforts to integrate analytics more tightly with LoBs
– Good understanding of how to tackle new problems

• Data scientists can be grown at home
– Leverage the existing workforce
– Provide growth opportunities for employees

How many do I need?


How do I grow Data Scientists?


Step #1: Find Motivated Individuals

• Developers who want to
– Become more statistically oriented
– Better understand business challenges

• Business Analysts who
– Have some programming ability
– Want to grow their technical capabilities

• All candidates should
– Possess tremendous curiosity
– Be able to self-manage

Sources for Good Candidates


Step #2: Find Low-Hanging Fruit

• Find a project that
– Has high ROI
– Has a limited, defined scope
– Isn’t impossible

• Define
– The business value
– The time to invest

Analytically Important, Not Impossible

[Chart: Value to Business (vertical axis) vs. Time to Answer (horizontal axis)]
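One way to make the "low-hanging fruit" test concrete is to score each candidate project on the two axes the slide names and rank by value per unit of time. The sketch below is only an illustration: the `Project` fields, the 1 to 10 value scale, the 12-week cutoff, and the sample projects are all assumptions, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    business_value: float   # hypothetical score, 1 (low) to 10 (high)
    weeks_to_answer: float  # hypothetical estimate of the time to invest
    scope_defined: bool     # True if the scope is limited and well defined


def low_hanging_fruit(projects, max_weeks=12.0):
    """Keep scoped projects that fit the time budget, best value per week first."""
    feasible = [
        p for p in projects
        if p.scope_defined and 0 < p.weeks_to_answer <= max_weeks
    ]
    return sorted(
        feasible,
        key=lambda p: p.business_value / p.weeks_to_answer,
        reverse=True,
    )


# Hypothetical candidates, invented purely for illustration.
candidates = [
    Project("Churn early-warning report", 8, 6, True),
    Project("Predict every customer's lifetime value", 10, 52, False),
    Project("Clean up duplicate accounts", 4, 2, True),
]

ranked = low_hanging_fruit(candidates)
for p in ranked:
    print(f"{p.name}: {p.business_value / p.weeks_to_answer:.2f} value per week")
```

The open-ended project drops out because it has no defined scope and exceeds the time budget; the remaining two are ordered by value per week. A real team would likely weigh more than one ratio, so treat the ranking as a conversation starter rather than a decision rule.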


Step #3: Combine

• Add your data science team
• And the well-defined project
– Add a seasoned data scientist for best results

• Watch the team grow new skills
• Evaluate the outcome
– For the team members
– For the business

And Iterate


Step #4: Publish and Promote

Share Data Science Results as a Service

[Diagram labels: Data Scientist, Useful Derived Dataset, Anyone, Spark]
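As a rough illustration of what a "useful derived dataset" might look like, the sketch below rolls raw records up into a small summary and serializes it so that anyone can consume it. The slide names Spark; this example deliberately uses plain Python instead so that it stays self-contained, and the order records, field names, and `derive_summary` helper are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical raw records, invented for illustration only.
raw_orders = [
    {"region": "East", "amount": 120.0},
    {"region": "West", "amount": 50.0},
    {"region": "East", "amount": 80.0},
]


def derive_summary(orders):
    """Aggregate raw orders into one row per region (the derived dataset)."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for order in orders:
        entry = totals[order["region"]]
        entry["orders"] += 1
        entry["revenue"] += order["amount"]
    return [{"region": region, **stats} for region, stats in sorted(totals.items())]


summary = derive_summary(raw_orders)

# Serialize to JSON so the result can be shared with non-practitioners' tools.
published = json.dumps(summary, indent=2)
print(published)
```

The point is the shape of the hand-off rather than the tooling: the data scientist does the aggregation once, and everyone else reads a small, documented result.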


Summary

• What is a Data Scientist?
– Someone who can help drive value through data

• Do you need one?
– Possibly

• Can you grow a data scientist?
– Absolutely
