Demystifying the Data Scientist
Dan McClary, Ph.D.
Big Data Product Management
Oracle
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.
Data Scientists: By The Numbers
1. What’s a Data Scientist?
2. Do I need a Data Scientist?
3. How do I grow my own Data Scientist?
What’s a Data Scientist?
What’s Data Science
Buzzword or Essential Discipline?
• The buzz around “Data Science” is growing
• But isn’t it a bit like saying “chai tea?”
• What is a functional definition for data science?
A Working Definition
• Data Science seeks to
– Extract meaning from data
– Create “data products”
– Use all available data to tell a valuable story to non-practitioners
• So what makes a Data Scientist?
Anatomy of a Data Scientist
Statistical analysis
Scientific training
PhD in Computer Science? Statistics? Physics? Biology?!
Production-grade programmer in Java? Python? SQL?
Business sensibility
Visualization
IT Operations
Databases
Design Sensibility
Published researcher
BI Tools
Machine Learning
Pattern Recognition
Competitive Intelligence
Hadoop
Big Data
Excellent Communicator/Presenter
JavaScript
Anatomy of a Data Scientist
Does anyone like that even exist?
Anatomy of a Data Scientist: Revised
A person who has some degree of experience in each of
• Business
– Value Proposition
– Goals
– Communicate Results
• Analytics
– Techniques
– Interpretation
– Model Requirements
• Data
– Integration
– Manipulation
– Quality Assurance
Do I Need a Data Scientist?
Do You Need A Data Scientist?
• Do you need an army of PhDs to solve machine learning problems?
– Probably not
• Could you find more value in the data you do and can collect?
– Undoubtedly
• Do you need people to find that value?
– Almost certainly
Fitting for Data Scientists
Who? Where? How Many?
• Where?
– Kaggle.com – a community for Data Science
  • 100,000+ members
– KDNuggets – forum for Data Mining and Data Science
• Who do I hire?
– Some call themselves “data scientists,” but most call themselves
  • Mathematicians
  • Scientists
  • Researchers
  • Physicists
Fitting for Data Scientists
How many do I need?
• Most organizations will benefit from a few seasoned data scientists
– Help transition to a more data-driven business
– Direct efforts to integrate analytics more tightly with LoBs
– Good understanding of how to tackle new problems
• Data scientists can be grown at home
– Leverage the existing workforce
– Provide growth opportunities for employees
How do I grow Data Scientists?
Step #1: Find Motivated Individuals
Sources for Good Candidates
• Developers who want to
– Become more statistically oriented
– Better understand business challenges
• Business Analysts who
– Have some programming ability
– Want to grow their technical capabilities
• All candidates should
– Possess tremendous curiosity
– Be able to self-manage
Step #2: Find Low-Hanging Fruit
Analytically Important, Not Impossible
• Find a project that
– Has high ROI
– Has a limited, defined scope
– Isn’t impossible
• Define
– The business value
– The time to invest
[Chart: Value to Business (vertical axis) versus Time to Answer (horizontal axis)]
Step #3: Combine
And Iterate
• Add your data science team
• And the well-defined project
– Add a seasoned data scientist for best results
• Watch the team grow new skills
• Evaluate the outcome
– For the team members
– For the business
Step #4: Publish and Promote
Share Data Science Results as a Service
[Diagram labels: Data Scientist, Spark, Useful Derived Dataset, Anyone]
Summary
• What is a Data Scientist?
– Someone who can help drive value through data
• Do you need one?
– Possibly
• Can you grow a data scientist?
– Absolutely