For business: what is a data scientist?

Post on 14-Jun-2015

DESCRIPTION

Slides from the course on big data by Clement Levallois of EMLYON Business School, for business students. See the online video that accompanies these slides. The definition and profile of a data scientist are presented: hacker, math person and domain specialist.

TRANSCRIPT

MK99 – Big Data 1

Big data & cross-platform analytics

MOOC lectures

Pr. Clement Levallois

What is a data scientist? [or, a guide for business to spot good ones and recruit them!]

+ Math and stats knowledge

• Math and stats are excellent foundations

• But a data scientist has a different mindset

– Focuses on predictive accuracy rather than causality

– Even when the approach is not “elegant” as a formal model

– Ready to use any bit of information available in the data (text, networks, …)

– See the slide deck on “Machine Learning” for details
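To make "accuracy of prediction" concrete, here is a minimal sketch of how it is commonly measured: fit a rule on one part of the data, then score it on examples the rule has never seen. The dataset (support tickets versus churn), the function names and the single cut-off rule are all hypothetical and chosen for illustration only; they are not taken from the course.

```python
from typing import List, Tuple

Example = Tuple[float, int]  # (feature value, label 0 or 1)


def accuracy(data: List[Example], cut: float) -> float:
    """Share of examples where the rule 'feature >= cut' matches the label."""
    hits = sum(1 for x, y in data if int(x >= cut) == y)
    return hits / len(data)


def fit_threshold(train: List[Example]) -> float:
    """Pick the cut-off with the best accuracy on the training data."""
    candidates = sorted({x for x, _ in train})
    best_cut, best_acc = candidates[0], -1.0
    for cut in candidates:
        acc = accuracy(train, cut)
        if acc > best_acc:  # strict comparison keeps the lowest cut-off on ties
            best_cut, best_acc = cut, acc
    return best_cut


# Hypothetical data: (support tickets last month, churned?)
train = [(0, 0), (1, 0), (2, 0), (3, 1), (5, 1), (1, 0), (4, 1), (2, 1)]
test = [(0, 0), (2, 0), (3, 1), (6, 1), (1, 1)]

cut = fit_threshold(train)
print("cut-off learned on training data:", cut)
print("accuracy on held-out data:", accuracy(test, cut))
```

The rule says nothing about why customers leave. It is judged only on how often it is right on unseen cases, which is the mindset the slide describes.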

+ Hacking skills

• Ability to think “out of the box”

– Thinks as an econometrician, a computer scientist, a computational linguist and a network analyst, all at once!

– Concerned with scale and speed

– Not dependent on packaged software

• Aware of, and contributing to, developments in open source

• Following current developments in different academic fields

+ Substantive expertise

• Substantive expertise = grasp of the business logic

– Many big gains in optimization come from a good knowledge of the specifics of the domain

– These domains can be quite complex!

– Data scientists must be able to understand these business specifics and translate them into their data models

A data scientist should be able to…

– Discover interesting angles in the dataset

• You see worthless metadata? I see gold!

– Choose from a wide range of techniques across the social and natural sciences

• Statistics, machine learning, network analysis, natural language processing, etc.

• From economics, physics, psychology, linguistics, computational science, genomics, neuroscience, etc.

– Implement these techniques, possibly on large datasets

• Can you implement them in your programming language of choice?

• Can you deal with large datasets (what if the data doesn’t fit in memory?)

• Can you be quick (and not need a couple of nights to run a script)?

• Can you be cheap (buying more hardware is not always a solution you can afford)?
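As an illustration of the "what if it doesn't fit in memory?" question, the sketch below computes a mean and variance in a single pass, holding only one row in memory at a time. It uses Welford's online algorithm, which is one possible technique among many. The column names and sample rows are hypothetical, and in practice `sample` would be an open file handle rather than an in-memory string.

```python
import csv
import io
from typing import Iterable, Iterator, Tuple


def read_column(lines: Iterable[str], column: str) -> Iterator[float]:
    """Yield one numeric column lazily, row by row."""
    for row in csv.DictReader(lines):
        yield float(row[column])


def streaming_mean_variance(values: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) in one pass (Welford's algorithm)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


# Hypothetical data standing in for a file too large to load at once.
sample = io.StringIO("order_id,amount\n1,10.0\n2,20.0\n3,30.0\n4,40.0\n")
count, mean, variance = streaming_mean_variance(read_column(sample, "amount"))
print(count, mean, variance)
```

Because the rows are consumed through a generator, memory use stays constant however long the file is, so no extra hardware is needed for this kind of summary.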

How to hire and keep a data scientist in your business?

1. Find them where they hang out: Stack Overflow, GitHub, specialized communities on Twitter. Good profiles are PhD students near graduation and/or leading developers of open source projects.

2. Allow plenty of time for their personal development

– Contributing to open source projects, attending conferences, working on personal projects during working hours

3. Treat them not as mere executors of tasks, but as business co-developers

This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com)

Contact Clement Levallois (levallois [at] em-lyon.com) for more information.
