For business: what is a data scientist?

MK99 – Big Data 1 Big data & cross-platform analytics MOOC lectures Pr. Clement Levallois

Upload: clement-levallois

Post on 14-Jun-2015

DESCRIPTION

Slides from the big data course by Clement Levallois at EMLYON Business School, aimed at business students. See the online video that accompanies these slides. The definition and profile of a data scientist are presented: hacker, math person and domain specialist.

TRANSCRIPT

Page 1: For business: what is a data scientist?

MK99 – Big Data 1

Big data & cross-platform analytics MOOC lectures

Pr. Clement Levallois

Page 2: For business: what is a data scientist?

What is a data scientist? [or, a guide for business to spot good ones and recruit them!]

Page 4: For business: what is a data scientist?

+ Math and stats knowledge

• Maths and stats are excellent foundations

• But a data scientist has a different mindset

– Focuses on accuracy of prediction, not causality

– Even if this is not “elegant” in terms of formal models

– Ready to use any bit of information available in the data (text, networks, …)

– See the slide deck on “Machine Learning” for details
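To make "accuracy of prediction, not causality" concrete, here is a minimal sketch in Python. It is not taken from the course: the data is made up, and the names `fit_threshold` and `accuracy` are hypothetical helpers introduced only for illustration. The rule is judged solely by how well it predicts cases it has not seen, with no claim about why the pattern holds.

```python
from typing import List, Tuple

Example = Tuple[float, int]  # (feature value, label 0 or 1)


def accuracy(threshold: float, data: List[Example]) -> float:
    """Share of examples where the rule 'predict 1 if feature >= threshold' is right."""
    hits = sum(1 for x, y in data if int(x >= threshold) == y)
    return hits / len(data)


def fit_threshold(train: List[Example]) -> float:
    """Pick the candidate threshold with the best accuracy on the training data."""
    candidates = sorted({x for x, _ in train})
    return max(candidates, key=lambda t: accuracy(t, train))


# Made-up toy data, purely illustrative: (number of visits, bought or not).
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test = [(2.5, 0), (4.0, 0), (6.5, 1), (9.0, 1)]

threshold = fit_threshold(train)
held_out_accuracy = accuracy(threshold, test)  # scored on data not used for fitting
print(threshold, held_out_accuracy)
```

The design choice to notice is the separate `test` list: the model is kept or discarded based on its score there, whether or not the rule is "elegant".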

Page 5: For business: what is a data scientist?

+ Hacking skills

• Ability to think “out of the box”

– Thinks as an econometrician, a computer scientist, a computational linguist and a network analyst!

– Concerned with scale and speed

– Not dependent on packaged software

• Aware of, and contributing to, developments in open source

• Following current developments in different academic fields

Page 6: For business: what is a data scientist?

+ Substantive expertise

• Substantive expertise = grasp of the business logic

– Many leaps in optimization come from a good knowledge of the specifics of the domain

– These domains can be quite complex!

– Data scientists must be able to understand and translate these business specificities into their data models

Page 7: For business: what is a data scientist?

A data scientist should be able to…

– Discover interesting angles in the dataset

• You see worthless metadata? I see gold!

– Choose from a wide range of techniques across social and natural sciences

• Statistics, machine learning, network analysis, natural language processing, etc.

• From economics, physics, psychology, linguistics, computational science, genomics, neuroscience, etc.

– Implement these techniques, possibly on large datasets

• Can you implement them in your programming language of choice?

• Can you deal with large datasets (what if it doesn’t fit in memory?)

• Can you be quick (and not ask for a couple of nights to run a script)?

• Can you be cheap (buying more hardware is not always a solution you can afford)?
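The question "what if it doesn’t fit in memory?" can be illustrated with a minimal sketch in Python, using only the standard library. It is not from the course: the column name `amount`, the sample data and the helpers `read_amounts` and `streaming_mean` are assumptions made for the example. The idea is to read one row at a time and keep only a running count and total, so memory use stays constant however large the file is.

```python
import csv
import io
from typing import Iterable, Iterator


def read_amounts(lines: Iterable[str], column: str = "amount") -> Iterator[float]:
    """Yield one numeric value at a time from CSV lines, never loading the whole file."""
    reader = csv.DictReader(lines)
    for row in reader:
        yield float(row[column])


def streaming_mean(values: Iterable[float]) -> float:
    """Compute the mean in a single pass, keeping only a running count and total."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    if count == 0:
        raise ValueError("no values to average")
    return total / count


# Stand-in for a large file on disk (made-up data); with a real file,
# pass the open file object to read_amounts instead.
sample = io.StringIO("customer,amount\na,10.0\nb,20.0\nc,60.0\n")
mean_amount = streaming_mean(read_amounts(sample))
print(mean_amount)
```

This is also one way to be "cheap": a single pass over the data on an ordinary machine, instead of buying more memory.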

Page 8: For business: what is a data scientist?

How to hire and keep a data scientist in your business?

1. Find them where they hang out: Stack Overflow, GitHub, specialized communities on Twitter. Good profiles are PhD students near graduation, and/or leading developers of open source projects.

2. Allow plenty of time for their personal development

– Contributing to open source projects, attending conferences, working on personal projects during working hours

3. Treat them not as mere executors of orders, but as business co-developers

Page 9: For business: what is a data scientist?

This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com)

Contact Clement Levallois (levallois [at] em-lyon.com) for more information.