DevSci: Improving Software through Data Science

Post on 04-Jun-2020

Data Science: Becoming a Data-driven Organization

@MatthewRenze

#MicrosoftUAE

Are you a decision maker?

Are you flooded with data?

Are you using data science?

Why is it important?

What is data science?

How do I get started?

Why is data science important?

Job Postings for Data Scientists

Source: Dice Salary Survey 2017

Top-paying Tech Skills (table columns: Skill, 2016, Change)

What’s the problem?

The Current State of Business

Don’t understand customers

Lack of product-market fit

Unused / low-value features

Missed market opportunities

Human biases

Guesswork

Cost of labor

Human errors

Three Main Approaches

Create better products

Make smarter decisions

Reduce labor costs

What is data science?

Computer science

Math and statistics

Domain knowledge

Data science

Data engineering

Scientific method

Data → Knowledge → Decision → Action

What Is a Data Scientist?

Performs data science

More than a scientist

More than an analyst

More than a developer

What skills are necessary?

Data Science Skills

Programming

Working with data

Descriptive statistics

Data visualization

Statistical modeling

Handling Big Data

Machine learning

Deploying to production

What tools are used?

Data Science Tools

Figure: bar chart of the share of respondents (0% to 70%) by tool (language, platform, analytics). Tools on the axis, as extracted: SQL, Excel, Python, MySQL, R, Python tools, ggplot, SQL Server, Tableau, JavaScript, Matplotlib, Java, PostgreSQL, Oracle, D3, Homegrown, Hive, Spark, Cloudera, Visual Basic, MongoDB, Hadoop, SAS, C++, Scala, PowerPivot, SQLite, C, Pig, Redshift, Weka, HBase, (EMR), Perl, SPSS, Teradata.

Source: O’Reilly 2015 Data Science Salary Survey

How is data science performed?

The Data Science Process

Find a question

Collect the data

Prepare the data

Create a model

Evaluate the model

Deploy the model

The Data Science Process

Iterative process

Non-sequential

Early termination

Find a question

Explore the data

Prepare the data

Create a model

Evaluate the model

Deploy the model
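The process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data, the function names (`prepare`, `create_model`, `evaluate`, `run_once`), the straight-line model, and the error threshold are all hypothetical and not taken from the talk. One pass either stops early, asks for another iteration, or approves deployment.

```python
# Illustrative sketch only: every name and number here is hypothetical.

def prepare(rows):
    """Prepare the data: drop rows with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def create_model(train):
    """Create a model: fit y = slope * x + intercept by least squares."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def evaluate(model, test):
    """Evaluate the model: mean absolute error on held-out rows."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in test) / len(test)


def run_once(rows, max_error):
    """One pass through the process; returns (decision, error)."""
    data = prepare(rows)
    if len(data) < 4:
        # Early termination: too little usable data to answer the question.
        return "stop", None
    train, test = data[:-2], data[-2:]
    error = evaluate(create_model(train), test)
    # "iterate" means going back to an earlier step (question, data, model).
    return ("deploy" if error <= max_error else "iterate"), error


# Made-up (input, outcome) pairs, one of them with a missing input.
ROWS = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2),
        (4.0, 7.8), (5.0, 10.1), (6.0, 12.0)]

decision, error = run_once(ROWS, max_error=0.5)
print(decision, round(error, 2))
```

A stricter threshold, for example `run_once(ROWS, max_error=0.1)`, returns "iterate" instead, which is the point at which a team would revisit the question, the data, or the model.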

How do I get started?

What are the ingredients of a data-driven enterprise?

Strategy

People

Data

Technology

Culture

What is the process of becoming a data-driven enterprise?

AI

Predict

Analyze

Organize

Collect

Needs

1. Collect

Transactions

Logging

Digitization

Telemetry

Experiments

External data
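As a rough idea of what collecting transactions and telemetry can look like in code, the sketch below writes each event as one JSON line. The `record_event` helper, the event fields, and the in-memory sink are assumptions made for illustration; a real system would more likely write to a log file, a queue, or a database.

```python
import io
import json
from datetime import datetime, timezone


def record_event(sink, event_type, **fields):
    """Append one structured event to the sink as a JSON line."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": event_type,
        **fields,
    }
    sink.write(json.dumps(event) + "\n")
    return event


# In-memory stand-in for a log file or message queue (hypothetical).
sink = io.StringIO()
record_event(sink, "transaction", order_id=1001, amount=49.90)
record_event(sink, "telemetry", feature="export_pdf", duration_ms=320)

# Reading the events back is what the later stages would start from.
events = [json.loads(line) for line in sink.getvalue().splitlines()]
print(len(events), events[0]["type"], events[1]["feature"])
```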

2. Organize

Transform

Clean

Store

Data ETL

Data Warehouse

Data Lake
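The transform, clean, and store steps can be sketched as a tiny extract-transform-load script. Everything here is an assumption for illustration: the CSV content, the `orders` table, and the cleaning rules are made up, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Made-up raw export with messy spacing, a missing amount, and a duplicate.
RAW_CSV = """order_id,customer,amount
1001, alice ,49.90
1002,BOB,
1003,carol,15.00
1003,carol,15.00
"""


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform and clean: normalize names, drop incomplete and duplicate rows."""
    cleaned, seen = [], set()
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # missing amount
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # duplicate order
        seen.add(order_id)
        cleaned.append((order_id, row["customer"].strip().lower(), float(amount)))
    return cleaned


def load(conn, rows):
    """Store: write the cleaned rows into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
stored = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(stored)
```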

3. Analyze

Reports

Dashboards

KPI monitors

Data mining

Descriptive analytics

Diagnostic analytics
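A very small example of descriptive analytics and a KPI monitor, using only Python's standard `statistics` module. The daily order counts, the `describe` and `kpi_alerts` helpers, and the two-standard-deviation rule are assumptions chosen for illustration, not a recommendation from the talk.

```python
from statistics import mean, median, stdev

# Made-up KPI: orders per day for one week.
daily_orders = [120, 132, 128, 119, 141, 135, 60]


def describe(values):
    """Descriptive analytics: summarize what happened."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
        "min": min(values),
        "max": max(values),
    }


def kpi_alerts(values, threshold=2.0):
    """KPI monitor: flag days more than `threshold` sample standard
    deviations away from the mean, as (day_index, value) pairs."""
    mu, sigma = mean(values), stdev(values)
    return [(day, v) for day, v in enumerate(values)
            if abs(v - mu) > threshold * sigma]


summary = describe(daily_orders)
alerts = kpi_alerts(daily_orders)
print(summary["median"], alerts)
```

Finding out why the flagged day dropped is where diagnostic analytics would take over.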

4. Predict

Predictive analytics

Prescriptive analytics

Machine learning
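To make predictive analytics concrete, here is a minimal k-nearest-neighbors classifier in plain Python. It is a sketch under assumptions: the customer features, the labels, and the choice of k-nearest neighbors are hypothetical and were picked only because the technique fits in a few lines.

```python
from collections import Counter
from math import dist

# Made-up history: (monthly_logins, support_tickets) -> what the customer did.
TRAINING = [
    ((25, 0), "stays"), ((30, 1), "stays"), ((22, 1), "stays"),
    ((3, 4), "churns"), ((5, 6), "churns"), ((2, 5), "churns"),
]


def predict(features, training=TRAINING, k=3):
    """Predict a label by majority vote among the k closest past customers."""
    nearest = sorted(training, key=lambda item: dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict((4, 5)))   # a customer who rarely logs in
print(predict((28, 0)))  # a customer who logs in often
```

Prescriptive analytics would go one step further and suggest an action, for example which at-risk customers to contact first.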

5. Automate

Artificial intelligence

Reinforcement learning

Deep learning

Advice for Success

Get buy-in from leadership

Focus on low-hanging fruit

Don’t silo data science teams

Democratize your data

Embrace smart failure

Focus on feedback

Embed data collection

Avoid the Observer Effect

Where to Go Next?

DataCamp: https://www.datacamp.com

Pluralsight: https://www.pluralsight.com

Coursera: https://www.coursera.org

Pluralsight Courses

Data Science: The Big Picture

Data Science with R

Exploratory Data Analysis with R

Data Visualization with R (3-part)

Deep Learning: The Big Picture

https://www.pluralsight.com/authors/matthew-renze

www.matthewrenze.com

Feedback

Very important to me!

What did you like?

What could I improve?

Conclusion

Why is it important?

What is data science?

How do I get started?

Is your organization prepared?

Are you prepared?

Is our world prepared?

Thank You!

Matthew Renze

Data Science Consultant

Renze Consulting

Twitter: @matthewrenze

Email: info@matthewrenze.com

Website: www.matthewrenze.com
