new professional careers in data

Post on 20-Feb-2017

Who am I?

• David Rostcheck

• I’m a consulting data scientist

• Follow my articles on LinkedIn

We will talk about 4 things:

• Big Data

• Data Science

• Data Engineering

• Business Intelligence

BIG DATA

What is big data?

Big data is data that is so big that it requires specialized techniques to handle, like clusters, cloud computing, or graph algorithms

Data may change rapidly, so big data may also be fast data

big data requires specialized tools to handle it, such as MapReduce
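To illustrate the map/reduce idea named above, here is a minimal single-machine sketch in Python that counts words. It is not taken from the slides: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample sentences are hypothetical, and a real big data tool would run the same three steps across many machines rather than in one process.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's list of values into a single count."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


# Hypothetical sample input, for illustration only.
counts = word_count(["big data is big", "fast data is data"])
print(counts)
```

Because each document is mapped independently and each word is reduced independently, both steps can in principle be spread over a cluster, which is what makes the pattern attractive at large scale.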

big data tools are in demand, but keep your perspective

Big Data tools can be complex

It is often easier to solve problems at small scale, then scale up, if possible

remember: not all companies use big data, but all companies use data

DATA SCIENCE

What is data science?

Data science is industrial research on a company’s own data

What is its goal?

to produce advanced algorithms that deliver a competitive advantage

data scientists often work with unstructured data, which can be large

“The qualifications for the job include the strength to tunnel through mountains of information and the vision to discern patterns where others see none”

- Bloomberg Businessweek

Is data science really science?

let’s compare…

Aspect | academic science | data science
Teams | PhDs, graduate students | PhDs, technologists
Setting | University | Company
Publication | Formal (academic publications, conferences) | Less formal (blogs, white papers, open source)
Funding | Public grants | Corporate
Goal | Advance human knowledge | Create competitive advantage

Data science is industrial science

It shares some attributes with academic science, but differs in others

What kind of work do data scientists do?

data scientists create artificially intelligent systems

these are often called “narrow AI”

examples

• Recommender systems

• Self-driving cars

• AI agents

• Smart energy management

• Medical diagnosis

• Machine vision

DATA ENGINEERING

What is data engineering?

data engineering is a specialized kind of software engineering with additional skills in handling and processing data

data science vs. data engineering

Aspect | data science | data engineering
Approach | Scientific (exploration) | Engineering (development)
Problems | Unbounded | Bounded
Path to solution | Iterative, exploratory, nonlinear | Mostly linear
Education | More is better (PhDs common) | BS and/or self-trained
Presentation skills | Important | Not as important
Research experience | Important | Not as important
Programming skills | Not as important | Important
Data skills | Important | Important

What kind of special training does a data engineer need?

Data storage and processing:

– structured (SQL)

– unstructured (NoSQL)

– Big Data (Hadoop, Apache Spark/Storm/Flink, cloud)

Data visualization

Machine Learning algorithms and platforms (e.g. Dato)

Predictive APIs (e.g. Watson)

Does a data engineer need more math than a regular software engineer?

It really helps.

Linear algebra & calculus are important for understanding machine learning

BUSINESS INTELLIGENCE

Wait – aren’t data science and business intelligence really the same thing?

Maybe. Let’s compare…

Aspect | business intelligence (BI) | data science
Data analysis | Yes | Yes
Statistics | Yes | Yes
Visualization | Yes | Yes
Data sources | Usually SQL, often Data Warehouse | Less structured (logs, cloud data, SQL, NoSQL, text)
Tools | Statistics, Visualization | Statistics, Machine Learning, Graph Analysis, NLP
Focus | Present and past | Future
Approach | Analytic | Scientific
Goal | Better strategic decisions | Advanced functionality

The two fields are closely related.

In some ways data science is an evolution of business intelligence.

which industries have the most data-focused jobs?

right now:

• Technology

• Education

• Finance

• Consulting

• Health Care

(Technology employs over 50% of data workers)

but...

“Technology” companies like Uber, Amazon, Airbnb compete in other industries (transportation, retail, hotels)

“Software is eating the world”

– Andreessen Horowitz

which industries will AI change?

Ultimately, all of them.

Incorporating AI is a large business opportunity

data jobs are in demand

• “The hot job of the decade… Data scientists today are akin to Wall Street ‘quants’ of the 1980s and 1990s”

- Harvard Business Review

• “18.7% projected growth 2010-2020”

- VentureBeat

• “McKinsey projects […] ‘50 percent to 60 percent gap between supply and requisite demand’”

- Bloomberg Businessweek

On the other hand…

Some people believe data jobs themselves will be automated:

“New Teradata Platform Reduces Demand For Data Scientists”

- Forbes

“Automating the Data Scientist”

- MIT Technology Review

What do we think?

• Yes, advanced tools will automate some data exploration

• But: research and communication are fundamental skills and are always in demand when the world is changing

• Data will continue to explode (Internet of Things)

• We will see more change and faster change

education for data jobs

options include:

academic programs, boot camps, and online classes (Coursera, Udacity)

for data engineering:

– documentation and webinars (self-education)

– focus on data manipulation tools and machine learning

for data science:

– The more academic science and research expertise, the better

– Focus on projects that solve unknown problems

– Work with more experienced data scientists

Questions?

Contact: drostcheck@leopardllc.com, Twitter: @davidrostcheck

Articles: http://linkedin.com/in/davidrostcheck
