Data Science 101: A Love Story

Upload: maren

Post on 24-Feb-2016

TRANSCRIPT

Page 1: Data Science 101

Data Science 101

A Love Story

Page 2: Data Science 101

Agenda

• Introduction to Data Science
• Who’s who in Data Science?
• That Data Science Life.
• [Case Study] How Spotify manages their data.
• [VM] The Data Science life at VaynerMedia.
• Conclusions.

Page 3: Data Science 101

“If you can measure it, you can hack it.”

E -> A -> E

Page 4: Data Science 101

We’re generating (and tracking) exponentially more data online than ever before.

Page 5: Data Science 101

Big Data is big.

Page 6: Data Science 101

5,000,000,000 GB every 2 days

Page 7: Data Science 101

We’re always playing catch-up.

Page 8: Data Science 101

“Innovative Solutions” >

“Industry Standards”

Page 9: Data Science 101

Data Scientists are “Innovative Problem Solvers”

Page 10: Data Science 101

I get it. “Big Data” is real, and Data Scientists are awesome.

Page 11: Data Science 101

But what is a Data Scientist? Who are they, and how do they work with “Big Data”?

Page 12: Data Science 101
Page 13: Data Science 101

VM

Page 14: Data Science 101

DJ Patil is a huge influencer in this space.

Page 15: Data Science 101

Why is DJ Patil so popular?

Page 16: Data Science 101

LinkedIn and People You May Know

Page 17: Data Science 101

Angel has 2 mutual friends with Vikash. Tim has 20 mutual friends with Vikash.

If John is friends with Vikash, he might know Tim and his mutual friends.
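
The mutual-friend idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's actual algorithm: the function name `suggest_connections`, the toy network, and the extra person "Ana" are hypothetical, and the counts are much smaller than the 2 and 20 on the slide.

```python
from collections import Counter

def suggest_connections(friends, user):
    """Rank people `user` is not yet connected to by number of mutual friends."""
    direct = friends.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in friends.get(friend, set()):
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    return mutual_counts.most_common()

# Hypothetical toy network (each person maps to their set of friends).
friends = {
    "John": {"Vikash", "Ana"},
    "Vikash": {"John", "Tim", "Angel"},
    "Ana": {"John", "Tim"},
    "Tim": {"Vikash", "Ana"},
    "Angel": {"Vikash"},
}

suggestions = suggest_connections(friends, "John")
print(suggestions)
```

In this toy network Tim shares two mutual friends with John and Angel shares one, so Tim is ranked first as the stronger suggestion.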

Page 18: Data Science 101

This increased platform usage, making the experience on LinkedIn more valuable.

Page 19: Data Science 101

Active Users = selling point for LinkedIn when pitching to Brands.

Page 20: Data Science 101

A leg up for users looking for employment in the informal job market.

Page 21: Data Science 101

Big Data. Real Business objective.

Simple Analysis. Valuable Data-driven Product.

Page 22: Data Science 101

“Patil Effect”

Page 23: Data Science 101

VM analysts do the same thing; we just don’t use the same tools.

Page 24: Data Science 101
Page 25: Data Science 101

10^100

Page 26: Data Science 101

Google started downloading the entire internet in the late 1990s and early 2000s.

Page 27: Data Science 101

“It’s not you, it’s me.” - Google

Page 28: Data Science 101

Google created a better way to process Big Data. They created MapReduce.
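
To make the MapReduce idea concrete, here is a minimal single-process sketch of the classic word-count example. It only mimics the map, shuffle, and reduce steps on one machine; a real MapReduce system spreads those steps across many machines. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Google or Hadoop API.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data is big", "data science is a method"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can run in parallel, which is what makes the pattern suit very large datasets.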

Page 29: Data Science 101

Yahoo! wanted to download the internet too.

Page 30: Data Science 101

They liked MapReduce so much that they created Hadoop.

Page 31: Data Science 101
Page 32: Data Science 101

Hadoop is an open-source framework for distributed storage and processing. It pairs a distributed file system (HDFS) with an implementation of MapReduce.

Page 33: Data Science 101
Page 34: Data Science 101
Page 35: Data Science 101
Page 36: Data Science 101
Page 37: Data Science 101

Developed by the folks over at Facebook.

Page 38: Data Science 101

Hive is a data “warehouse” tool built to query Hadoop systems.
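
As a rough feel for what a SQL-like query over event data looks like, the sketch below runs a `GROUP BY` aggregation with Python's built-in `sqlite3` module. SQLite is only a small local stand-in here, not Hive, and nothing in this block touches Hadoop; the `plays` table and its columns are hypothetical. HiveQL supports queries of a similar `SELECT ... GROUP BY` shape.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plays (user_id TEXT, track TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO plays VALUES (?, ?)",
    [("u1", "song_a"), ("u2", "song_a"), ("u1", "song_b"), ("u3", "song_a")],
)

# Count plays per track, most played first.
rows = conn.execute(
    "SELECT track, COUNT(*) AS play_count "
    "FROM plays GROUP BY track ORDER BY play_count DESC"
).fetchall()
conn.close()

print(rows)
```

The point is that an analyst describes the result they want in a query and the system works out how to compute it, rather than the analyst writing map and reduce jobs by hand.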

Page 39: Data Science 101

Querying this data also allows us to work on our data retrieval skills.

Page 40: Data Science 101

Less time cleaning data. Less time “fishing”. Fewer spreadsheets.

BOOM.

Page 41: Data Science 101
Page 42: Data Science 101

Amazon Web Services makes computing data in the cloud easy and cheap.

Page 43: Data Science 101

No need for huge data centers on site.

Page 44: Data Science 101

Pay for what you use.

Page 45: Data Science 101

Makes it easy to move data around in the cloud.

Page 46: Data Science 101

How does a company actually use all of these cool tools?

Page 47: Data Science 101
Page 48: Data Science 101

Spotify Client

AWS EMR (Hadoop)

PostgreSQL

Hive (data warehouse infrastructure; SQL-like syntax)

Ad hoc MapReduce jobs

Page 49: Data Science 101

How does all of this fit into VaynerMedia?

Page 50: Data Science 101

VM

Page 51: Data Science 101

Where do analysts fall under the VM umbrella?

Page 52: Data Science 101

Optimizing Content. Optimizing Ad Spends.

Understanding Overall Trends.

Page 53: Data Science 101

We could also develop data-driven products.

Page 54: Data Science 101

Business Objective(s):

- How are we doing against our competitors/ourselves?

- How is our content performing this week?

Page 55: Data Science 101

Math Skills: How do we calculate engagements appropriately? What are my KPIs?
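
One possible way to turn the engagement question into a KPI is an engagement rate per post. The formula below, (likes + comments + shares) / followers, is an assumption chosen for illustration, not a definition given in the slides; the field names and numbers are hypothetical too.

```python
def engagement_rate(likes, comments, shares, followers):
    """Assumed KPI: total interactions divided by audience size."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (likes + comments + shares) / followers

# Hypothetical post data for one week.
posts = [
    {"id": "post_1", "likes": 120, "comments": 15, "shares": 5, "followers": 10_000},
    {"id": "post_2", "likes": 300, "comments": 40, "shares": 60, "followers": 10_000},
]

rates = {
    post["id"]: engagement_rate(
        post["likes"], post["comments"], post["shares"], post["followers"]
    )
    for post in posts
}
best_post = max(rates, key=rates.get)

for post_id, rate in rates.items():
    print(f"{post_id}: {rate:.2%}")
print("best:", best_post)
```

Dividing by followers makes posts from accounts of different sizes comparable; dividing by reach or impressions instead would be an equally reasonable choice, depending on the business objective.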

Page 56: Data Science 101

Hacking Skills: How do I get a hold of all of the public data needed for the analysis?

Page 57: Data Science 101

We can also apply a similar methodology to ads.

Page 58: Data Science 101
Page 59: Data Science 101

Trending topics in real time.

Page 60: Data Science 101

Big Picture

Page 61: Data Science 101

Top phrases are available through the API in real time.

Page 62: Data Science 101

Demo information is also available.

Page 63: Data Science 101

Other data points attached to stories.

Page 64: Data Science 101

Using the Bit.ly API, we can pull all of this data. Using R, we can analyze the data.
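
Once the data has been pulled, the analysis step can be as simple as ranking phrases by clicks. The sketch below uses Python rather than R, and it makes no API calls: the `records` list and its `phrase` and `clicks` fields are a hypothetical shape invented for illustration, not the actual Bit.ly response format.

```python
from collections import Counter

def top_phrases(records, n=3):
    """Sum clicks per phrase and return the n most-clicked phrases."""
    totals = Counter()
    for record in records:
        totals[record["phrase"]] += record["clicks"]
    return totals.most_common(n)

# Hypothetical records, standing in for data already pulled from an API.
records = [
    {"phrase": "data science", "clicks": 40},
    {"phrase": "big data", "clicks": 25},
    {"phrase": "data science", "clicks": 35},
    {"phrase": "hadoop", "clicks": 10},
]

ranked = top_phrases(records, n=2)
print(ranked)
```

Re-running a ranking like this on fresh data is the kind of loop that would let targeting be adjusted as trends shift.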

Page 65: Data Science 101

We can adjust our targeting buckets in real time.

Page 66: Data Science 101

It doesn’t matter what we do, as long as we develop our core skills.

Page 67: Data Science 101

All of the cool tools that large companies use aren’t necessary for us to be called “Data Scientists”.

Page 68: Data Science 101

A carpenter isn’t judged by the tools he uses, but by the things he builds.

Page 69: Data Science 101

Data Science is a method of problem solving.

Page 70: Data Science 101

We are Data Scientists.

Page 71: Data Science 101
Page 72: Data Science 101

Questions/Comments?