Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


Upload: looker

Post on 07-Jan-2017


TRANSCRIPT

Page 1: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Page 2: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Housekeeping

Q&A

•  We will do Q&A at the end.

•  You should see a box on the right side of your screen.

•  There is a button marked “Q&A” on the bottom menu.

Recording

•  We are recording this.

•  We will send you the recording & slides tomorrow.

Page 3: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Meet Our Presenters

Zev Lebowitz, Senior Sales Engineer

Daniel de Sybel, CTO

Karol Ussher, Head of Technology Partnerships, EMEA

Page 4: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

AGENDA

1. Meet Google BigQuery

2. Meet Looker

3. Case Study: Data-driven Decisions at Infectious Media

Page 5: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Meet Google BigQuery

Page 6: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Google confidential Do not distribute

Page 7: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

What is Google BigQuery?

Durable and Highly Available

Convenience of SQL

Petabyte-scale Storage and Queries

Fully Managed, Serverless Enterprise Data Warehouse

Page 8: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

BigQuery for Enterprise Features

•  Flat-rate Pricing

•  Standard SQL

•  ODBC & JDBC Connectors

•  DML

•  Identity and Access Management

•  Stackdriver

Page 9: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


Google Research in Data Technologies

Technologies shown along a 2002 to 2013 timeline: GFS, MapReduce, BigTable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, PubSub, F1

Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html

Page 10: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Now: Typical Big Data Tasks

•  Resource provisioning

•  Performance tuning

•  Monitoring

•  Reliability

•  Deployment & configuration

•  Handling growing scale

•  Utilization improvements

•  Analysis and Insights

Next: Big Data with Google

•  No-Ops, Auto Everything

•  Analysis and Insights

Understanding

Page 11: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


Think about the Data Warehouse

Laura

Dremel BigQuery

Page 12: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Confidential & Proprietary Google Cloud Platform 12

Analyze Store Capture

BigQuery (SQL)

Process

Cloud Dataflow (stream and batch)

Cloud Storage (objects)

Cloud Datastore (NoSQL)

BigQuery Storage (structured)

Cloud Dataproc (Hadoop & Ecosystem)

Cloud Bigtable (NoSQL HBase)

Cassandra, HBase, MongoDB, RabbitMQ, Kafka

Cloud 2.0

Cloud 3.0

Visualize

Cloud DataLab (iPython/Jupyter)

Looker

Pub/Sub Logs

BQ Streaming

App Engine

Cloud SQL (SQL)

Cloud Machine Learning

Focus on the Analysis not the Maintenance

Page 13: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


"We are very excited about the productivity benefits offered by Cloud Dataflow and Cloud Pub/Sub. It took half a day to rewrite something that had previously taken over six months to build using Spark"

Paul Clarke, Director of Technology, Ocado

http://googlecloudplatform.blogspot.co.uk/2015/08/Announcing-General-Availability-of-Google-Cloud-Dataflow-and-Cloud-Pub-Sub.html

Page 14: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


“Spotify chose Google in part because its services for analyzing large amounts of data, tools like BigQuery, are more advanced than data services from other cloud providers.” Nicholas Harteau, VP of Infrastructure, Spotify

https://labs.spotify.com/2016/02/25/spotifys-event-delivery-the-road-to-the-cloud-part-i/

Page 15: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


“Right at the start of the partnership we were able to reduce time to insight from 96 hours to 30 minutes by using BigQuery.”

– Gary Sanders, Head of Digital Analytics, Lloyds Banking Group

Page 16: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Meet Looker

Page 17: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

A data analytics platform that makes it easy for everyone to find, explore and understand the data that drives your business.

Page 18: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

DATA BOTTLENECK

Which features increase engagement?

What triggers a customer churn?

Which web page works best?

How is pipeline for Q4?

Will we meet our revenue targets?

Which customer is at risk?

Which campaigns convert best?

Which rep is converting best?

Can we speed up our operations?

Are we investing in the right area?

Who are our happiest customers?

What industries are we doing well in?

Where should we spend more budget?

Page 19: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

DATA CHAOS


Page 20: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

IS THERE A WAY TO FIND BALANCE?

Standards

Scalability

Governance

Self-Service

Agility

Flexibility

Page 21: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

THE TECHNICAL PILLARS THAT MAKE IT POSSIBLE

100% In Database

•  Leverage all your data

•  Avoid summarizing or moving it

Modern Web Architecture

•  Access from anywhere

•  Share and collaborate

•  Extend to anyone

LookML Intelligent Modeling Layer

•  Describe the data

•  Create reusable and shareable business logic
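To make the modeling-layer idea concrete, here is a minimal sketch in Python. It is not LookML syntax and not how Looker is implemented. It only illustrates the general pattern of defining dimensions and measures once and generating SQL that runs in the database. The table name, column names, and the `build_query` helper are all hypothetical.

```python
# Toy modeling layer: business logic is defined once, SQL is generated on demand.
# NOTE: this is an illustration only, not LookML; all names are hypothetical.
MODEL = {
    "table": "impressions",
    "dimensions": {"city": "city", "day": "DATE(event_time)"},
    "measures": {"impressions": "COUNT(*)", "spend": "SUM(cost)"},
}


def build_query(model, dimensions, measures):
    """Build a SQL string from the named dimensions and measures."""
    dims = [f"{model['dimensions'][d]} AS {d}" for d in dimensions]
    meas = [f"{model['measures'][m]} AS {m}" for m in measures]
    sql = f"SELECT {', '.join(dims + meas)} FROM {model['table']}"
    if dims:
        # Group by the positions of the selected dimensions (1, 2, ...).
        sql += " GROUP BY " + ", ".join(str(i + 1) for i in range(len(dims)))
    return sql


print(build_query(MODEL, ["city"], ["impressions", "spend"]))
```

Because every user picks from the same named measures, a definition such as "spend" is written once and reused, while the heavy lifting stays in the database.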

Page 22: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

LOOKER: A DATA PLATFORM

•  Explore Everything: Find, explore and understand all the data

•  Create Standards: Define your data and business metrics

•  Any SQL Database: Analyze all of your data where it is stored

•  Build a Data Culture: Anyone can ask and answer questions

How is pipeline for Q4?

Will we meet our revenue targets?

Which campaigns convert best?

Which rep is converting best?

Which customer is at risk?

Can we speed up our operations?

Page 23: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Looker - BigQuery Integration Highlights

In-Database Architecture

Looker leverages the power of BigQuery directly because all transformation is done in-database.

Support for Native BigQuery Functions

Integration with features unique to BigQuery, in both the product and the modeling layer, makes for a seamless experience.

Highest Level of Looker Features

We’ve invested in providing Looker features for BigQuery to make the best experience possible.

Page 24: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Data-Driven Decisions at Infectious Media

Page 25: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

OUR BUSINESS

●  Founded in 2008

●  Leading International Programmatic agency

●  Covering all biddable media

●  Activity live in 30+ markets

●  Highly customisable O&O technology stack – DMP & DSP

●  Transparent model

Page 26: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Impression Desk

OUR DATA-DRIVEN ADVERTISING PLATFORM THAT PROVIDES FULL ACCESS TO THE FRAGMENTED LANDSCAPE OF INVENTORY AND DATA

BIDDER

BIDDERS

Page 27: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

RTB: The Data Problem

Data Processing

•  4k requests / sec @ 1 KB = 4 MB/s (0.4 TB / day)

•  500k requests / sec @ 1 KB = 0.5 GB/s (40 TB / day)

Analytics

•  Impression-level data is a goldmine

•  Anything that doesn’t fit in Excel generally needs techie help
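The data-processing figures on this slide are back-of-envelope arithmetic, which can be checked with a short sketch. It assumes that each request is 1 kilobyte (1,000 bytes) and that the request rate is constant over 24 hours. Both are assumptions, and the function name is illustrative.

```python
# Back-of-envelope RTB data volumes.
# ASSUMPTIONS: 1 request = 1,000 bytes, and the rate is constant all day.
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume(requests_per_sec, bytes_per_request=1_000):
    """Return (bytes per second, terabytes per day) for a request stream."""
    bytes_per_sec = requests_per_sec * bytes_per_request
    tb_per_day = bytes_per_sec * SECONDS_PER_DAY / 1e12
    return bytes_per_sec, tb_per_day


for rate in (4_000, 500_000):
    per_sec, per_day = daily_volume(rate)
    print(f"{rate:>7,} req/s -> {per_sec / 1e6:,.0f} MB/s, {per_day:.2f} TB/day")
```

Under these assumptions the two rates come to about 0.35 and 43 per day in the slide's units, which the slide rounds to 0.4 and 40.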

Page 28: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Where we started...

Infobright Community Edition

•  Fantastic open source columnar database

•  Could be easily installed in Amazon Web Services on a single server

•  Used standard SQL for queries

Problems

•  Concurrency wasn’t great

•  Single threaded

•  Could only manage around 1-2TB of data

•  Data load could be slow

Page 29: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Up next...

Infobright Enterprise Edition

•  Simple upgrade path

•  Multi-threaded

•  Parallel data loads

Problems

•  Concurrency still wasn’t great

•  Not cloud native

•  Licence costs grew linearly with data volume

Page 30: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

From there...

Hadoop

•  Everyone else is doing it

•  No licence costs

•  Perfect for cloud deployment

Problems

•  Analysts had to learn new ways of writing queries

•  Concurrency was non-existent

•  Server costs were difficult to control

•  Took an army of infrastructure engineers to maintain it

Page 31: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Enter BigQuery

Why?

•  Probably processes the most data in the world

•  No infrastructure engineers required

•  Cloud native

•  Oh, and…

Page 32: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Some Stats

Before BQ

•  20 mins to query 1 month of data

•  Stored < 5 TB of data

•  1 infrastructure engineer to manage server

•  2 data engineers to manage data

•  3 analysts to query data

After BQ

•  2 mins to query 3 months of data

•  Store > 50 TB of data

•  0 infrastructure engineers (no-one cares about the backend)

•  1 data engineer to manage data

•  6 analysts to query data

They cost the same!
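The before and after figures can be reduced to two ratios, as in the rough sketch below. It assumes query time scales linearly with the number of months scanned, which the slide does not claim, and it treats the "< 5" and "> 50" storage bounds as exact values, so the storage ratio is only a lower bound.

```python
# Figures taken from the "Some Stats" slide.
# ASSUMPTION: query time scales linearly with the months of data scanned.
before = {"query_minutes": 20, "months_queried": 1, "stored_tb": 5}
after = {"query_minutes": 2, "months_queried": 3, "stored_tb": 50}


def minutes_per_month(stats):
    """Query time normalised to one month of data."""
    return stats["query_minutes"] / stats["months_queried"]


speedup = minutes_per_month(before) / minutes_per_month(after)
storage_growth = after["stored_tb"] / before["stored_tb"]

print(f"Query time per month of data: about {speedup:.0f}x faster")
print(f"Stored data: at least {storage_growth:.0f}x larger")
```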

Page 33: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Something missing

•  Optimisation managers still had to go to Analytics to ask questions

•  Slowed down campaign optimisations and insights

•  Led to impatience and frustration

Page 34: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Enter Looker

•  Elegant abstraction of our perfect DW via LookML

•  Safe data exploration for Optimisers without needing Analysts

•  Simple automated queries to email or import into Excel for clients

•  Easy extension and evolution of data model with db

•  Wait... user defined dashboards?

Page 35: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Audience Comparison

•  Optimisers were looking to extend a travel campaign to Paris

•  Compared Paris audience with existing London audience

•  Used insight to create new strategy

•  Sped up optimal campaign creation by a week

Page 36: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Fraud and Brand Safety

•  Dashboard can pinpoint problems on sites/exchanges

•  Identifying fraud/brand safety early reduces wasted spend

•  Problem sites/exchanges added to blocklists

•  Traders need to tackle arms race with fraudsters

Page 37: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Ongoing work

•  Costs have quickly increased

   -  Built cost monitoring dash in Looker

   -  Investigating flat rate pricing

•  Release of standard SQL

   -  Has made queries faster

   -  Requires a migration in LookML

•  Release of BigQuery regions

   -  Allows better data governance

   -  But creates problems for querying across regions
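The trade-off between usage-based and flat-rate pricing mentioned above can be sketched as a simple comparison. This is an illustration only: the prices and volumes are placeholders rather than real BigQuery prices, the function names are hypothetical, and a 30-day month is assumed.

```python
# Rough comparison of usage-based vs flat-rate pricing.
# ASSUMPTIONS: all prices below are placeholders, not real BigQuery quotes,
# and usage-based cost is taken as proportional to the data scanned.
def monthly_on_demand_cost(tb_scanned_per_day, price_per_tb, days=30):
    """Monthly cost when paying per terabyte scanned."""
    return tb_scanned_per_day * days * price_per_tb


def cheaper_option(tb_scanned_per_day, price_per_tb, flat_rate_per_month):
    """Return the cheaper plan name and the usage-based cost for comparison."""
    on_demand = monthly_on_demand_cost(tb_scanned_per_day, price_per_tb)
    plan = "flat-rate" if flat_rate_per_month < on_demand else "on-demand"
    return plan, on_demand


# Placeholder inputs: 10 TB or 100 TB scanned per day, 5.0 per TB, 10,000 flat.
for volume in (10, 100):
    print(volume, cheaper_option(volume, 5.0, 10_000))
```

A cost-monitoring dashboard would feed real scanned-volume figures into a comparison like this to show when growing usage crosses the break-even point.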

Page 38: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Final thoughts

•  Scale is the constant enemy

•  Scale makes even simple questions require smart solutions

•  BigQuery handles the scale most use Hadoop for

•  Layering on Looker allows your team to get more answers, not more problems

Page 39: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

Q&A

Page 40: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions

THANK YOU FOR JOINING

Recording and slides will be posted.

We will email you the links tomorrow.

Our Next Webinar: Parse.ly & Looker

Beyond the Dashboard: What You Can Learn From Raw Audience Data, on Thursday

See how Google BigQuery and Looker work with your data.

Visit cloud.google.com/free-trial and looker.com/free-trial or email [email protected].

Page 41: Power to the People: A Stack to Empower Every User to Make Data-Driven Decisions


Thank you!