NRT Event Processing with Snowplow


Upload: dani-sola-lagares

Post on 15-Apr-2017


TRANSCRIPT

Page 1: NRT Event Processing with Snowplow

NRT Event Processing

Page 2: NRT Event Processing with Snowplow

Outline

• Introduction

• Our Snowplow Setup

• Example NRT Use Cases

• Radio Campaign

• Telephony System

Page 3: NRT Event Processing with Snowplow

Simply Business

• Largest UK business insurance provider

• More than 400,000 policyholders

• Using BML, tech and data to disrupt the business insurance market

Page 4: NRT Event Processing with Snowplow

Data ’n’ Analytics

• 5 Data Engineers

• 3 Business Intelligence Developers

• 3 Data Analysts

• 1 Data Scientist

• 1 Director of Data Science

• And hiring! :-)

Page 5: NRT Event Processing with Snowplow

Our Snowplow Setup

Page 6: NRT Event Processing with Snowplow

Snowplow Setup

Trackers → Collector → Enrichment → Modeling → Storage

• Trackers, collectors and storage are 100% upstream Snowplow

• Enrichment:

• Spark apps that use scala-common-enrich as a library

• We add our own enrichments after the default ones

• We perform NRT identity stitching and sessionization

• Modeling: mix of Spark and SQL jobs

• Storage: Spark apps that use scala-hadoop-shred as a library
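
The identity stitching and sessionization step mentioned above could be sketched roughly as follows. This is a minimal, plain-Python illustration rather than the actual Spark job: the `Event` fields, the "last user seen per cookie" stitching rule and the 30-minute inactivity timeout are all assumptions, not details from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed inactivity timeout; the talk does not state the real value.
SESSION_GAP_SECONDS = 30 * 60


@dataclass
class Event:
    cookie_id: str           # anonymous browser identifier (hypothetical field)
    user_id: Optional[str]   # set only once the visitor is known
    timestamp: float         # seconds since epoch


def stitch_identities(events: List[Event]) -> Dict[str, str]:
    """Map each cookie to the last user_id seen with it."""
    mapping: Dict[str, str] = {}
    for e in events:
        if e.user_id is not None:
            mapping[e.cookie_id] = e.user_id
    return mapping


def sessionize(events: List[Event]) -> List[List[Event]]:
    """Group events per stitched visitor, splitting on inactivity gaps."""
    if not events:
        return []
    identity = stitch_identities(events)
    by_visitor: Dict[str, List[Event]] = {}
    for e in sorted(events, key=lambda ev: ev.timestamp):
        # Fall back to the cookie itself when no user is known for it.
        visitor = identity.get(e.cookie_id, e.cookie_id)
        by_visitor.setdefault(visitor, []).append(e)

    sessions: List[List[Event]] = []
    for visitor_events in by_visitor.values():
        current = [visitor_events[0]]
        for prev, nxt in zip(visitor_events, visitor_events[1:]):
            if nxt.timestamp - prev.timestamp > SESSION_GAP_SECONDS:
                sessions.append(current)
                current = []
            current.append(nxt)
        sessions.append(current)
    return sessions
```

In a streaming job the same logic would have to run incrementally over micro-batches with state kept between them; the batch form here only shows the grouping rule.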

Page 7: NRT Event Processing with Snowplow

Why Spark?

• We wanted a near real-time pipeline, but the Kinesis Client Library (KCL) was too rigid:

• We had to provision, set up and monitor the machines

• Configuration is difficult for complex DAGs

• In contrast, Spark:

• Once set up, the cluster is a PaaS

• Allows streaming, batch, ML and graph workloads

• Allows analysts and data scientists to use Python

Page 8: NRT Event Processing with Snowplow

Radio Campaign

Page 9: NRT Event Processing with Snowplow

The Radio Campaign

• We’re running a radio campaign in Birmingham, Manchester and London

• People who get a quote starting from one of our radio landing pages get a £25 discount

Page 10: NRT Event Processing with Snowplow

The Banner

• The questionnaire to get quotes can be quite long to complete

• We wanted to reassure our customers that they would get the discount

• We wanted to display a banner at the top of every page of the questionnaire

Page 11: NRT Event Processing with Snowplow

The Banner

Page 12: NRT Event Processing with Snowplow

Our Infrastructure

[Diagram components: Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API, Quoting App (talks to the Visitor API over HTTP)]

On average, it takes 2.5s for an event to be available in the Visitor API
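
The banner decision can be illustrated with a small sketch of what a Visitor API lookup might do. Everything here is hypothetical: the `VisitorStore` class is an in-memory stand-in for the store of active sessions' events, the landing-page paths are invented, and the `event` / `page_urlpath` field names are only loosely modeled on Snowplow's enriched event fields.

```python
from typing import Dict, List

# Hypothetical landing-page paths; the real ones are not given in the talk.
RADIO_LANDING_PATHS = {"/radio/birmingham", "/radio/manchester", "/radio/london"}


class VisitorStore:
    """In-memory stand-in for the store of active sessions' events."""

    def __init__(self) -> None:
        self._events: Dict[str, List[dict]] = {}

    def append(self, session_id: str, event: dict) -> None:
        # Events are appended in arrival order, so the list is a small event log.
        self._events.setdefault(session_id, []).append(event)

    def events_for(self, session_id: str) -> List[dict]:
        return list(self._events.get(session_id, []))


def show_radio_banner(store: VisitorStore, session_id: str) -> bool:
    """True if the session's first page view was a radio landing page."""
    page_views = [
        e for e in store.events_for(session_id) if e.get("event") == "page_view"
    ]
    if not page_views:
        return False
    return page_views[0].get("page_urlpath") in RADIO_LANDING_PATHS
```

The point of the design is that the quoting app only asks a yes/no question; the rule about landing pages lives next to the event data and can change without touching the app.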

Page 13: NRT Event Processing with Snowplow

Benefits of NRT Snowplow

• Our quoting app does not need to know about marketing, user landing pages, etc.

• Our Mongo table with active sessions’ events becomes a view of our event log

• Can be reused for many other use cases: analytics on read!

Page 14: NRT Event Processing with Snowplow

Telephony System

Page 15: NRT Event Processing with Snowplow

Telephony System

• We have a call center in Northampton with around 200 consultants

• We used an off-the-shelf telephony system

• It worked well for a long time, but:

• Was not very well integrated with our systems

• Quite rigid, we couldn’t adapt it to all our needs

• We only had daily reports, and they contained only aggregated data

Page 16: NRT Event Processing with Snowplow

Telephony System

• We decided to replace it with a home-grown, Twilio-based solution

• Components:

• Contact Strategy Manager

• Voice Channel Manager

• Communication is event-based

• We transform those events into Snowplow’s unstructured events

• A Spark Streaming app inserts the events into Redshift every 2 min
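
As a rough illustration of the translation step, the sketch below wraps a telephony event in the JSON envelope Snowplow uses for unstructured (self-describing) events. The envelope schema URI is Snowplow's standard one; the `com.example.telephony` vendor, the event name and the payload fields are hypothetical, since the talk does not show the real schemas.

```python
import json
from typing import Any, Dict

# Envelope schema Snowplow uses for unstructured (self-describing) events.
UNSTRUCT_EVENT_SCHEMA = (
    "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
)

# Hypothetical vendor; the real schema names are not given in the talk.
VENDOR = "com.example.telephony"


def to_unstructured_event(
    name: str, payload: Dict[str, Any], version: str = "1-0-0"
) -> str:
    """Wrap a telephony event payload in the unstructured-event envelope."""
    inner = {
        "schema": f"iglu:{VENDOR}/{name}/jsonschema/{version}",
        "data": payload,
    }
    return json.dumps({"schema": UNSTRUCT_EVENT_SCHEMA, "data": inner}, sort_keys=True)
```

A call such as `to_unstructured_event("call_answered", {"call_id": "c1"})` returns a JSON string that a tracker could send to the collector as the unstructured event payload.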

Page 17: NRT Event Processing with Snowplow

The Infrastructure

[Diagram components: Contact Strategy Manager, Voice Channel Manager, Event Translator, Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, Kinesis, Spark Stream Shredder, Redshift, Looker]

Page 18: NRT Event Processing with Snowplow

Events

Example call when viewed as a sequence of events:

Page 19: NRT Event Processing with Snowplow

Benefits of NRT Snowplow

• Event Sourcing is great for reporting and analytics: ensures that data quality remains high

• Team managers now have an NRT view of what teams are doing

• You can aggregate and drill down on the data as appropriate

• Leveraging our data platform: Snowplow pipeline, Redshift & Looker

• Leveraging our existing skills: everyone knows how to use Looker
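
To make the "aggregate and drill down" point concrete, here is a minimal sketch of folding a call event log into per-call talk time and then rolling it up per team. In the setup described above this would be done with SQL in Redshift and Looker; the plain-Python version and the event types (`call_answered`, `call_ended`) and fields (`call_id`, `team`, `timestamp`) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def call_durations(events: Iterable[dict]) -> Dict[str, float]:
    """Fold the event log into talk time per call (answered -> ended)."""
    answered: Dict[str, float] = {}
    durations: Dict[str, float] = {}
    for e in sorted(events, key=lambda ev: ev["timestamp"]):
        if e["type"] == "call_answered":
            answered[e["call_id"]] = e["timestamp"]
        elif e["type"] == "call_ended" and e["call_id"] in answered:
            durations[e["call_id"]] = e["timestamp"] - answered[e["call_id"]]
    return durations


def talk_time_by_team(events: List[dict]) -> Dict[str, float]:
    """Aggregate per-call talk time up to team level."""
    durations = call_durations(events)
    # Any event carrying a team tag assigns that call to the team.
    team_of_call = {e["call_id"]: e["team"] for e in events if "team" in e}
    totals: Dict[str, float] = defaultdict(float)
    for call_id, seconds in durations.items():
        totals[team_of_call.get(call_id, "unknown")] += seconds
    return dict(totals)
```

Because the raw events are kept, the per-call figures from `call_durations` remain available for drill-down after the team-level roll-up.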

Page 20: NRT Event Processing with Snowplow

Sum Up

Page 21: NRT Event Processing with Snowplow

The Infrastructure

[Diagram components: Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API, Kinesis, Spark Stream Shredder, Redshift, Looker, Applications]

Page 22: NRT Event Processing with Snowplow

NRT Benefits

• We can dynamically alter the website while the user is still using it

• We can provide insights on live processes

• Multiple uses to improve conversion:

• Instant inclusion/exclusion from remarketing lists

• Abandoned cart emails/calls

• Social proofing (3 more people are also watching…)

• …
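
One of the conversion use cases above, abandoned cart follow-ups, could be sketched as a simple check over the last activity seen per session. The 20-minute threshold, the function name and the shape of the inputs are assumptions; the talk does not describe how this would actually be implemented.

```python
from typing import Dict, List, Set

# Assumed inactivity threshold before a session counts as abandoned.
ABANDON_AFTER_SECONDS = 20 * 60


def abandoned_sessions(
    last_seen: Dict[str, float], completed: Set[str], now: float
) -> List[str]:
    """Return sessions that went quiet without completing a purchase."""
    return sorted(
        session_id
        for session_id, timestamp in last_seen.items()
        if session_id not in completed and now - timestamp > ABANDON_AFTER_SECONDS
    )
```

With events arriving within seconds, a check like this can run continuously instead of waiting for a nightly batch.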

Page 23: NRT Event Processing with Snowplow

Questions?
