Kafka and Stream Processing, Taking Analytics Real-Time Mike Spicer - Lead Architect, IBM Streams

Upload: confluent

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Kafka and Stream Processing, Taking Analytics Real-Time

Mike Spicer - Lead Architect, IBM Streams

Page 2: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Traditional Processing vs. Stream Processing

Traditional Processing: historical fact finding
•  Find and analyze information stored on disk
•  Batch paradigm, pull model
•  Query-driven: submits queries to static data

Stream Processing: current fact finding
•  Analyze data in motion, before it is stored
•  Low-latency paradigm, push model
•  Data-driven: bring data to the analytics

[Diagram labels: Data Repository; Data Query; request; response; Real-Time Analytics; Data Results]
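
The pull/push contrast on this slide can be illustrated with a small sketch. It is not taken from the talk: the event shape, the threshold rule, and the names `pull_query` and `PushPipeline` are all hypothetical, chosen only to show where the analysis runs in each model.

```python
from typing import Callable, List, Tuple

# Hypothetical event shape: (source_id, reading)
Event = Tuple[str, float]


def pull_query(repository: List[Event], threshold: float) -> List[Event]:
    """Pull model: data is stored first, then a query scans it on request."""
    return [event for event in repository if event[1] > threshold]


class PushPipeline:
    """Push model: each event is analyzed on arrival, before any storage."""

    def __init__(self, threshold: float, on_result: Callable[[Event], None]):
        self.threshold = threshold
        self.on_result = on_result

    def on_event(self, event: Event) -> None:
        # The analytic runs per event and emits a result immediately.
        if event[1] > self.threshold:
            self.on_result(event)


events = [("a", 1.0), ("b", 7.5), ("a", 9.2)]

# Traditional: store everything, then submit a query to the static data.
repository = list(events)
batch_results = pull_query(repository, threshold=5.0)

# Streaming: the data is brought to the analytic as it arrives.
stream_results: List[Event] = []
pipeline = PushPipeline(threshold=5.0, on_result=stream_results.append)
for event in events:
    pipeline.on_event(event)
```

Both paths find the same events here. The difference is when the answer becomes available: after the query is submitted in the pull model, or at the moment each event arrives in the push model.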

Page 3: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Stream Processing

Page 4: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

What Makes Kafka Ideal for Stream Processing

FAST –
•  A single Streams Kafka Source/Sink can consume/produce 100,000s of msgs/sec

SCALABLE –
•  Partitioned Kafka topics work with parallel Streams Kafka Sources
•  Parallel sources in the same consumer group can consume 1,000,000s of msgs/sec
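
To make the scaling point concrete, here is a minimal simulation of how a partitioned topic spreads load across parallel sources in one consumer group. It uses no Kafka client. The CRC32 hash is a stand-in for a real partitioner, and the round-robin assignment, the `source-N` names, and the `device-N` keys are assumptions made for illustration.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition (stand-in hash, not Kafka's own)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Round-robin assignment: each partition is owned by exactly one consumer."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return assignment


NUM_PARTITIONS = 6
consumers = ["source-0", "source-1", "source-2"]  # hypothetical parallel sources
assignment = assign_partitions(NUM_PARTITIONS, consumers)
owner = {p: c for c, parts in assignment.items() for p in parts}

# Route 1,000 keyed messages: same key -> same partition -> same source.
received = defaultdict(list)
for i in range(1000):
    key = f"device-{i % 50}"
    received[owner[partition_for(key, NUM_PARTITIONS)]].append((key, i))
```

Because each partition has a single owner within the group, adding sources (up to the partition count) divides the work without two sources reading the same message.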

DURABLE –
•  Kafka is distributed and replicated
•  Messages are logged and replayable for a configured period
•  Streams Kafka connectors support Guaranteed Processing
   •  Source supports exactly-once (and at-least-once) semantics
   •  Sink supports at-least-once semantics
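
The difference between at-least-once and exactly-once processing follows from replay. The sketch below simulates it with a plain list standing in for a replayable log. It is not how the Streams connectors implement Guaranteed Processing: the function `consume`, its parameters, and the assumption that the running total is checkpointed together with the last applied offset are all illustrative.

```python
from typing import List, Optional


def consume(log: List[int], crash_at: Optional[int], dedupe: bool) -> int:
    """Sum every message in `log`, surviving one simulated crash.

    The crash happens after the message at offset `crash_at` is processed
    but before its offset is committed, so that message is replayed.
    """
    committed = 0      # next offset to read after a restart
    last_applied = -1  # highest offset already folded into `total`
    total = 0          # assumed to be checkpointed together with last_applied
    crashed = False

    while committed < len(log):
        # (Re)start: resume from the last committed offset.
        for offset in range(committed, len(log)):
            if not dedupe or offset > last_applied:
                total += log[offset]
                last_applied = offset
            if offset == crash_at and not crashed:
                crashed = True
                break  # crash after processing, before the commit
            committed = offset + 1
    return total


messages = [10, 20, 30, 40, 50]
at_least_once = consume(messages, crash_at=2, dedupe=False)
effectively_once = consume(messages, crash_at=2, dedupe=True)
```

Without deduplication the replayed message is counted twice, which is what at-least-once allows. Tracking the last applied offset alongside the state makes the replay harmless, giving an exactly-once result for this in-process state.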

A UNIVERSAL HUB –
•  Hub connecting all applications and data sources
•  Isolation between Producer and Consumer

Page 5: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Streaming Analytics Can Handle Many Use Cases

IBM Streams is being applied in many use cases –

•  Market and Customer Intelligence

•  Revenue, Upsell / Cross Sell

•  Personalized Customer Experience

•  Network Analytics

•  IoT, Connected Car and Telematics

•  National / Cyber Security, PII & PCI Data Leakage

•  Health and Improved Patient Outcomes

•  Operational Optimization


Page 6: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Example Real-Time Analytics Use Cases

North American Telco Real-Time Advertising –
•  Click-through rate and revenue up 50%
•  ~30M in-memory profiles, 500 SPSS models
•  Purchases, web click stream, CDRs, IPTV viewing, behavioral events
•  Total events ~1.2B per day, 210K per second
•  Average latency 8 ms

Thomson Reuters Eikon –
•  News Ingest and Analytics
   •  News, Market Data & Meta Data Streams to HBase
•  Signal App: real-time technical analysis
   •  Bollinger Band, simple moving average, etc.
•  VolSurf: real-time volatility surfaces
   •  200k instruments, 100k msgs/sec

[Diagram labels, real-time advertising: Customers; Multichannel (@, Website); Capture: search keywords, page content, cookies, IP addresses, device info, actions within a window of time; In-Motion Behavior Analysis (IBM Streams): match with Global Id, map keywords to attributes and classification hierarchy, invoke behavior models/scores; Predictive Models: scoring, segmentation, analysis, association; Target Advertising Platform (Campaign Management); Advertisers]

[Diagram labels, customer data: Transactions from all customers; Transactions from this customer]
•  Descriptive: age, gender, family situation, zip code
•  Transactions: cardholder since YYYYMM, average transaction value, monthly transaction value, categories purchased, brands purchased
•  Interactions: web registration, web visits, customer service contacts, channel preference
•  Attitudes: satisfaction scores, shopper type, eco score

[Diagram labels, news ingest: Ingestion Technology; SDI Data (Metadata); Elektron (Market data); News; Others…; IBM Streams]
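
The technical-analysis bullet on this slide names Bollinger Bands and the simple moving average. As a rough sketch of how such an indicator can be computed incrementally over a stream of prices, the class below keeps a fixed-size window and returns the lower band, the moving average, and the upper band. The class name, the window of 5, the multiplier of 2, and the sample prices are assumptions for illustration, not details from the talk.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque, Optional, Tuple


class BollingerBands:
    """Sliding-window simple moving average with bands at +/- k std devs."""

    def __init__(self, window: int = 20, k: float = 2.0):
        self.prices: Deque[float] = deque(maxlen=window)  # oldest price drops out
        self.window = window
        self.k = k

    def update(self, price: float) -> Optional[Tuple[float, float, float]]:
        """Add one price; return (lower, middle, upper) once the window is full."""
        self.prices.append(price)
        if len(self.prices) < self.window:
            return None
        middle = mean(self.prices)               # simple moving average
        spread = self.k * pstdev(self.prices)    # population standard deviation
        return (middle - spread, middle, middle + spread)


bands = BollingerBands(window=5, k=2.0)
ticks = [100.0, 101.5, 99.8, 102.2, 101.0, 103.4, 98.9]
signals = []
for price in ticks:
    result = bands.update(price)
    # Flag prices that land outside the current bands.
    if result is not None and not (result[0] <= price <= result[2]):
        signals.append(price)
```

Recomputing the mean and deviation over the window on every tick keeps the sketch short. A production operator handling 100k msgs/sec would more likely maintain running sums per instrument.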

Page 7: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Real-Time Analytics from the Center to the Edge

Quarks for edge analytics on device or gateway –
•  Lightweight embedded streaming analytics runtime
•  Analyze events locally on the edge
•  Reduce communication costs by only sending relevant events

Device Hub –
•  Device management
•  Message broker (including MQTT & Kafka)
•  Public device hub API supports custom device hub

IBM Streams for streaming analytics –
•  High performance, full featured streaming analytics
•  Build windows of state and correlate across devices
•  Have access to data-of-record systems, e.g. medical history
•  Control edge device based upon analytics
•  Central job management/health summary
•  Automatic application connectivity

[Diagram labels: Cluster; Gateway; Edge Device (two shown); Messaging (MQTT, Kafka, etc.)]
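
The idea of analyzing events locally and sending only the relevant ones can be sketched in a few lines. This is not the Quarks API. The `EdgeFilter` class, the normal-range rule, and the temperature readings are hypothetical, and the `send` callback stands in for whatever publishes to the device hub (MQTT, Kafka, or similar).

```python
from typing import Callable, List


class EdgeFilter:
    """Runs on the device or gateway: forwards only out-of-range readings."""

    def __init__(self, low: float, high: float, send: Callable[[float], None]):
        self.low = low
        self.high = high
        self.send = send  # stand-in for a publish call to the device hub
        self.seen = 0
        self.sent = 0

    def on_reading(self, value: float) -> None:
        self.seen += 1
        # Local analysis: readings inside the normal range never leave the edge.
        if value < self.low or value > self.high:
            self.send(value)
            self.sent += 1


uplink: List[float] = []
edge = EdgeFilter(low=35.0, high=38.5, send=uplink.append)
for reading in [36.6, 36.7, 39.5, 36.8, 34.0]:
    edge.on_reading(reading)
```

Only two of the five readings cross the uplink in this run, which is the communication saving the slide describes. The central system can then correlate those events across devices.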

Page 8: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Real-Time Analytics – What Are You Waiting For?

The world is real-time, so analyze it in real time –
•  Acquire events as they happen
•  Analyze in real time to detect and predict insights
•  Act immediately to change outcomes

Forrester Research described the following key takeaway in their recent Wave report –
•  All Data Is Born Fast

“All data originates in a flash, whether it is from Internet-of-Things (IoT) devices, web clicks, transactions, or mobile app usage. But traditional analytics is done much, much later. Why wait? AD&D pros can use streaming analytics embedded in applications to get actionable value tout de suite. So what are you waiting for? Streaming analytics solutions can capture perishable insights on real-time data to bring immediate context to all IoT, mobile, web, and enterprise apps.”

The Forrester Wave™: Big Data Streaming Analytics Platforms, Q1 2016

The Forrester Wave is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave are trademarks of Forrester Research, Inc. The Forrester Wave is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.

Page 9: Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer

Thank You!