streaming overview and contenders programming languages...

© 2015 IBM Corporation 1 Streaming Meetup 30 August 2016 Roger Rea, IBM Streams Offering Manager Streaming Analytics and Python Streaming overview and contenders Programming Languages: SPL and Python


© 2015 IBM Corporation

Streaming Meetup

30 August 2016

Roger Rea, IBM Streams Offering Manager

Streaming Analytics and Python

Streaming overview and contenders

Programming Languages: SPL and Python


Why are we here?

Pizza?

Reese’s Peanut Butter Cups?

Two fast-growing trends are coming together: streaming analytics and Python. It's like peanut butter and chocolate!

Streaming analytics is a superset of complex event processing, with clustered runtimes to support a greater volume of events, the ability to analyze unstructured data, and more expressive programming paradigms. It also offers very low latency to enable real-time analytic processing.

Python is a widely used high-level, general-purpose, interpreted, dynamic programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than possible in languages such as C++ or Java. The language provides constructs intended to enable clear programs on both a small and large scale.

This meetup will provide an overview and comparison of many different contenders in the fast-growing streaming analytics space, then show a demo of IBM Streams technology that allows programs written completely in Python to call Streams libraries and then deploy those apps to the Streams runtime.


Speaker Biography

Roger Rea leads the cross functional team for marketing, sales, development,

services, product management and support for IBM Streams within Analytics

Platform Services. Prior to this assignment, Roger held a variety of sales,

technical, educational, marketing and management jobs at IBM, Skill

Dynamics and Tivoli Systems.

Roger earned a Bachelor of Science in Mathematics and Computer Science,

cum laude, from the University of California at Los Angeles (UCLA). He has

also received a Masters' Certificate in Project Management from George

Washington University.

Roger lives in Cary, North Carolina, USA with his wife and two children and

enjoys skiing, kayaking, reading, cooking and singing in his church choir.

Roger Rea
IBM Streams Senior Offering Manager

[email protected], 1-919-345-7386

[Map labels: England, Wales, London, Birmingham, Rea River]

Rhea and Rea evolved in Britain from the ancient

Welsh word “Rhe” meaning “rapid stream.”


Audience Biography

Developers?

What languages?

•Python, Java, C/C++/C#, SPL, Ruby, ??

Data Scientists?

What tools?

•R, SPSS, SAS, WEKA, MATLAB, ?


What is Streaming Analytics?

Software that can filter, aggregate, enrich, and analyze a high throughput of data from multiple, disparate live data sources and in any data format to identify simple and complex patterns to provide applications with context to detect opportune situations, automate immediate actions, and dynamically adapt.


Time is ripe for a new era of computing

Emerging trends create need for new languages

Scientific programming: Fortran
Business programming: Cobol
Systems programming at higher level: C
Increased productivity: C++
Database programming: SQL
Web programming: Java
Data scientist: Python
Streaming data and multicore architectures: Streams Processing Language
Stored data and multicore architectures: Hadoop, Map-Reduce, Spark


Who delivers Streaming Analytics?

The Forrester Wave™: Big Data Streaming Analytics Platforms, Q1 2016

Streaming Analytics 2016: Market Report Paper by Bloor, author Ronnie Beggs, published June 2016


The Forrester Wave™:

Big Data Streaming Analytics, Q1 2016


Stream Computing

[Table: feature comparison of streaming products; cell values not shown]

Features compared: Open Source, Extensible platform, Managed Service, Batch & Streaming, Command Line i/face, Web & JMX mgmt, At Least Once, Exactly once, State, Windows, Back pressure, Machine Learning, Model scoring, Video/Image, Geospatial, Text Analytics, Visual development, Automated HA, Enterprise adapters, Open source adapters

Products compared: Esper, IBM Streams, Storm, Flink, Spark Streaming, Dataflow


IBM Streams Overview

Integrated Development Environment: Development and Management
Scale-Out Runtime: Flexibility and Scalability
Analytic Toolkits & Adapters: Functional and Optimized

Cloud and on premise available for flexible deployment


Streams next release 3Q16

• Apache Edgent support:

Java based Streaming analytics targeted at Internet of Things

market to deliver analytics at the edge

• Streams Rules:

Rules compiler to enable ODM Rules to run natively on Streams

for superior performance and low latency

• Python development:

Python developers can easily call APIs to Streams libraries which

are then compiled and deployed to Streams

Technical Foundation:

1. Speech to Text toolkit

2. Cybersecurity Toolkit enhancements

3. Submission time fusion of operators

4. Asynch non-blocking checkpointing

5. Streams consistent region using RDMA

Information regarding potential future products is intended to outline our

general product direction and it should not be relied on in making a

purchasing decision. The information mentioned regarding potential future

products is not a commitment, promise, or legal obligation to deliver any

material, code or functionality. Information about potential future products

may not be incorporated into any contract. The development, release, and

timing of any future features or functionality described for our products

remains at our sole discretion.


IBM Streams: Overview of our best of breed programming model

Streams Processing Language (SPL)

Input (Source), Process (Operators), Output (Sink)

Platform optimized compilation

[Figure: example application graph. Node labels: Meters, Usage Model, Company Filter, Usage Contract, Text Extract (twice), Degree History, Compare History, Temp Action, Store History, Season Adjust, Daily Adjust, Weather Data. Stage labels: Filter, Fuse, Cleanse, Classify, Analyze, Model, Act, Persist.]

Operators:
- SPL or custom with Java, C++, and now, Python
- Compiled into processing elements (PEs) for deployment


IBM Streams at a glance

Hadoop

Data Warehouse

Communications Data Sources

TCP/IP

UDP/IP

HTTP

FTP

RSS

Messaging Toolkit (Kafka, XMS, IBM MQ, Apache ActiveMQ, RabbitMQ, MQTT, MQ Low Latency Messaging)

IBM DataStage

IBM Data Replication

Functions:

• Filter

• Enrich

• Normalize

• Windowed Aggregations

• Machine Learning

• Scoring (SPSS, R, MLlib)

• CEP & Pattern Matching

• Geospatial

• Video/Image

• Text Analytics (AQL)

• Speech to Text

• IBM ODM Rules

IBM Streams

Scale-out Runtime

Hadoop: HDFS, GPFS, Hive, HBase, BigSQL, Parquet, Thrift, Avro

RDBMS: IBM DB2, IBM DB2 Parallel writer, IBM Informix, IBM BigInsights BigSQL, IBM Netezza, IBM Netezza NZLoad, solidDB, Oracle, Microsoft SQL Server, MySQL, Teradata, Aster, HP Vertica

NoSQL:
- Key Value Stores (Memcached, Redis, Redis-Cluster, Aerospike)
- Column Oriented Stores (Cassandra, HBase)
- Document Oriented Stores (IBM Cloudant, Mongo, Couchbase)


IBM Streams: A pioneering platform rooted in real-time analytics since 2003

A technology hardened in the IBM Research labs for the first six years in collaboration with a quality conscious U.S. Government agency. It has been a fully supported IBM product since 2009 (v1.0 to v4.1 as of 2015).

Mining in Microseconds &

Statistics

Predictive

Advanced Mathematical Models (IBM Research)

Natural Language Processing

Geospatial

Acoustic (IBM Research and Open Source)

Entities & Relationships

Image & Video (Open Source)


Development Environment

Integrated Development

Environment

Development and Management

Streams Processing Language

Visual Composition Tools


IBM Streams: Development time terminology

Operator: The fundamental building block of the Streams Processing Language. Operators process data from streams and may produce new streams.

Stream: An infinite sequence of structured tuples. Can be consumed by operators on a tuple-by-tuple basis or through the definition of a window.

Tuple: A structured list of attributes and their types. Each tuple on a stream has the form dictated by its stream type.

Stream type: Specification of the name and data type of each attribute in the tuple.

Window: A finite, sequential group of tuples, based on count, time, attribute value, or punctuation marks.

[Figure: a Streams application made of operators connected by streams. Example tuples on one stream (attributes directory, filename): ("/img", "farm"), ("/img", "bird"), ("/opt", "java"), ("/img", "cat"). Example tuples on another stream (attributes height, width, data): (640, 480), (1280, 1024), (640, 480).]
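The terms above can be mimicked in plain Python to get a feel for them. This is only a sketch and does not use any Streams API: the names `FileRef`, `file_stream`, `only_images` and `count_windows` are made up for illustration, and the window shown is the simplest count-based kind (non-overlapping groups).

```python
from collections import namedtuple
from itertools import islice

# Stream type: the name of each attribute carried by a tuple on the stream.
FileRef = namedtuple("FileRef", ["directory", "filename"])

def file_stream():
    """A stream: a sequence of tuples (finite here so the example terminates)."""
    yield FileRef("/img", "farm")
    yield FileRef("/img", "bird")
    yield FileRef("/opt", "java")
    yield FileRef("/img", "cat")

def only_images(stream):
    """An operator: consumes a stream tuple by tuple and produces a new stream."""
    for t in stream:
        if t.directory == "/img":
            yield t

def count_windows(stream, size):
    """A count-based window: finite, sequential groups of up to `size` tuples."""
    it = iter(stream)
    while True:
        window = list(islice(it, size))
        if not window:
            return
        yield window

windows = list(count_windows(only_images(file_stream()), 2))
for w in windows:
    print([t.filename for t in w])
```

Generators are used because, like a stream, they hand over one tuple at a time instead of materializing the whole sequence.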


Anatomy of an Operator Invocation

Operators share a common structure; italics are sections to fill in

Reading an operator invocation

Declare a stream stream-name

With attributes from stream-type

that is produced by MyOperator

from the input(s) input-stream

MyOperator behavior defined by

logic, parameters, windowspec, and configuration; output

attribute assignments are specified in output

For the example:

Declare the stream Sale with the attribute item, which is a raw

(ASCII) string

Join the Bid and Ask streams with

sliding windows of 30 seconds on Bid, and 50 tuples of Ask

When items are equal, and Bid price is greater than or equal to

Ask price

Output the item value on the Sale stream

Syntax:

    stream<stream-type> stream-name = MyOperator(input-stream; …)
    {
        logic  logic ;
        window windowspec ;
        param  parameters ;
        output output ;
        config configuration ;
    }

Example:

    stream<rstring item> Sale = Join(Bid; Ask)
    {
        window Bid: sliding, time(30);
               Ask: sliding, count(50);
        param  match: Bid.item == Ask.item && Bid.price >= Ask.price;
        output Sale: item = Bid.item;
    }
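To make the window and match logic of the example concrete, here is a rough plain-Python imitation. It is a sketch under assumptions, not how the Streams Join operator is implemented: the class `BidAskJoin` and its methods are hypothetical, timestamps are passed in by the caller, and bids and asks are plain dictionaries.

```python
from collections import deque

class BidAskJoin:
    """Toy two-input join: Bid window = last 30 seconds, Ask window = last 50 tuples."""

    def __init__(self, bid_seconds=30.0, ask_count=50):
        self.bid_seconds = bid_seconds
        self.bids = deque()                  # (timestamp, bid) pairs, oldest first
        self.asks = deque(maxlen=ask_count)  # deque discards the oldest ask when full

    def _evict_bids(self, now):
        # Drop bids that have slid out of the time-based window.
        while self.bids and now - self.bids[0][0] > self.bid_seconds:
            self.bids.popleft()

    @staticmethod
    def _match(bid, ask):
        # Same condition as the match parameter in the SPL example.
        return bid["item"] == ask["item"] and bid["price"] >= ask["price"]

    def on_bid(self, now, bid):
        """A new bid is compared with every ask currently in the Ask window."""
        self._evict_bids(now)
        self.bids.append((now, bid))
        return [{"item": bid["item"]} for ask in self.asks if self._match(bid, ask)]

    def on_ask(self, now, ask):
        """A new ask is compared with every bid currently in the Bid window."""
        self._evict_bids(now)
        self.asks.append(ask)
        return [{"item": b["item"]} for _, b in self.bids if self._match(b, ask)]

join = BidAskJoin()
join.on_bid(0.0, {"item": "x", "price": 10})
print(join.on_ask(1.0, {"item": "x", "price": 9}))   # one Sale tuple for item "x"
```

Each returned dictionary plays the role of a tuple on the Sale stream, carrying only the item attribute as in the output clause.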


IBM Streams: A rich set of data types to code powerful analytics and optimize performance

(any)
    (primitive)
        boolean
        enum
        (numeric)
            (integral)
                (signed): int8, int16, int32, int64
                (unsigned): uint8, uint16, uint32, uint64
            (floatingpoint)
                (float): float32, float64, float128
                (decimal): decimal32, decimal64, decimal128
            (complex): complex32, complex64, complex128
        timestamp
        (string): rstring, ustring
        blob
        xml
    (composite)
        (collection): list, set, map
        tuple

User-defined types

    type Integers = list<int32>;
    type MySchema = rstring s, Integers ls;


IBM Streams: Runtime terminology

Application
– Data flow graph of operator instances connected to each other via stream connections

Operator
– Reusable stream analytic
– Input ports: receive data / Output ports: produce data
– Source: no input ports / Sink: no output ports

Operator Instance
– A specific instantiation of an operator

Stream
– Continuous series of tuples, generated by an operator instance’s output port

Stream connection
– A stream connected to a specific operator instance input port

Processing Element (PE)
– A runtime process that executes a set of operator instances

Job
– An application instance running on a set of hosts

Example:

    (stream<Type> A) as O1 = MySrc() {}
    () as O2 = MySink(A) {}
    () as O3 = MySink(A) {}

[Figure: operator instance O1 (MySrc) produces stream A; stream connections deliver A to operator instances O2 and O3 (both MySink).]


IBM Streams: From operators to running jobs

Streams application graph:

A directed, possibly cyclic, graph

A collection of operators

Connected by streams

Each complete application is a potentially deployable job

Jobs are deployed to a Streams runtime environment, known as a Streams

Instance (or simply, an instance)

An instance can include a single processing node (hardware)

Or multiple processing nodes

[Figure: a Streams instance running an application graph of Src, OP, and Sink operators, hosted on either a single h/w node or multiple nodes.]


Linear Road Benchmark: Streams application graph for 1 expressway

[Figure: application graph. Components: Linear Road Data Feeder (TCP or Kafka or File); Position report and accident Analytics for East and West traffic (Type 0 and 1); Daily expenditure Analytics (Type 3); Account balance Analytics (Type 2); Result notifications.]

End to end average throughput: 1.87K events per second (20.2 Million total events in 3 hours)

Response type    Below 1 second    At 1 second
Type 0           98%               2%
Type 1           97.8%             2.2%
Type 2           98.5%             1.5%
Type 3           99.9%             0.1%

(Linear Road specification states 1 to 5 seconds as an acceptable response time)

Application component          # of CPU cores
Data feeder                    1
Event receiver and router      1
Type 0 and Type 1 analytics    1
Type 2 analytics               1
Type 3 analytics               1
Type 0 result writer           1
Type 1 result writer           1

Memory Usage: 1.8GB
CPU Utilization: 2%
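The average throughput figure follows directly from the two numbers quoted with it. A quick check, assuming "3 hours" means exactly 10,800 seconds:

```python
total_events = 20.2e6       # "20.2 Million total events"
duration_s = 3 * 60 * 60    # "in 3 hours", taken as exactly 10,800 seconds
throughput = total_events / duration_s

print(f"{throughput / 1000:.2f}K events per second")
```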


Streams results

L-Rating 50 on one Azure node, 200 on 4

Azure nodes

1 node, 16 cores, nearly 1B events

4 nodes, 64 cores, nearly 4B events

Linear scalability

Handles bursty traffic

99% of responses sub-second

# of x-ways    # of cars       Entries          Memory      CPU
1              278973          19.2 Million     2.2 GB      2%
2              558726          38.5 Million     4.5 GB      4%
5              1.3 Million     96.3 Million     10.9 GB     7%
10             2.7 Million     192.5 Million    22.0 GB     11%
15             4.1 Million     289.7 Million    33.0 GB     16%
20             5.6 Million     385.2 Million    43.5 GB     20%
25             6.9 Million     482.0 Million    54.5 GB     26%
50             14.0 Million    963.1 Million    109.0 GB    31%
100            27.6 Million    1.9 Billion      220 GB      22%
150            41.5 Million    2.8 Billion      330 GB      33%
200            55.0 Million    3.8 Billion      440 GB      45%

[Two charts: average throughput (K events/second) versus number of expressways, one for 1 to 50 expressways (y-axis 0 to 100) and one for 50 to 200 expressways (y-axis 0 to 400).]


Python History

Conceived in the late 1980s

Implementation began by December 1989

Multi-paradigm programming language: object-oriented programming and

structured programming are fully supported

Dynamic typing and late binding

Core philosophy

Beautiful is better than ugly

Explicit is better than implicit

Simple is better than complex

Complex is better than complicated

Readability counts

Guido van Rossum,

the creator of Python


Explore Python

Indentation matters

Variables

Numbers (integers and floats), Strings, Lists, Tuples, Dictionaries

Functions

Looping:

    for <iteration variable> in <list>:
        <block of statements>

Conditional execution:

    if <condition>:
        <block of statements>

Classes
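The items on this slide can be seen together in one short script. All names and values below are made up for illustration:

```python
# Variables: numbers, strings, lists, tuples, dictionaries
count = 3                               # integer
ratio = 0.5                             # float
name = "Streams"                        # string
temps = [71, 68, 75]                    # list (mutable)
point = (640, 480)                      # tuple (immutable)
sizes = {"width": 480, "height": 640}   # dictionary

# Function
def average(values):
    return sum(values) / len(values)

# Looping and conditional execution; indentation marks each block
hot = []
for t in temps:
    if t > 70:
        hot.append(t)

# Class
class Reading:
    def __init__(self, sensor, value):
        self.sensor = sensor
        self.value = value

    def is_above(self, limit):
        return self.value > limit

r = Reading("engine", average(temps))
print(hot, r.is_above(70))
```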


Explore Python

As of August 2016, the Python Package Index, the official repository of

third-party software for Python, contains over 86,000 packages offering a

wide range of functionality, including:

graphical user interfaces, web frameworks, multimedia, databases,

networking and communications

test frameworks, automation and web scraping, documentation tools,

system administration

scientific computing, text processing, image processing

Notebooks


Streams & Python together: 2 capabilities

For the Python developer: Code all in Python, call Streams toolkits, run on Streams

Hello World:

    import mymodule  # user-supplied module that defines hello_world()
    from streamsx.topology.topology import Topology
    import streamsx.topology.context

    topo = Topology("HelloWorld")
    hw = topo.source(mymodule.hello_world)  # stream built from a Python callable
    hw.sink(print)                          # print each tuple on the stream
    streamsx.topology.context.submit("STANDALONE", topo.graph)

For more, visit:
http://ibmstreams.github.io/streamsx.topology/doc/spldoc/html/tk$com.ibm.streamsx.topology/ns$com.ibm.streamsx.topology.python$1.html
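The slide does not show `mymodule`. A minimal sketch of what it might contain, assuming `hello_world` only needs to be a zero-argument callable that returns an iterable of tuples; the loop at the end is a local stand-in for source and sink that runs without any Streams runtime:

```python
# mymodule.py (hypothetical; this file is not shown on the slide)
def hello_world():
    """Return the items for the source stream as an ordinary Python iterable."""
    return ["Hello", "World!"]

# Local stand-in for source -> sink(print), with no Streams runtime involved.
for item in hello_world():
    print(item)
```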


Streams & Python together: 2 capabilities

For the SPL developer: Decorate Python functions inline in SPL

    # Import the SPL decorators
    from streamsx.spl import spl

    # Defines the SPL namespace for any functions in this module
    # Multiple modules can map to the same namespace
    def splNamespace():
        return "com.ibm.streamsx.topology.pysamples.mail"

    @spl.pipe
    def SimpleFilter(a, b):
        """Filter tuples only allowing output if the first attribute is less than
        the second. Returns the sum of the first two attributes."""
        if (a < b):
            return a+b,

For more, visit:
http://ibmstreams.github.io/streamsx.topology/doc/spldoc/html/tk$com.ibm.streamsx.topology/ns$com.ibm.streamsx.topology.python$6.html
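Two details of SimpleFilter are easy to miss: the trailing comma and the missing else branch. The undecorated copy below (the name `simple_filter` is made up) shows what the function returns in each case. How the decorator treats those values is an assumption here: a returned Python tuple is taken to become the output tuple, and None is taken to mean that nothing is emitted.

```python
def simple_filter(a, b):
    """Plain-Python copy of the slide's logic, without the @spl.pipe decorator."""
    if a < b:
        return a + b,   # trailing comma: a one-element tuple, i.e. one output attribute
    return None         # written out here; the slide's version falls through to None

print(simple_filter(1, 2))   # a one-element tuple holding the sum
print(simple_filter(2, 1))   # None
```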


Steps to try it out

1. Download Streams Quick Start Edition: ibm.com/streams

2. Clone the streamsx.topology project: github.com/IBMStreams/streamsx.topology
   - first clone, then hit 'clone or download' to download to your machine

3. Extract to streamsx.topology in streamsadmin of Streams Quick Start Edition

4. cd to streamsx.topology directory and type 'ant'

5. cd to com.ibm.streamsx.topology/opt/python/packages
   - Note the current directory path and type 'export PYTHONPATH=$PYTHONPATH:<directory path>', for example:
     export PYTHONPATH=$PYTHONPATH:/home/streamsadmin/com.ibm.streamsx.topology/com.ibm.streamsx.topology/opt/python/packages
   - Also, add this to your .bashrc profile: run 'gedit ~/.bashrc', then add the export statement as the last line of the file and save
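After setting PYTHONPATH, a quick way to confirm that Python can see the package is to ask the import system where it would find it. This is an optional check that is not part of the original steps; the helper name `can_import` is made up, and `streamsx` is assumed to be the top-level package name based on the imports shown on the earlier slides.

```python
import importlib.util
import sys

def can_import(module_name):
    """True if Python can locate the top-level module `module_name` on sys.path."""
    return importlib.util.find_spec(module_name) is not None

if __name__ == "__main__":
    if can_import("streamsx"):
        print("streamsx found")
    else:
        print("streamsx not found; paths searched by Python:")
        for entry in sys.path:
            print("  ", entry)
```

Run it from a new terminal so that the export statement in .bashrc has taken effect.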


Steps to try it out (continued)

6. Download Anaconda for Jupyter notebook: continuum.io/downloads

7. Install Anaconda

8. In Streams Quick Start Edition, open Streams Domain Manager
   - Ensure the domain is running!

9. In Streams Quick Start Edition Domain Manager, start the Streams Console

10. In Streams Console, ensure Instance is running

11. In Streams Quick Start Edition terminal window, type

• 'pip install git+https://github.com/pybrain/pybrain.git'

• This is a dependency for the NetDemo demo

12. Download and extract demos.zip into your streamsadmin directory

13. In the terminal, cd to the demos folder

14. Type 'jupyter notebook'

15. A browser should pop up. In it, change to the NetDemo directory


Steps to try it out (continued)

16. Click on the NetDemo.ipynb

17. The NetDemo demo does three things:

a) Creates a dataset (engine temp vs. probability of failure). This is the first line.

b) Creates a model to predict a probability of failure given an engine temp.

c) Creates a streaming application using the Python API which uses the model.

18. Put cursor in first cell, click on ‘run cell’

19. Repeat with next cell to build model

20. Repeat with last cell to run on Streams

1. This compiles Python to SPL

2. Then compiles SPL to C/C++

3. Then creates .sab executable

4. Then deploys to the Streams runtime

5. Returning code to the Jupyter notebook

21. When in doubt, go to kernel -> restart and clear output.
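The three parts of NetDemo (dataset, model, streaming application) can be outlined in plain Python. This is not the demo's code, which is not reproduced on the slides: the logistic curve used for the dataset is an assumed shape, a nearest-temperature lookup is swapped in for the demo's trained model, and a generator stands in for the Streams application.

```python
import math

def make_dataset(temps):
    """Synthetic (engine temp, failure probability) pairs from an assumed logistic curve."""
    return [(t, 1.0 / (1.0 + math.exp(-(t - 100.0) / 10.0))) for t in temps]

def build_model(dataset):
    """Stand-in model: return the probability recorded for the nearest known temperature."""
    def predict(temp):
        nearest = min(dataset, key=lambda pair: abs(pair[0] - temp))
        return nearest[1]
    return predict

def score_stream(readings, model):
    """Apply the model to each reading as it arrives, one tuple at a time."""
    for temp in readings:
        yield temp, model(temp)

dataset = make_dataset(range(50, 151, 10))
model = build_model(dataset)
results = list(score_stream([62, 100, 149], model))
for temp, probability in results:
    print(temp, round(probability, 3))
```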


Some observations

I’m not a programmer – that was probably obvious!

Even when I was, it was procedural, not OO, so I found Python confusing

Some confusing terminology

‘tuple’ used in both Streams and Python

SPL Map == Python Dict {}

Python idiosyncrasies – indents, parentheses, square and curly brackets


Additional resources

Visit:

ibm.com/streams

github.com/Walmart

github.com/IBMStreams/benchmarks


Legal Disclaimer

• © IBM Corporation 2015. All Rights Reserved.

• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is

provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are subject to change by IBM without notice. IBM shall

not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of,

creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this

presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.

Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.

• If the text contains performance statistics or references to benchmarks, insert the following language; otherwise delete:

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending

upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no

assurance can be given that an individual user will achieve results similar to those stated here.

• If the text includes any customer examples, please confirm we have prior written approval from such customer and insert the following language; otherwise delete:

All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance

characteristics may vary by customer.

• Please review text for proper trademark attribution of IBM products. At first use, each product name must be the full name and include appropriate trademark symbols (e.g., IBM Lotus® Sametime® Unyte™).

Subsequent references can drop “IBM” but should include the proper branding (e.g., Lotus Sametime Gateway, or WebSphere Application Server). Please refer to http://www.ibm.com/legal/copytrade.shtml for

guidance on which trademarks require the ® or ™ symbol. Do not use abbreviations for IBM product names in your presentation. All product names must be used as adjectives rather than nouns. Please list all

of the trademarks that you use in your presentation as follows; delete any not included in your presentation. IBM, the IBM logo, Lotus, Lotus Notes, Notes, Domino, Quickr, Sametime, WebSphere, UC2,

PartnerWorld and Lotusphere are trademarks of International Business Machines Corporation in the United States, other countries, or both. Unyte is a trademark of WebDialogs, Inc., in the United States, other

countries, or both.

• If you reference Adobe® in the text, please mark the first use and include the following; otherwise delete:

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

• If you reference Java™ in the text, please mark the first use and include the following; otherwise delete:

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

• If you reference Microsoft® and/or Windows® in the text, please mark the first use and include the following, as applicable; otherwise delete:

Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.

• If you reference Intel® and/or any of the following Intel products in the text, please mark the first use and include those that you use as follows; otherwise delete:

Intel, Intel Centrino, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

• If you reference UNIX® in the text, please mark the first use and include the following; otherwise delete:

UNIX is a registered trademark of The Open Group in the United States and other countries.

• If you reference Linux® in your presentation, please mark the first use and include the following; otherwise delete:

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.

• If the text/graphics include screenshots, no actual IBM employee names may be used (even your own), if your screenshots include fictitious company names (e.g., Renovations, Zeta Bank, Acme) please update

and insert the following; otherwise delete: All references to [insert fictitious company name] refer to a fictitious company and are used for illustration purposes only.


Realtime ECG Monitoring with Python and Streams

Real-time analytics using Python and IBM Streams

Demo consists of two applications:

PhysionetIngestService – ingests ECG data from physionet.org; data is published using the Publish operator for downstream analytics

ECGPatientDataViz

• Application written in Python

• Ingest data from Physionet -> R Peak Detection using Biosppy -> Print

• Sets up two views – one for visualizing raw ECG data, one for R-Peak detection


Realtime ECG Monitoring with Python and Streams

Python Application in Jupyter Notebook

Real-time ECG visualization

Demonstrates how we can integrate with Python Visualization Library using View

Python Bokeh Visualization Library (http://bokeh.pydata.org/en/latest/)


Realtime ECG Monitoring with Python and Streams

Real-time R-Peak Detection in ECG Data

Real-time Poincaré plot to show Heart Rate Variability – the more variability, the healthier the heart is.

Demonstrates how to use existing Python analytics library in real-time analytics (http://biosppy.readthedocs.io/en/stable/#)

Info about Poincare Plot (https://en.wikipedia.org/wiki/Poincaré_plot)
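A Poincaré plot is built from the intervals between successive R-peaks: each interval is plotted against the one that follows it. The sketch below shows only that arithmetic. It does not use Biosppy or Bokeh, the function names are made up, and the R-peak times are invented sample values in milliseconds.

```python
def rr_intervals(r_peak_times_ms):
    """Differences between consecutive R-peak times (RR intervals), in milliseconds."""
    return [b - a for a, b in zip(r_peak_times_ms, r_peak_times_ms[1:])]

def poincare_points(rr):
    """Pairs (RR[n], RR[n+1]); plotting them as x, y gives the Poincaré plot."""
    return list(zip(rr, rr[1:]))

peaks_ms = [0, 800, 1700, 2500, 3350]   # made-up R-peak times
rr = rr_intervals(peaks_ms)
points = poincare_points(rr)
print(rr)
print(points)
```

A wider scatter of these points away from the diagonal corresponds to more beat-to-beat variability.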