Transcript
Douglas Butler, Product Manager
massively parallel, lock free, FAST
distributed SQL database
in-memory, on-disk
ACID
JSON and geospatial
transactions and analytics
2 Minute Install
A Simple Pipeline
from pystreamliner.api import Extractor
class CustomExtractor(Extractor):
    def initialize(self, streaming_context, sql_context, config, interval, logger):
        # Log a message when the extractor is set up.
        logger.info("Initialized Extractor")

    def next(self, streaming_context, time, sql_context, config, interval, logger):
        # Build a single-column DataFrame ("number") holding the values 0 through 9.
        rdd = streaming_context._sc.parallelize([[x] for x in range(10)])
        return sql_context.createDataFrame(rdd, ["number"])
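To try out the shape of the extractor above without a cluster, the following is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `Extractor` base class is a hypothetical stand-in for `pystreamliner.api.Extractor`, `run_batches` is a hypothetical driver (the slides do not show how Streamliner actually schedules calls), and `next` returns plain Python rows instead of a Spark DataFrame.

```python
import logging


class Extractor(object):
    """Hypothetical stand-in for pystreamliner.api.Extractor (not the real class)."""

    def initialize(self, streaming_context, sql_context, config, interval, logger):
        pass

    def next(self, streaming_context, time, sql_context, config, interval, logger):
        raise NotImplementedError


class RangeExtractor(Extractor):
    """Same method signatures as the slide's CustomExtractor, minus Spark."""

    def initialize(self, streaming_context, sql_context, config, interval, logger):
        logger.info("Initialized Extractor")

    def next(self, streaming_context, time, sql_context, config, interval, logger):
        # One row per number, mirroring the single "number" column on the slide.
        return [[x] for x in range(10)]


def run_batches(extractor, batches, interval=1.0):
    """Hypothetical driver: assumes initialize runs once, then next once per batch."""
    logger = logging.getLogger("extractor-sketch")
    extractor.initialize(None, None, {}, interval, logger)
    return [
        extractor.next(None, i * interval, None, {}, interval, logger)
        for i in range(batches)
    ]


results = run_batches(RangeExtractor(), batches=3)
```

Each entry in `results` is one batch of ten single-value rows; in the real pipeline, `sql_context.createDataFrame` would turn rows like these into a DataFrame with a `number` column.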
> memsql-ops pip install [package]
distributed cluster-wide
any Python package
bring your own
Real-time pipeline
Q & A time