© 2018 GridGain Systems, Inc. GridGain Company Confidential

Stream Ingestion, Processing and Analytics Using In-Memory Computing

Denis Magda, Apache Ignite PMC Chair, GridGain Product Management

Agenda

• Streaming Analytics

• Ignite Native Streaming APIs

• Ignite Streamers Ecosystem

• Ignite and Spark: Better Together

• GridGain Kafka Connector Integration

• Q & A

Streaming Analytics

Ignite Native Streaming APIs

Apache Ignite

[Architecture diagram]

• SQL, Transactions, Compute, Services, ML, Streaming, Key/Value

• Memory-Centric Storage: Scale to 1000s of Nodes & Store TBs of Data

• Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)

• Third-Party Persistence: Keep Your Own DB (RDBMS, HDFS, NoSQL)

• IoT, Financial Services, Pharma & Healthcare, E-Commerce, Travel & Logistics, Telco

Ignite Streaming APIs

• Ignite Data Streamer
– Partitioning of streams of data
– Ignite streaming powerhouse

• Stream Receivers and Transformers
– Last-call data updates and analysis

• Continuous Queries
– Data update notifications and post-processing
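The three ideas on this slide can be modeled in a few lines of plain Java. The sketch below is a conceptual illustration only: `StreamingSketch`, `Streamer`, `countWords` and the hash-modulo partitioning are hypothetical names and simplifications, not Ignite's API or its actual affinity function. It shows entries being buffered per partition and flushed in batches to the owning node, a transformer-style receiver that combines each incoming value with the stored one, and a listener that is notified of every update in the way a continuous query would be.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;

/** Conceptual model only; none of these names belong to the Ignite API. */
final class StreamingSketch {

    static final class Streamer {
        // One map per "data node"; node i owns partition i in this simplified model.
        private final List<Map<String, Long>> nodes = new ArrayList<>();
        // One pending batch per partition.
        private final List<List<SimpleEntry<String, Long>>> buffers = new ArrayList<>();
        private final int batchSize;
        private final BinaryOperator<Long> receiver;      // (stored value or null, incoming) -> new value
        private final BiConsumer<String, Long> listener;  // called after every update

        Streamer(int nodeCount, int batchSize,
                 BinaryOperator<Long> receiver, BiConsumer<String, Long> listener) {
            for (int i = 0; i < nodeCount; i++) {
                nodes.add(new HashMap<>());
                buffers.add(new ArrayList<>());
            }
            this.batchSize = batchSize;
            this.receiver = receiver;
            this.listener = listener;
        }

        /** Routes the entry to its partition buffer and flushes when the batch is full. */
        void addData(String key, long value) {
            int p = Math.floorMod(key.hashCode(), nodes.size());
            buffers.get(p).add(new SimpleEntry<String, Long>(key, value));
            if (buffers.get(p).size() >= batchSize) {
                flush(p);
            }
        }

        /** Applies the receiver on the owning node, then notifies the listener. */
        private void flush(int p) {
            Map<String, Long> store = nodes.get(p);
            for (SimpleEntry<String, Long> e : buffers.get(p)) {
                Long updated = receiver.apply(store.get(e.getKey()), e.getValue());
                store.put(e.getKey(), updated);
                listener.accept(e.getKey(), updated);
            }
            buffers.get(p).clear();
        }

        /** Flushes whatever is still buffered. */
        void close() {
            for (int p = 0; p < nodes.size(); p++) {
                flush(p);
            }
        }

        /** Merges all node stores into one sorted view. */
        Map<String, Long> snapshot() {
            Map<String, Long> all = new TreeMap<>();
            nodes.forEach(all::putAll);
            return all;
        }
    }

    /** Streams one increment per word and returns the final counts. */
    static Map<String, Long> countWords(List<String> words, int nodeCount, int batchSize,
                                        BiConsumer<String, Long> listener) {
        BinaryOperator<Long> increment = (old, delta) -> old == null ? delta : old + delta;
        Streamer streamer = new Streamer(nodeCount, batchSize, increment, listener);
        for (String w : words) {
            streamer.addData(w, 1L);
        }
        streamer.close();
        return streamer.snapshot();
    }

    public static void main(String[] args) {
        Map<String, Long> totals = countWords(
            Arrays.asList("a", "b", "a", "c", "a"), 3, 2,
            (key, value) -> System.out.println("updated " + key + " -> " + value));
        System.out.println(totals);
    }
}
```

Because updates are applied at flush time rather than on every `addData` call, the listener sees changes in batch order, which is the trade-off a batching streamer makes for throughput.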

Ignite Streamers Ecosystem

• Various Streaming Technologies
– Kafka, Spark, Flink, Storm, etc.
– Process, enrich and push to Ignite

• Ignite as a final store for streaming data
– Streaming Analytics

[Diagram: three data nodes]

Ignite and Spark: Better Together

Comparing Ignite and Spark

Ignite
• Distributed memory-centric database
• Fully fledged compute platform: SQL, transactions, key-value, collocated processing, ML/DL
• OLAP and OLTP

Spark
• Ingests data from HDFS or another storage
• Streaming and compute engine
• Inclined towards OLAP and focused on MapReduce (MR) payloads

Ignite and Spark Together

Ignite is a memory-centric store for Spark

• No data movement from Ignite to Spark

• In-place query execution

• Boost DataFrame and SQL performance

• Share state and data among Spark jobs

• Faster data and streaming analytics

Ignite and Spark Integration

[Architecture diagram: a Spark application; three Spark workers, each running two Spark jobs; three GridGain nodes. Other labels: Yarn, Mesos, Docker, HDFS.]

• Share state and data among Spark jobs

• No data movement

• Boost DataFrame and SQL performance

• SQL on top of RDDs

• In-place query execution

GridGain Kafka Connector Integration

GridGain Confluent Integration

GridGain Connector Advantages over Ignite Kafka Integration

• Advanced parallelism

• Exactly-once processing semantics

• Single connector per multiple caches/topics

• Filtering of source and sink connectors

• Enterprise Ready
– Supported by GridGain, certified by Confluent
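The slides do not say how the connector achieves exactly-once semantics. One generic way a sink can get that effect on top of at-least-once delivery is to remember the last applied offset per topic partition and skip anything redelivered at or below it. The sketch below illustrates that general technique only; `ExactlyOnceSinkSketch` and its methods are hypothetical and make no claim about GridGain's implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Generic offset-tracking sink; an illustration, not the GridGain connector. */
final class ExactlyOnceSinkSketch {
    // Stand-in for the target cache: key -> running total.
    private final Map<String, Long> cache = new HashMap<>();
    // Highest offset already applied for each topic partition.
    private final Map<Integer, Long> lastApplied = new HashMap<>();

    /**
     * Applies a record unless its offset was already processed.
     * Returns true if the record changed the cache, false if it was a duplicate.
     * In a real system the cache write and the offset update would have to be
     * committed atomically (for example in one transaction); here both are
     * plain in-memory maps updated in sequence.
     */
    boolean apply(int partition, long offset, String key, long delta) {
        long last = lastApplied.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // redelivery after a retry or restart: ignore
        }
        cache.merge(key, delta, Long::sum);
        lastApplied.put(partition, offset);
        return true;
    }

    long get(String key) {
        return cache.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        ExactlyOnceSinkSketch sink = new ExactlyOnceSinkSketch();
        sink.apply(0, 0L, "orders", 5L);
        sink.apply(0, 1L, "orders", 7L);
        sink.apply(0, 1L, "orders", 7L); // duplicate delivery, skipped
        System.out.println(sink.get("orders"));
    }
}
```

The approach relies on offsets within a partition being delivered in increasing order, which is why the last applied offset is tracked per partition rather than globally.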

Thank you for joining us. Follow the conversation.

https://ignite.apache.org

Thank You!!!

@denismagda #apacheignite #gridgain