Real-time Dashboards for Apache Apex Apps

Sasha Parfenov
Jan 30, 2017

Upload: datatorrent

Post on 08-Feb-2017



Agenda

● Apache Apex Core & Malhar
● DataTorrent RTS Platform
● Apache Apex Applications
● DataTorrent RTS Console Demo
● Real-time Dashboards and Widgets
● Exporting and Packaging Dashboards
● Q&A

Apache Apex Core

✓ Platform and Runtime Engine - Enables development of scalable and fault-tolerant distributed applications for processing streaming and batch data

✓ Highly Scalable - Scales linearly to billions of events per second with statically defined or dynamic partitioning

✓ Highly Performant - Can reach single-digit millisecond end-to-end latency

✓ Fault Tolerant - Automatically recovers from failures without manual intervention

✓ Stateful - Guarantees that no state will be lost

✓ YARN Native - Uses the Hadoop YARN framework for resource negotiation

✓ Developer Friendly - Exposes an easy API for developing Operators, which can include any custom business logic written in Java, and provides the Malhar library of many popular operators and application examples

apex.apache.org

Malhar Operator Library


DataTorrent Platform

[Diagram: DataTorrent RTS platform stack. Labels: Solutions for Business; Ingestion & Data Prep; ETL Pipelines; Ease of Use Tools (Real-Time Data Visualization, Management & Monitoring, GUI Application Assembly); Application Templates; Kafka-to-HDFS; JDBC-to-HDFS; HDFS-to-HDFS; S3-to-HDFS; FileSync; Apex-Malhar Operator Library; Core; High-level API; Transformation; ML & Score; SQL; Analytics; Dev Framework; Batch Support; Apache Apex Core; Big Data Infrastructure (Hadoop 2.x, YARN + HDFS, On Prem & Cloud)]

Apex + RTS Use Cases

[Diagram labels: Data Sources; Streaming Computation (Op1, Op2, Op3, Op4); Hadoop (YARN + HDFS); Real-time Analytics & Visualizations; Actions & Insights; Data Targets]

Apex Application Development

A stream is a sequence of data tuples

An operator takes one or more input streams, performs computations, and emits one or more output streams

● Each operator is your custom business logic in Java, or a built-in operator from our open source library

● An operator can have many instances that run in parallel, and each instance is single-threaded

An application DAG is made up of connected operators and streams
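
The stream, operator, and DAG terms above can be illustrated with a minimal sketch. It is plain Python and does not use the Apex Java API. The names line_splitter, lower_caser, and run_dag are hypothetical and exist only for this illustration. The sketch covers only the simplest case, a linear chain in which each operator's output stream feeds the next operator's input.

```python
from typing import Callable, Iterable, Iterator, List

# Illustration only: an "operator" is modeled as a function that consumes
# one input stream (an iterable of tuples) and yields one output stream.
Operator = Callable[[Iterable], Iterator]


def line_splitter(lines: Iterable[str]) -> Iterator[str]:
    """Emit one word tuple for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word


def lower_caser(words: Iterable[str]) -> Iterator[str]:
    """Stand-in for custom business logic: normalize each tuple."""
    for word in words:
        yield word.lower()


def run_dag(source: Iterable, operators: List[Operator]) -> list:
    """Connect the operators in order and drain the final output stream."""
    stream = source
    for op in operators:
        stream = op(stream)
    return list(stream)


result = run_dag(["Apex Malhar", "Real-time Dashboards"],
                 [line_splitter, lower_caser])
print(result)  # ['apex', 'malhar', 'real-time', 'dashboards']
```

A real application DAG can branch and merge, and its operators run as separate processes on the cluster rather than as chained generators.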


Apex Application Instances

Apex application running on a Hadoop cluster

YARN for resource management

HDFS for state store

Partitioning and Logical vs Physical DAGs

Logical DAG: 0 → 1 → 2

Physical DAG with operator 1 with 3 partitions: 0 → (1, 1, 1) → Unifier → 2
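
To show how a physical DAG with three partitions and a unifier can give the same result as the logical DAG, here is a small plain-Python sketch. It does not use the Apex API. Round-robin routing is an assumption made to keep the example deterministic, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def partition(tuples: Iterable[str], n: int) -> List[List[str]]:
    """Operator 0 side: route each tuple to one of n partitions of operator 1.

    Round-robin routing is an assumption made for this sketch.
    """
    parts: List[List[str]] = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts


def count_words(part: Iterable[str]) -> Counter:
    """One partition of operator 1: counts only the tuples routed to it."""
    return Counter(part)


def unify(partials: Iterable[Counter]) -> Counter:
    """Unifier: merge the partial counts before operator 2 receives them."""
    total: Counter = Counter()
    for p in partials:
        total.update(p)
    return total


words = ["a", "b", "a", "c", "b", "a"]
partials = [count_words(p) for p in partition(words, 3)]
merged = unify(partials)

# The unified result matches what a single, unpartitioned operator 1 computes.
assert merged == Counter(words)
```

The design point is that each partition sees only part of the stream, so a merge step is needed wherever downstream logic expects a result over the whole stream.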


DataTorrent RTS Console Demo

● AppHub
  ○ Demos
  ○ App Templates

● Develop
  ○ Manage App Packages
  ○ Configure & Launch Apps
  ○ Graphical Application Design

● Monitor
  ○ Stats & Metrics
  ○ Events & Logs
  ○ Visualize DAGs
  ○ Record Tuples

● Visualize
  ○ Dashboards
  ○ Widgets

● Configure
  ○ System Settings
  ○ Security & User Management
  ○ Alerts

DataTorrent RTS Console Demo Video

Exporting and Packaging Dashboards

1. Create and download dashboard from UI Console

2. Copy dashboards to Apex app project folder under

<Apex App>/src/main/resources/resources/ui/dashboards/

- TwitterDemo.dtdashboard

- SalesDemo.dtdashboard

3. Create ui.json in Apex app project folder under

<Apex App>/src/main/resources/resources/ui/ui.json

{
  "dashboards": [
    { "file": "TwitterDemo.dtdashboard" },
    {
      "name": "Sales Dimensions Demo",
      "file": "SalesDemo.dtdashboard",
      "appNames": ["SalesDemo-Sasha", "SalesDemo"]
    }
  ]
}

// "appNames" is used to auto-associate packaged dashboards with running apps

4. Compile Apex app project and verify .apa package has

myApp.apa
  + resources/
    + ui/
      - ui.json
      + dashboards/
        - TwitterDemo.dtdashboard
        - SalesDemo.dtdashboard
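
The ui.json and package verification steps can be scripted. The following is a sketch rather than a DataTorrent tool. It assumes the .apa package can be read as a zip archive and that the entries sit under resources/ui/ as listed in the packaging steps. The helper names (build_ui_json, expected_entries, missing_from_package) are hypothetical, and the demo builds a throwaway archive with placeholder file contents instead of a real package.

```python
import json
import tempfile
import zipfile
from pathlib import Path
from typing import Dict, List

UI_DIR = "resources/ui"  # layout inside the package, per the packaging steps


def build_ui_json(dashboards: List[Dict]) -> str:
    """Render a ui.json body; every entry must name its dashboard file."""
    for entry in dashboards:
        if "file" not in entry:
            raise ValueError(f"dashboard entry missing 'file': {entry}")
    return json.dumps({"dashboards": dashboards}, indent=2)


def expected_entries(ui_json_text: str) -> List[str]:
    """Archive paths the package should contain for this ui.json."""
    files = [d["file"] for d in json.loads(ui_json_text)["dashboards"]]
    return [f"{UI_DIR}/ui.json"] + [f"{UI_DIR}/dashboards/{f}" for f in files]


def missing_from_package(apa_path: Path, ui_json_text: str) -> List[str]:
    """Return expected entries that are absent (assumes a zip-format .apa)."""
    with zipfile.ZipFile(apa_path) as apa:
        present = set(apa.namelist())
    return [p for p in expected_entries(ui_json_text) if p not in present]


ui_text = build_ui_json([
    {"file": "TwitterDemo.dtdashboard"},
    {"name": "Sales Dimensions Demo", "file": "SalesDemo.dtdashboard",
     "appNames": ["SalesDemo-Sasha", "SalesDemo"]},
])

# Demo: a stand-in archive that deliberately omits one dashboard.
with tempfile.TemporaryDirectory() as tmp:
    apa = Path(tmp) / "myApp.apa"
    with zipfile.ZipFile(apa, "w") as z:
        z.writestr(f"{UI_DIR}/ui.json", ui_text)
        z.writestr(f"{UI_DIR}/dashboards/TwitterDemo.dtdashboard", "{}")  # placeholder
    missing = missing_from_package(apa, ui_text)

print(missing)  # ['resources/ui/dashboards/SalesDemo.dtdashboard']
```

Running a check like this against the compiled package before upload would catch a dashboard that is listed in ui.json but was never copied into the project folder.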

Resources

• Apache Apex - http://apex.apache.org/

• Subscribe to forums
  ᵒ Apex - http://apex.apache.org/community.html
  ᵒ DataTorrent - https://groups.google.com/forum/#!forum/dt-users

• Download - https://datatorrent.com/download/

• Twitter
  ᵒ @ApacheApex; Follow - https://twitter.com/apacheapex
  ᵒ @DataTorrent; Follow - https://twitter.com/datatorrent

• Meetups - http://meetup.com/topics/apache-apex

• Webinars - https://datatorrent.com/webinars/

• Videos - https://youtube.com/user/DataTorrent

• Slides - http://slideshare.net/DataTorrent/presentations

• Startup Accelerator Program - Full featured enterprise product
  ᵒ https://datatorrent.com/product/startup-accelerator/

• Big Data Application Templates Hub - https://datatorrent.com/apphub

We Are Hiring!

• [email protected]

• Developers/Architects

• QA Automation Developers

• Information Developers

• Build and Release

• Community Leaders

www.ApexBigData.com