Business Implications of the Kafka to HDFS App
TRANSCRIPT
Agenda
● Application Template: Kafka => HDFS
● Brief on Kafka
● Use Case 1: Real Time Ad Performance
● Use Case 2: Financial Data Fabrication
● Use Case 3: Real Time Sensor Data
● Why DataTorrent RTS?
● Questions?
Application Template: Kafka => HDFS
● Reading from Kafka Messaging Queue
● Writing to HDFS
[Diagram: Kafka => HDFS]
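The read-then-write flow of the template can be illustrated with a minimal sketch. It is not the template's implementation: the message list stands in for records polled from Kafka, a local temporary directory stands in for HDFS, and the function name `write_rolling`, the `part-NNNNN.txt` naming, and the `max_bytes` roll-over limit are all assumptions made for illustration.

```python
import os
import tempfile


def write_rolling(messages, out_dir, max_bytes=64):
    """Write each message as one line, rolling to a new part file once the
    current file would grow past max_bytes. Returns the part file names."""
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    current = None
    written = 0
    for msg in messages:
        line = msg.encode("utf-8") + b"\n"
        # Roll over only if the current file already holds data,
        # so an oversized message never leaves an empty file behind.
        if current is None or (written > 0 and written + len(line) > max_bytes):
            if current is not None:
                current.close()
            name = f"part-{len(parts):05d}.txt"  # hypothetical naming scheme
            current = open(os.path.join(out_dir, name), "wb")
            parts.append(name)
            written = 0
        current.write(line)
        written += len(line)
    if current is not None:
        current.close()
    return parts


# Hypothetical stand-in for messages polled from a Kafka topic.
incoming = [f"event-{i}" for i in range(10)]
# Local directory used here in place of an HDFS path.
out_dir = tempfile.mkdtemp()
part_files = write_rolling(incoming, out_dir, max_bytes=32)
```

Rolling by size keeps individual output files bounded; a real ingestion job would also need to handle restarts and partial files, which this sketch leaves out.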
Brief about Kafka
● Distributed Messaging System
● Fast Reads and Writes
● Can handle a large number of clients
● Scalable, fault-tolerant, partitionable
● Persistent messages
Brief about Kafka (contd.)
● Terminologies
  ○ Topic
  ○ Producer
  ○ Consumer
  ○ Broker
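How the four terms relate can be shown with a toy in-memory model. This is only a sketch: the class and method names below are invented for illustration and are not the Kafka client API, and real Kafka topics are additionally split into partitions spread across several brokers.

```python
from collections import defaultdict


class Broker:
    """Toy broker: keeps an append-only message log per topic."""

    def __init__(self):
        self.logs = defaultdict(list)

    def append(self, topic, message):
        self.logs[topic].append(message)
        return len(self.logs[topic]) - 1  # offset of the stored message

    def read(self, topic, offset, max_messages=10):
        return self.logs[topic][offset:offset + max_messages]


class Producer:
    """Publishes messages to a topic on the broker."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, message):
        return self.broker.append(topic, message)


class Consumer:
    """Reads a topic sequentially, tracking its own offset."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.broker.read(self.topic, self.offset, max_messages)
        self.offset += len(batch)
        return batch


broker = Broker()
producer = Producer(broker)
producer.send("ad-events", "click")       # hypothetical topic and payloads
producer.send("ad-events", "impression")

consumer = Consumer(broker, "ad-events")
first = consumer.poll()    # both messages, in publish order
second = consumer.poll()   # nothing new since the last poll

# Messages persist in the log, so a later consumer can read from the start.
late_consumer = Consumer(broker, "ad-events")
replayed = late_consumer.poll()
```

Because each consumer tracks its own offset, several applications can read the same topic independently without removing messages for one another.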
Use Case 1: Real Time Ad Performance
[Diagram labels: Ad Servers (AWS – Region 1), Ad Servers (AWS – Region 2), Ad Servers (AWS – Region n), Kafka Producers, Kafka Brokers, In-Memory Computation, Persistent, Real Time Dashboarding, Ad Placement]
Ad server log events consumed from Kafka. Real Time Dimension Computation. Frontend integration through Kafka based query protocol for real time dashboard components.
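The dimension computation step can be sketched as grouping ad events by a chosen set of dimensions and summing their metrics. The event fields (`advertiser`, `region`, `impressions`, `clicks`) and the function name are assumptions for illustration; the slides do not specify the event schema or the computation used.

```python
from collections import defaultdict


def aggregate_by_dimensions(events, dimensions):
    """Sum impressions and clicks for each combination of dimension values."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for event in events:
        key = tuple(event[d] for d in dimensions)
        totals[key]["impressions"] += event.get("impressions", 0)
        totals[key]["clicks"] += event.get("clicks", 0)
    return dict(totals)


# Hypothetical ad server log events, as they might arrive from Kafka.
events = [
    {"advertiser": "a1", "region": "us-east", "impressions": 1, "clicks": 0},
    {"advertiser": "a1", "region": "us-east", "impressions": 1, "clicks": 1},
    {"advertiser": "a2", "region": "eu-west", "impressions": 1, "clicks": 0},
]

by_advertiser_region = aggregate_by_dimensions(events, ("advertiser", "region"))
by_region = aggregate_by_dimensions(events, ("region",))
```

A dashboard query for a given dimension combination then becomes a lookup in the corresponding aggregate rather than a scan over raw events.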
Use Case 2: Financial Data Fabrication
[Diagram labels: Financial Data, SMTP Logs, Historical, Kafka Producers, Kafka Brokers, Encrypt, Compliance, Alert on error, Archive, Persistent, Application 1, Application n]
Secure, fault tolerant data ingestion, formatting & archiving. Data access layer for application processing.
Use Case 3: Real Time Sensor Data
[Diagram labels: Sensor 1, Sensor 2, Sensor N, Kafka Producers, Kafka Brokers, Data Governance, Complex Event Process, Predictive Maintenance, Persistent, Application 1, Application n]
High performance, multi-customer secure data ingestion. Complex event processing with historical data for predictive maintenance.
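One simple way to combine live readings with historical data is to flag readings that deviate strongly from a historical baseline. The sketch below uses a plain mean and standard deviation check; this is a stand-in technique chosen for illustration, not the method used in the deck, and the function name, threshold, and sample values are assumptions.

```python
from statistics import mean, pstdev


def flag_anomalies(history, live, threshold=3.0):
    """Return live readings more than `threshold` standard deviations
    away from the mean of the historical readings."""
    baseline = mean(history)
    spread = pstdev(history)
    flagged = []
    for reading in live:
        # With zero spread there is no meaningful scale, so nothing is flagged.
        if spread > 0 and abs(reading - baseline) > threshold * spread:
            flagged.append(reading)
    return flagged


# Hypothetical temperature readings: archived history and a live batch.
history = [70.0, 71.0, 69.0, 70.0, 70.0]
live = [70.5, 69.5, 80.0]
alerts = flag_anomalies(history, live)
```

In a streaming setting the baseline would typically be refreshed from the persisted history as new data is archived.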
Why DataTorrent RTS?
● Powered by Apache Apex
● In-memory Processing
● Reusable Malhar components
● Built-in fault tolerance and scalability
● Ease of development
● Reduced Time to Production
Resources
• Apache Apex - http://apex.apache.org/
• Subscribe - http://apex.apache.org/community.html
• Download - https://www.datatorrent.com/download/
• Twitter
  ᵒ @ApacheApex; Follow - https://twitter.com/apacheapex
  ᵒ @DataTorrent; Follow - https://twitter.com/datatorrent
• Meetups - http://www.meetup.com/topics/apache-apex
• Webinars - https://www.datatorrent.com/webinars/
• Videos - https://www.youtube.com/user/DataTorrent
• Slides - http://www.slideshare.net/DataTorrent/presentations
• Startup Accelerator Program - Full featured enterprise product
  ᵒ https://www.datatorrent.com/product/startup-accelerator/
We Are Hiring
• [email protected]
• Developers/Architects
• QA Automation Developers
• Information Developers
• Build and Release
• Community Leaders
Upcoming events...
Apache Apex Meetup
• Wednesday, November 9, 2016 at 7:30pm IST – Deep Dive of Kafka to HDFS/Hadoop Ingestion App Template