big data hadoop apex app for device to mobile, gps tracking with datatorrent

Post on 21-Jan-2018

Category:

Technology

Big Data Hadoop Apex App for device to mobile, GPS tracking @DataTorrent

Venkatesh Kottapalli. Software Engineer

Vikram Patil. Software Engineer

Agenda:

● Introduction to Apex

● Use cases for GPS Tracking

● General requirements for GPS Tracking App

● Application Architecture using Apache Apex

● Further App Details

● Resources

Apache Apex - Stream Processing

Easily Operable - Exposes an easy API for developing Operators (part of an application) and Applications

Highly Scalable - Scales statically as well as dynamically

Highly Performant - Can reach single digit millisecond end-to-end latency

Fault Tolerant - Automatically recovers from failures without manual intervention

Stateful - Guarantees that no state will be lost

Apex Malhar library

YARN - Native - Uses Hadoop YARN framework for resource negotiation

Apache Apex Platform Overview

An Apex Application is a DAG (Directed Acyclic Graph).

A DAG is composed of vertices (Operators) and edges (Streams).

A Stream is a sequence of data tuples that connects operators at end-points called Ports.

An Operator takes one or more input streams, performs computations, and emits one or more output streams.

● Each operator is either the user's business logic or a built-in operator from our open source library

● Operator may have multiple instances that run in parallel
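
The operator-and-stream model described above can be illustrated with a toy sketch. This is plain Python and not the Apex API (Apex applications are written in Java); the `ToyDag` class, the operator names, and the `id,lat,lon` input format are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyDag:
    """A toy stand-in for a streaming DAG; NOT the Apex API."""

    def __init__(self):
        # Operator name -> function taking one tuple, returning a list of output tuples.
        self.operators = {}
        # Upstream operator name -> list of downstream operator names (the streams).
        self.streams = defaultdict(list)

    def add_operator(self, name, fn):
        self.operators[name] = fn

    def add_stream(self, source, sink):
        self.streams[source].append(sink)

    def emit(self, name, data):
        """Run one tuple through operator `name` and push its output downstream."""
        results = []
        for out in self.operators[name](data):
            downstream = self.streams[name]
            if not downstream:
                results.append(out)  # leaf operator: collect its output
            for sink in downstream:
                results.extend(self.emit(sink, out))
        return results


def build_example():
    """Wire three hypothetical operators: parse -> filter -> format."""
    dag = ToyDag()
    # Parse "id,lat,lon" text into a dict (assumed input format).
    dag.add_operator(
        "parse",
        lambda line: [dict(zip(("id", "lat", "lon"), line.split(",")))],
    )
    # Drop readings that have no device id.
    dag.add_operator("filter", lambda rec: [rec] if rec["id"] else [])
    # Render the reading as a short string.
    dag.add_operator("format", lambda rec: [f"{rec['id']}@{rec['lat']},{rec['lon']}"])
    dag.add_stream("parse", "filter")
    dag.add_stream("filter", "format")
    return dag
```

Calling `build_example().emit("parse", "bus-7,18.52,73.85")` pushes one tuple through all three operators. A real engine would also run multiple instances of an operator in parallel, which this single-threaded sketch does not attempt.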

Apex - Native Hadoop Integration

• YARN is the resource manager

• HDFS used for storing any persistent state

Use cases:

● Track fleet vehicles in transit to monitor route safety and detect fraud.

● Bus tracking for government and private transportation, to adjust routes dynamically according to traffic conditions.

● Track wild animals using GPS-enabled collars or devices

● Track inventory of items including cars, refrigerators, expensive retail goods, etc.

● Location-based transportation apps, e.g. Uber, Lyft

● Location-based gaming apps, e.g. Pokémon Go

● Location-based utility apps, e.g. Find My Friends

General Requirements:

● Accept data from millions of devices through TCP sockets or over the MQTT protocol.

● Once data is ingested, it needs to be processed in real time to identify trends or events.

● Based on event priority, the customer needs to be informed about the event, and historical data needs to be stored for analysis or further review.
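
The last requirement (notify on high-priority events, archive everything) can be sketched as a small router. The `Event` fields, the numeric priority scale, and the `notify_threshold` value are hypothetical; the slides do not say how priority is represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    device_id: str
    kind: str
    priority: int  # hypothetical scale: higher means more urgent


@dataclass
class Router:
    notify_threshold: int = 5  # assumed cut-off for informing the customer
    notifications: List[Event] = field(default_factory=list)
    archive: List[Event] = field(default_factory=list)

    def route(self, event: Event) -> None:
        # Every event is kept as historical data for later analysis.
        self.archive.append(event)
        # Only sufficiently urgent events are surfaced to the customer.
        if event.priority >= self.notify_threshold:
            self.notifications.append(event)
```

In a deployed pipeline the two lists would presumably be replaced by a push channel and a durable store; in-memory lists are used here only to keep the sketch self-contained.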

Overall Application Architecture:

App Details:

● HTTP REST API support

● WebSocket support for clients to receive real-time updates from the app.

● Receive device data from millions of devices over TCP sockets at a configured time interval.

[Device data = location and device identification + (temperature / pressure / battery status, etc.)]

● Device data parsing + processing to make it actionable in real time.
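
As a rough illustration of the parsing step, the sketch below turns one device record into a structured reading. The wire format `device_id,lat,lon,key=value,...` and the function name `parse_reading` are assumptions; the slides do not specify the actual message layout.

```python
def parse_reading(line):
    """Parse one record in the assumed format 'device_id,lat,lon,key=value,...'."""
    parts = line.strip().split(",")
    if len(parts) < 3:
        raise ValueError(f"expected at least id,lat,lon: {line!r}")

    device_id = parts[0]
    if not device_id:
        raise ValueError("missing device id")

    lat, lon = float(parts[1]), float(parts[2])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")

    # Optional sensor values such as temperature, pressure, or battery status.
    extras = {}
    for item in parts[3:]:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed sensor field: {item!r}")
        extras[key] = float(value)

    return {"device_id": device_id, "lat": lat, "lon": lon, "extras": extras}
```

For example, `parse_reading("truck-42,18.5204,73.8567,temperature=4.5,battery=87")` yields a dict with the identification, the location, and the two sensor values. Rejecting malformed records early keeps downstream processing simple.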

GPS Data Processing App

Websocket App

Http Server App

Communication between apps

● Any config updates by the end user are received by the HTTP load receiver and published onto a Kafka topic. The GPS tracking app consumes that topic and updates its configuration in memory in real time.
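
The config-update flow can be sketched as follows. A standard-library `queue.Queue` stands in for the Kafka topic so the example runs without a broker; the JSON message shape and the keys `geofence_radius_m` and `report_interval_s` are hypothetical.

```python
import json
import queue


class ConfigHolder:
    """Holds the processing app's configuration in memory."""

    def __init__(self, initial):
        self._config = dict(initial)

    def get(self, key):
        return self._config[key]

    def apply_pending(self, topic):
        """Drain pending update messages from `topic` and merge them in.

        `topic` is a queue.Queue standing in for a Kafka topic; each message
        is assumed to be a JSON object of changed keys. Returns the number of
        messages applied.
        """
        applied = 0
        while True:
            try:
                message = topic.get_nowait()
            except queue.Empty:
                return applied
            self._config.update(json.loads(message))
            applied += 1
```

Here the HTTP side would publish something like `json.dumps({"geofence_radius_m": 250})`, and the processing side would call `apply_pending` between batches so that later tuples see the new value without a restart.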

Data Persistence

● Cassandra Output Operator

● Cassandra Input Operator

● Event Archival

Resources

●http://apex.apache.org/

●Learn more: http://apex.apache.org/docs.html

●Subscribe - http://apex.apache.org/community.html

●Download - http://apex.apache.org/downloads.html

●Follow @ApacheApex - https://twitter.com/apacheapex

●Meetups – http://www.meetup.com/pro/apacheapex/

●More examples: https://github.com/DataTorrent/examples

●Slideshare: http://www.slideshare.net/ApacheApex/presentations

●https://www.youtube.com/results?search_query=apache+apex

●Free Enterprise License for Startups - https://www.datatorrent.com/product/startup-accelerator/

Q&A

Thank you
