parkmate - real time parking spot recommender

ParkMate - Real-time parking spot recommender Suhas, Insight Data Engineering - Sep 2015

Upload: suhashm

Post on 12-Feb-2017



Motivation

Making it smooth and easy to find a parking spot by using real-time parking sensor data

SF Parking Data

Data for a total of 952 parking spots (15 garages and 937 street-parking spots) is ingested every 2 seconds.

Data throughput ~15 GB/day

Can be extended to handle huge loads for multiple cities
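As a rough sanity check on these figures, the sketch below works out how many readings per day the stated rate implies and the average payload per reading. It assumes one reading per spot per 2-second cycle and decimal gigabytes (1 GB = 10^9 bytes); neither assumption is stated on the slide.

```python
# Figures taken from the slide.
SPOTS = 952                  # 15 garages + 937 street-parking spots
INTERVAL_S = 2               # ingestion interval in seconds
THROUGHPUT_GB_PER_DAY = 15   # approximate daily volume

# Assumption: every spot reports once per ingestion cycle.
cycles_per_day = 24 * 60 * 60 // INTERVAL_S
readings_per_day = SPOTS * cycles_per_day

# Assumption: decimal gigabytes (10**9 bytes).
bytes_per_reading = THROUGHPUT_GB_PER_DAY * 10**9 / readings_per_day

print(f"cycles per day:    {cycles_per_day:,}")
print(f"readings per day:  {readings_per_day:,}")
print(f"bytes per reading: {bytes_per_reading:.0f}")
```

Under those assumptions the feed is about 41 million readings per day at a few hundred bytes each, which is consistent with a small JSON record per spot.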

Cluster Setup - EC2 - m4.large - 4 instances

● Hadoop - 1 Namenode and 3 Worker nodes

● Spark - 1 Master and 3 Slaves

● Kafka - 4 brokers

● Cassandra - 4 Nodes

● Elasticsearch - 4 Nodes

● Zookeeper - 4 Nodes

Pipeline

SF Park

Firebase

Real time Ingestion

Storage for Batch

Batch Processing

Stream Processing

Time series aggregate

Analytics Dashboard

Geo-Spatial Query

User GPS
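The geo-spatial query stage takes the user's GPS position and returns nearby spots with free capacity. The deck does not show the query itself, so the following is only a minimal in-memory sketch of that idea using the haversine distance; it is a stand-in for illustration, not the Elasticsearch query used in the project. The field names (`id`, `lat`, `lon`, `available`), the sample coordinates, and the 500 m radius are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_available(spots, user_lat, user_lon, radius_m=500, limit=3):
    """Return ids of spots with free capacity within radius_m, nearest first."""
    candidates = []
    for spot in spots:
        if spot["available"] <= 0:
            continue  # skip full spots
        dist = haversine_m(user_lat, user_lon, spot["lat"], spot["lon"])
        if dist <= radius_m:
            candidates.append((dist, spot["id"]))
    candidates.sort()
    return [spot_id for _, spot_id in candidates[:limit]]


# Hypothetical sample data around downtown San Francisco.
SAMPLE_SPOTS = [
    {"id": "A", "lat": 37.7750, "lon": -122.4195, "available": 2},
    {"id": "B", "lat": 37.7760, "lon": -122.4194, "available": 0},
    {"id": "C", "lat": 37.7800, "lon": -122.4194, "available": 5},
    {"id": "D", "lat": 37.7755, "lon": -122.4194, "available": 1},
]

print(nearest_available(SAMPLE_SPOTS, 37.7749, -122.4194))
```

A linear scan is fine for 952 spots; a geo-indexed store becomes worthwhile once the same lookup has to cover multiple cities.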

Challenges

● Partitioning Spark RDDs for distributed computing.

● Writing data from Kafka to HDFS - Camus vs. custom script.

● Elasticsearch partial document update.

● Spark to Cassandra - PySpark-Cassandra driver.
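To make the Elasticsearch partial-update challenge concrete, here is a hedged sketch of how a partial document update can be expressed for the bulk endpoint: each update is an action line followed by a `doc` line that carries only the changed fields. The index name `parking`, the document ids, and the `available` field are hypothetical, and the deck does not say whether the project used the bulk API or single-document updates. Older Elasticsearch versions also expect a `_type` in the action line.

```python
import json


def bulk_partial_update(index, updates):
    """Build a newline-delimited bulk body of partial document updates.

    `updates` maps a document id to the fields that changed. Only those
    fields are sent; the rest of the stored document is left as-is.
    """
    lines = []
    for doc_id, fields in updates.items():
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Payload line: the partial document; create it if it is missing.
        lines.append(json.dumps({"doc": fields, "doc_as_upsert": True}))
    # The bulk endpoint requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_partial_update(
    "parking",  # hypothetical index name
    {"spot-1": {"available": 3}, "spot-2": {"available": 0}},
)
print(body)
```

Sending only the changed field keeps each 2-second refresh small, since static attributes such as location do not need to be re-indexed from the client side.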

About me

Suhas - CS Grad @UIUC

www.thinkjs.io

Background:

Full Stack Web Development

Passionate about learning big-data technologies.

Future plan: contribute to open source.

Hobbies: Long drives

Thank You