Real-time Big Data at FPT (for Techcamp University)

Post on 23-Jan-2018


Category: Data & Analytics


Real-time Big Data at FPT

and some key ideas for building a real-time big data platform from open source tools
○ Apache Spark
○ Reactive Function X (RFX)

Presented by @tantrieuf31
http://nguyentantrieu.info

About me
● Full Stack Engineer and Tech Lead at AdsPlay, a startup project from FPT Telecom
● Founder at RFXLab.com, building the RFX framework and Fast Data Intelligence Platform for Data-driven Organization
● Tech Blogger at http://engineering.adsplay.net

Abstract

1. Just 5 minutes about the history of “Big Data”
2. Does Big Data solve big problems?
3. Overview of open source tools
   a. Netty (Event Collector)
   b. Kafka (Event Queue)
   c. RFX-Stream (Event Processor)
   d. Apache Spark (Big Data processing engine)
   e. RFX-Iris (Fast Data Query Interface)

5 minutes about the history of “Big Data”

Imagine: what if you had to build a GREAT pyramid?

In fact, Big Data was born 3000 years ago. When you have to build something great, you face decisions that involve lots of data.

How? Decisions without data?

OK, let’s get back to 2015

What if the business is not driven by data?
Refer: http://www.nytimes.com/2011/04/24/business/24unboxed.html

Since 2015, Fast Data, a new trend, has been replacing Big Data

http://www.tibco.com/blog/2015/03/27/how-analytics-facilitates-fast-data

Data Management Technology and Trends

Timeline: 1970s, 1990s, 2000s, 2010s

● Netty.io
● Apache Storm
● Apache Kafka
● Apache Spark
● RFX
● ...

● Hadoop Ecosystem
● NoSQL Ecosystem
● ...

● Oracle
● MySQL
● PostgreSQL
● ...

“Does Big Data solve our big problems?”

Tracking all access logs and users’ activities

Processing in real time (seconds)!

Storing multiple types of logs (video, web, mobile, like, comment, play, …)

http://www.rfxlab.com

Boosting sales revenue / profit

Log events

Reactive events

How is Big Data used at FPT?

Do Vietnamese people love football? The correlation says YES

Analyzing trending events in real time!

Visualizing all users’ devices

Real-time Big Data Architecture
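The stage names in this talk's outline (event collector, event queue, event processor, query interface) suggest a simple data flow. Below is a minimal Python sketch of that flow, assuming in-process stand-ins rather than the real tools: `queue.Queue` plays the part that Kafka plays, and `collect`, `run_pipeline`, and `query` are hypothetical names, not Netty, Kafka, or RFX APIs. The sample events are made up.

```python
import json
import queue
from collections import Counter


def collect(raw_line):
    """Event collector stand-in (the role Netty plays): parse one raw log line."""
    return json.loads(raw_line)


def run_pipeline(raw_lines):
    """Push events through a queue, then aggregate them by event type."""
    event_queue = queue.Queue()  # event queue stand-in (the role Kafka plays)
    for line in raw_lines:
        event_queue.put(collect(line))

    counts = Counter()  # event processor stand-in (the role RFX-Stream plays)
    while not event_queue.empty():
        event = event_queue.get()
        counts[event["type"]] += 1
    return counts


def query(counts, event_type):
    """Query interface stand-in (the role RFX-Iris plays)."""
    return counts.get(event_type, 0)


# Made-up sample events, using log types named in the slides.
raw = [
    '{"type": "play", "user": "u1"}',
    '{"type": "like", "user": "u2"}',
    '{"type": "play", "user": "u3"}',
]
counts = run_pipeline(raw)
print(query(counts, "play"))  # 2
```

In a real deployment each stage would run as its own process, so that a slow processor does not block the collector; the queue between them is what makes that decoupling possible.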

“How to build a ‘just works’ real-time big data system?”

KEY IDEA is “divide and conquer”
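One way to read “divide and conquer” for event data is: split the events into partitions, aggregate each partition on its own, then merge the partial results. The sketch below illustrates that reading in plain Python; the function names and the sample events are hypothetical, and the partitions are processed one after another here, whereas a real platform would spread them across workers.

```python
from collections import Counter


def split(events, parts):
    """Divide: deal the events round-robin into `parts` smaller lists."""
    return [events[i::parts] for i in range(parts)]


def count_partition(partition):
    """Conquer: count event types inside a single partition."""
    return Counter(partition)


def merge(partials):
    """Combine: add the per-partition counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up event types, taken from the log types named in the slides.
events = ["play", "like", "play", "comment", "play", "like"]
partials = [count_partition(p) for p in split(events, 3)]
total = merge(partials)
print(total.most_common(1))  # [('play', 3)]
```

Because counting is associative, the merged total is the same no matter how the events are partitioned, which is what makes this pattern safe to scale out.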

User Story in plain English

1. Hercules is thinking about some questions, e.g. what are the hot songs of Nhacso on Facebook?
2. He decides to ask Iris this question.
3. Iris analyzes the question into “query messages” and delivers them to Zeus.
4. Zeus uses his power of “large-scale data processing” to answer the question.
5. Done: Zeus returns the result, “hot songs on Facebook”, to Iris.
6. She sends the result to Hercules.
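The user story can be sketched as two small Python classes. This is only an illustration under stated assumptions: the `Iris` and `Zeus` classes, the message fields (`metric`, `network`, `top`), and the sample share events are all hypothetical, and the sketch sends a single query message where the story speaks of several.

```python
from collections import Counter


class Zeus:
    """Stand-in for the large-scale data processing engine."""

    def __init__(self, share_events):
        self.share_events = share_events  # list of (song, network) pairs

    def answer(self, message):
        # Count shares per song on the requested network, most shared first.
        counts = Counter(
            song
            for song, network in self.share_events
            if network == message["network"]
        )
        return [song for song, _ in counts.most_common(message["top"])]


class Iris:
    """Stand-in for the query interface between the user and Zeus."""

    def __init__(self, zeus):
        self.zeus = zeus

    def ask(self, metric, network, top=2):
        # Turn the question into a query message and deliver it to Zeus.
        message = {"metric": metric, "network": network, "top": top}
        return self.zeus.answer(message)


# Made-up sample data: (song, network where it was shared).
events = [
    ("Song A", "facebook"),
    ("Song B", "facebook"),
    ("Song A", "facebook"),
    ("Song C", "twitter"),
]
iris = Iris(Zeus(events))
print(iris.ask("hot_songs", "facebook"))  # ['Song A', 'Song B']
```

Hercules only ever talks to `Iris`; he does not need to know how `Zeus` stores or processes the data, which keeps the query side and the processing side independently replaceable.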

Visualizing our user story

Question about Big Data: What are the hot songs of NhacSo.net on Facebook?

[Diagram: Zeus, Iris, and Hercules, connected by messages]

Let’s see how it works

Awesome Open Source Projects to follow

RFXLab.com
◎ http://www.rfxlab.com
◎ https://github.com/rfxlab

Kafka: http://kafka.apache.org
Hadoop: http://hadoop.apache.org
Apache Spark: https://spark.apache.org


Native Kafka driver: https://github.com/edenhill/librdkafka/

PHP Kafka driver: https://github.com/EVODelavega/phpkafka

Data Visualization JavaScript Library: https://github.com/nvd3-community/nvd3

Good ref books

"Spend some time alone and learn to develop your personal resources."

Alexander Reid Martin

More info at http://engineering.adsplay.net/jobs-at-adsplay-team
