nosql couchbase lite & bigdata hpcc systems

Mobile Data with Couchbase Lite & Big Data HPCC Systems By Fujio Turner

Upload: fujio-turner

Post on 28-Jan-2018


TRANSCRIPT

Page 1: NoSQL Couchbase Lite & BigData HPCC Systems

Mobile Data with Couchbase Lite & Big Data HPCC Systems

By Fujio Turner

Page 2: NoSQL Couchbase Lite & BigData HPCC Systems

What is Couchbase Lite ?

Page 3: NoSQL Couchbase Lite & BigData HPCC Systems

What is Couchbase Lite ?

NoSQL JSON Document Database for Mobile

Page 4: NoSQL Couchbase Lite & BigData HPCC Systems

+Your Code

Embedded Database

Couchbase Lite 0.5 MB

Page 5: NoSQL Couchbase Lite & BigData HPCC Systems

Why do I need Couchbase Lite ?

Page 6: NoSQL Couchbase Lite & BigData HPCC Systems

Why do I need Couchbase Lite ?

Mobile Myths: The mobile network is:

1. Always Available
2. Always High Performing

Page 7: NoSQL Couchbase Lite & BigData HPCC Systems

How Couchbase Lite tackles the Mobile Myths

Local data is always faster

Page 8: NoSQL Couchbase Lite & BigData HPCC Systems

How Couchbase Lite tackles the Mobile Myths

Local data is always faster, but I need to save the data non-locally

Page 9: NoSQL Couchbase Lite & BigData HPCC Systems

How Couchbase Lite tackles the Mobile Myths

Local data is always faster, but I need to save the data non-locally and/or I need to send data to other mobile devices

Page 10: NoSQL Couchbase Lite & BigData HPCC Systems

EZ Data Syncing with Couchbase Sync Gateway

https://github.com/couchbase/sync_gateway

Page 11: NoSQL Couchbase Lite & BigData HPCC Systems

How Sync Gateway Works

Written in:

Data Flow: Channels

CRUD: http(s):// REST server

• Authentication & Sessions
• Definable channel rules via JavaScript

{"data":"yes"}
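The CRUD path over the REST server can be sketched with a plain HTTP request. This is a minimal sketch, not code from the talk: it assumes Sync Gateway's public REST API is listening on its default port 4984, and the host, the database name `mydb` and the document ID `doc1` are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request

# Hypothetical values: adjust the host and database name for your setup.
SYNC_GATEWAY_URL = "http://localhost:4984"  # 4984 is the default public REST port
DATABASE = "mydb"


def build_put_request(doc_id, doc):
    """Build (but do not send) an HTTP PUT that creates or updates a document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{SYNC_GATEWAY_URL}/{DATABASE}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_put_request("doc1", {"data": "yes"})
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running Sync Gateway; authentication and sessions are left out of the sketch.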

Page 12: NoSQL Couchbase Lite & BigData HPCC Systems

Who is using Couchbase Lite ?

Page 13: NoSQL Couchbase Lite & BigData HPCC Systems

How [logo] Uses Couchbase Lite

https://youtu.be/tYolHnbCavA

Page 14: NoSQL Couchbase Lite & BigData HPCC Systems

What BigData solution is ready for the next 20 plus years ?

Page 15: NoSQL Couchbase Lite & BigData HPCC Systems

What is HPCC Systems? Who is LexisNexis?

LexisNexis is a provider of legal, tax, regulatory, news, business information, and analysis to legal, corporate, government, accounting and academic markets.

LexisNexis has been in business since 1977 with over 30,000 employees worldwide.

LexisNexis Risk is the division of LexisNexis which focuses on data, Big Data processing, linking and vertical expertise, and supports HPCC Systems as an open source project under the Apache 2.0 License.

Page 16: NoSQL Couchbase Lite & BigData HPCC Systems

Comparison

                [logo]                 HPCC Systems
Written in      JAVA                   C++
Data size       Petabytes              Exabytes
In use          Since 2005             Since 2000
Storage         Block Based            File Based
Engines         ? ? ? ? ? ?            Thor, Roxie
Throughput      1-80,000 Jobs/day      Indexed: 2K-3K Jobs/sec*
                                       In-Memory: 30 - 40 Jobs/min*
                                       Non-Indexed: 4-1,040,000 Jobs/day

*based on job (size / result set / complexity)

Page 17: NoSQL Couchbase Lite & BigData HPCC Systems

Architecture

Thor: “I can query all or part of your data.”
• Single Threaded
• Hard Disk
• Index (optional)

Roxie: “I’m sub-second fast.”
• Multi-Threaded
• Hard Disk, SSD (Either/Both)
• Index (optional), In-memory

Page 18: NoSQL Couchbase Lite & BigData HPCC Systems

[Chart: Non-Indexed Full Data Set; labels: Business, Development, Customers; values: 1 and 20]

http://hpccsystems.com/why-hpcc/benchmarks

Page 19: NoSQL Couchbase Lite & BigData HPCC Systems

How is Data Stored on HPCC Systems ?

Example: Customer Data May 2010 (300GB File)

Name    State   Age
Kevin   CA      45
Mark    MI      27
Sara    FL      64

Page 20: NoSQL Couchbase Lite & BigData HPCC Systems

K.. CA 45 M.. MI 27 S.. FL 64

Thor Master

Thor Slaves

Kevin CA 45 Mark MI 27 Sara FL 64

Store Data

File Name ~/customers_2010-05

Data is distributed evenly in the cluster with replica copies and is seen as a file (example below).
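As a rough illustration of even distribution with replica copies, the sketch below spreads rows round-robin over the nodes and copies each node's part to the next node. This is an assumption made for illustration only, not HPCC Systems' actual placement logic; the function name `distribute` and its parameters are hypothetical.

```python
def distribute(rows, node_count, replicas=1):
    """Spread rows round-robin over nodes and copy each part to the next node(s)."""
    primary = [[] for _ in range(node_count)]
    for i, row in enumerate(rows):
        primary[i % node_count].append(row)

    # Each node also stores a replica copy of the part owned by the node(s) before it.
    replica = [[] for _ in range(node_count)]
    for n in range(node_count):
        for r in range(1, replicas + 1):
            replica[(n + r) % node_count].extend(primary[n])
    return primary, replica


rows = [("Kevin", "CA", 45), ("Mark", "MI", 27), ("Sara", "FL", 64)]
primary, replica = distribute(rows, node_count=3)
for node, (own, copy) in enumerate(zip(primary, replica)):
    print(f"node {node}: primary={own} replica={copy}")
```

With three rows on three nodes, each node holds one row of its own plus a copy of its neighbour's row, so losing a single node loses no data.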

Page 21: NoSQL Couchbase Lite & BigData HPCC Systems

K.. CA 45 M.. MI 27 S.. FL 64

Thor Master

Thor Slaves

Kevin CA 45 Mark MI 27 Sara FL 64

Store Data

Dali

File Location & Job Scheduler

File locations are stored on disk.

File Name ~/customers_2010-05

Page 22: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Dali (File Location & Job Scheduler)

What state do most people live in?

ESP

1a. A pre-compiled query is triggered (mostly used in Roxie).
1b. An ad-hoc query is submitted.
2. The query is sent to Dali to get file locations.

Page 23: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Dali (File Location & Job Scheduler)

What state do most people live in?

ESP

3. The job is placed in a queue to be sent to the Thor Master. The Thor Master coordinates job execution on the Thor Slave nodes.

Page 24: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Dali (File Location & Job Scheduler)

What state do most people live in?

ESP

Jobs are run locally on the slaves and/or coordinated globally by the master.

Page 25: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Dali (File Location & Job Scheduler)

What state do most people live in?

ESP

Result: MI 500, CA 120, FL 7

4. The job result is returned, optionally grouped and sorted at run time.
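The walk-through above (work done locally on each slave, then coordinated by the master and returned grouped and sorted) can be sketched in a few lines. This is a plain Python illustration of the idea, not ECL and not how Thor is implemented; the function names are hypothetical and the extra rows beyond Kevin, Mark and Sara are made up for the example.

```python
from collections import Counter


def count_states_on_slave(rows):
    """Local step: each slave counts the states in its own part of the file."""
    return Counter(state for _name, state, _age in rows)


def merge_on_master(partial_counts):
    """Global step: the master adds up the partial counts and sorts them."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return sorted(total.items(), key=lambda item: item[1], reverse=True)


# One list per slave; rows other than Kevin, Mark and Sara are hypothetical.
slave_parts = [
    [("Kevin", "CA", 45), ("Ann", "MI", 31)],
    [("Mark", "MI", 27)],
    [("Sara", "FL", 64), ("Lee", "CA", 52), ("Joe", "MI", 19)],
]

result = merge_on_master(count_states_on_slave(part) for part in slave_parts)
print(result)
```

Only the small per-state counts travel from the slaves to the master, which is why the question can be answered without moving the full data set.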

Page 26: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Dali

What state do most people live in?

ESP

MI 500 CA 120 FL 7

File Location & Job Scheduler

SORT, GROUP, DEDUP, JOIN, MERGE, BETWEEN, LENGTH, REGEX, ROUND, SUM, COUNT, TRIM, WHEN, AVE, CASE, NORMALIZE, DENORMALIZE, K-MEANS, and more…

Multiple other actions can be done on the data in a single job.

Page 27: NoSQL Couchbase Lite & BigData HPCC Systems

Sort

Count

Group

Classification

From 0.27 seconds (Roxie) to a few hours (Thor)

Country = ‘US’

Join

Index of ~/facebook_2013

Query is Completed in a Single Job Asynchronously

~/facebook_2013

Country = ‘US’

~/twitter_2013

optional

Page 28: NoSQL Couchbase Lite & BigData HPCC Systems

K CA 45 M MI 27 S FL 64

Thor Master

Thor Slaves

Kevin CA 45 Mark MI 27 Sara FL 64

CA row #3 MI row #17 MI row #4 FL row #5

Speed - Part 1: Indexing

Index Index Index

• index per file • customize by field(s)

File Name ~/customers_2010-05

File Name ~/customers_2010-05_index
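The index on this slide maps a field value to the row numbers that contain it. A minimal sketch of that idea follows, assuming an in-memory list of rows; the helper names `build_index` and `lookup` are hypothetical and this is not the HPCC Systems index format.

```python
from collections import defaultdict


def build_index(rows, field):
    """Map each value of `field` to the list of row numbers that contain it."""
    index = defaultdict(list)
    for row_number, row in enumerate(rows):
        index[row[field]].append(row_number)
    return dict(index)


def lookup(rows, index, value):
    """Fetch matching rows through the index instead of scanning every row."""
    return [rows[row_number] for row_number in index.get(value, [])]


rows = [
    {"name": "Kevin", "state": "CA", "age": 45},
    {"name": "Mark", "state": "MI", "age": 27},
    {"name": "Sara", "state": "FL", "age": 64},
]

state_index = build_index(rows, "state")  # one index per file, on a chosen field
print(lookup(rows, state_index, "MI"))
```

Building a second index on another field (for example `age`) leaves the data file itself untouched, which matches the "index per file, customize by field(s)" bullets.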

Page 29: NoSQL Couchbase Lite & BigData HPCC Systems

[Chart: Non-Indexed (1 and 40) to Indexed (1 and 200)]

Page 30: NoSQL Couchbase Lite & BigData HPCC Systems

[Chart: Non-Indexed (1 and 40) to Indexed (1 and 200)]

Example Index:
male row #345
female row #4
male row #97
female row #267

Example Index:
CA row #3
MI row #17
MI row #4
FL row #5

Page 31: NoSQL Couchbase Lite & BigData HPCC Systems

Speed - Part 2: Roxie

K CA 45 M MI 27 S FL 64

Roxie Master

Roxie Slaves

Index In-Memory

Index Index Index

Page 32: NoSQL Couchbase Lite & BigData HPCC Systems

Speed - Part 2: Roxie

K CA 45 M MI 27 S FL 64

Roxie Master

Roxie Slaves

Index In-Memory
or
Index In-Memory & Part or All Data

Index Index Index

Page 33: NoSQL Couchbase Lite & BigData HPCC Systems

Speed - Part 2: Roxie

K CA 45 M MI 27 S FL 64

Roxie Master

Roxie Slaves

Roxie is Multi-Threaded

Index In-Memory
or
Index In-Memory & Part or All Data

Index Index Index

Page 34: NoSQL Couchbase Lite & BigData HPCC Systems

Speed - Part 2: Roxie

K CA 45 M MI 27 S FL 64

Roxie Master

Roxie Slaves

Roxie is Multi-Threaded

Index In-Memory
or
Index In-Memory & Part or All Data

Index Index Index

SSDs are OK: write few / read many

Page 35: NoSQL Couchbase Lite & BigData HPCC Systems

Speed - Part 2: Roxie

K CA 45 M MI 27 S FL 64

Roxie Master

Roxie Slaves

Roxie is Multi-Threaded

Index In-Memory
or
Index In-Memory & Part or All Data

Index Index Index

2004

Page 36: NoSQL Couchbase Lite & BigData HPCC Systems

Thor Master

Thor Slaves

Dali ESP

Roxie Master

Roxie Slaves

Common Cluster

Data is a mix of structured and unstructured. Use Thor to do ETL and send results to Roxie for user queries.

Page 37: NoSQL Couchbase Lite & BigData HPCC Systems

HPCC Systems 5.2

New JSON file support

Page 38: NoSQL Couchbase Lite & BigData HPCC Systems

https://github.com/couchbase/sync_gateway/wiki/Webhooks

Flow Data
From: Sync Gateway
To: HPCC Systems

Page 39: NoSQL Couchbase Lite & BigData HPCC Systems

{"data":"yes"}

Sync Gateway’s Webhooks API lets you catch every JSON document coming into Sync Gateway

Page 40: NoSQL Couchbase Lite & BigData HPCC Systems

Couchbase Lite to HPCC Systems Transport

{"data":"yes"}

A simple Python web server that catches every HTTP POST from Sync Gateway and writes it to a file for HPCC Systems to store.
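A web server of the kind described here can be sketched with Python's standard library alone. This is a hedged illustration, not the author's transport code: the function name `make_server`, the default port and the one-JSON-document-per-line file layout are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_server(output_path, host="127.0.0.1", port=8000):
    """Return an HTTP server that appends each POSTed JSON body to output_path."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                doc = json.loads(body)
            except ValueError:
                # Reject anything that is not valid JSON.
                self.send_response(400)
                self.end_headers()
                return
            # Assumed layout: one JSON document per line, appended to the file.
            with open(output_path, "a", encoding="utf-8") as out:
                out.write(json.dumps(doc) + "\n")
            self.send_response(200)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; remove to see request logs

    return HTTPServer((host, port), WebhookHandler)
```

Calling `make_server("sync_gateway_docs.json").serve_forever()` starts the listener; Sync Gateway's webhook would then be pointed at that host and port, and the resulting file handed to HPCC Systems for loading.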

Page 41: NoSQL Couchbase Lite & BigData HPCC Systems

https://github.com/househippo

Couchbase Lite to HPCC Systems Transport

Page 42: NoSQL Couchbase Lite & BigData HPCC Systems

Learning More - Couchbase Lite

INSTALL in 5 Minutes

Download: http://couchbase.com/download

Source Code: https://github.com/couchbase

http://developer.couchbase.com/mobile/get-started/get-started-mobile/index.html

Mountain View, CA; San Francisco, CA

Page 43: NoSQL Couchbase Lite & BigData HPCC Systems

Learning More - HPCC Systems

INSTALL in 5 Minutes

Download: http://hpccsystems.com/download/

or

Source Code: https://github.com/hpcc-systems

https://youtu.be/8SV43DCUqJg

Atlanta, GA; Mountain View, CA