mongodb, event sourcing & spark

Architecting the Internet of Things @blimpyacht

Upload: bryan-reinero

Post on 11-Feb-2017


TRANSCRIPT

Architecting the Internet of Things @blimpyacht


Event Sourcing

Event Sourcing: THE BEST, UBIQUITOUS PATTERN YOU’VE NEVER HEARD OF

User: A, Balance: 427
User: B, Balance: 550

+5 / -5

Event Sourcing

State

+4-2+5+7-4+9-3-6+2+1+1-5

t(12) = 9

Event Sourcing

+4-2+5+7-4+9-3-6+2+1+1-5

t(5) = 10

Event Sourcing

+4-2+5+7-4+9-3-6+2+1+1-5

t(9) = 12

Event Sourcing

+4-2+5+7-4+9-3-6+2+1+1-5
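The state function t(n) on these slides is just a left fold over the event stream. A minimal sketch in plain Java (not from the talk), using the slides' event values:

```java
import java.util.List;

public class EventFold {
    // Event stream from the slides: each entry is a signed delta.
    static final List<Integer> EVENTS =
        List.of(4, -2, 5, 7, -4, 9, -3, -6, 2, 1, 1, -5);

    // t(n): current state is a left fold over the first n events,
    // starting from an initial state of 0.
    static int t(int n) {
        int state = 0;
        for (int i = 0; i < n; i++) {
            state += EVENTS.get(i);
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(t(5));  // 10, matching the slide
        System.out.println(t(9));  // 12
        System.out.println(t(12)); // 9
    }
}
```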

Event Sourcing

a = f'(x)

+4-2+5+7-4+9-3-6+2+1+1-5

Event Sourcing

+5-4+5

Res Gestae Divi Augusti

[Diagram: replica set, one Primary with two Secondaries]

Event Sourcing

+4-2+5+7-4+9-3-6+2+1+1-5

t(4) = 14
t(8) = 10
t(12) = 9

t’(4) = -2

t’(8) = -17

t’(12) = -22
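Replaying from t(0) on every read gets expensive; a stored snapshot lets a replay start mid-stream. A hypothetical sketch using the slides' values, resuming from the t(8) = 10 snapshot:

```java
import java.util.List;

public class SnapshotReplay {
    // Event stream from the slides.
    static final List<Integer> EVENTS =
        List.of(4, -2, 5, 7, -4, 9, -3, -6, 2, 1, 1, -5);

    // Resume from a stored snapshot (the state after the first
    // `snapshotIndex` events) and replay only the events after it.
    static int replayFrom(int snapshotState, int snapshotIndex, int n) {
        int state = snapshotState;
        for (int i = snapshotIndex; i < n; i++) {
            state += EVENTS.get(i);
        }
        return state;
    }

    public static void main(String[] args) {
        // Snapshot at t(8) = 10 (from the slides); replay events 9..12.
        System.out.println(replayFrom(10, 8, 12)); // 9, same as t(12)
    }
}
```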

{
  _id: { ts: 1213084687000, taxi: "abboip" },
  loc: { geometry: [ -122.39488, 37.75134 ] },
  fare: 0,
  ts: ISODate("2008-06-10T07:58:07Z")
}

{
  _id: ObjectId("56a4215a46778b69e08ff8ff"),
  taxi: "abboip",
  start: ISODate("2008-05-17T14:51:10Z"),
  route: {
    type: "LineString",
    coordinates: [
      [ -122.39724, 37.74977 ],
      [ -122.40619, 37.74896 ],
      [ -122.41335, 37.74831 ],
      [ -122.414,   37.75157 ],
      [ -122.41438, 37.75552 ]
    ]
  },
  end: ISODate("2008-05-17T14:55:58Z")
}

Data Modeling: Domain Events vs. Domain Objects
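The two documents above contrast a per-ping domain event with the rolled-up trip domain object. A hypothetical fold from pings to a trip (class names, fields, and sample values are illustrative, not from the talk):

```java
import java.util.ArrayList;
import java.util.List;

public class TripBuilder {
    // A single domain event: one GPS ping from a taxi.
    record Ping(long ts, double lon, double lat) {}

    // The derived domain object: a whole trip with its route.
    record Trip(long start, long end, List<double[]> route) {}

    // Fold an ordered stream of ping events into one trip document.
    static Trip fold(List<Ping> pings) {
        List<double[]> route = new ArrayList<>();
        for (Ping p : pings) {
            route.add(new double[] { p.lon(), p.lat() });
        }
        return new Trip(
            pings.get(0).ts(),
            pings.get(pings.size() - 1).ts(),
            route);
    }

    public static void main(String[] args) {
        List<Ping> pings = List.of(
            new Ping(1211035870L, -122.39724, 37.74977),
            new Ping(1211035990L, -122.40619, 37.74896),
            new Ping(1211036158L, -122.41438, 37.75552));
        Trip t = fold(pings);
        System.out.println(t.route().size()); // 3 route points
    }
}
```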

Accounts Payable
Accounts Receivable
Journal
Ledger
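The accounting stack above maps directly onto event sourcing: the Journal is the append-only event log, and the Ledger is state derived from it. A rough sketch, where the class names and posted amounts are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Accounting {
    record Entry(String account, int amount) {}

    // The Journal: an append-only log of postings (the events).
    static final List<Entry> journal = new ArrayList<>();

    static void post(String account, int amount) {
        journal.add(new Entry(account, amount)); // append, never update
    }

    // The Ledger: per-account balances derived by replaying the Journal.
    static Map<String, Integer> ledger() {
        Map<String, Integer> balances = new HashMap<>();
        for (Entry e : journal) {
            balances.merge(e.account(), e.amount(), Integer::sum);
        }
        return balances;
    }

    public static void main(String[] args) {
        post("Accounts Payable", -40);
        post("Accounts Receivable", 100);
        post("Accounts Payable", -10);
        System.out.println(ledger().get("Accounts Payable"));    // -50
        System.out.println(ledger().get("Accounts Receivable")); // 100
    }
}
```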

Command Query Responsibility Segregation

CQRS

CRUD vs. CQRS

CRUD: a single model serves both WRITE and READ.

CRUD equals CQRS when the write model and the read model coincide: one WRITE path, multiple READ paths.

Variant Read Models: VIEWS

WRITE lands on the Primary; READs are served from variant views.

CRUD vs. CQRS

WRITE → Primary
READ → 2ndary
READ → 2ndary

Where it Gets Tricky

CRUD vs. CQRS

WRITE → Primary
READ → 2ndary
READ → 2ndary

Consistent Replay

+4-2+5+7-4+9-3-6+2+1+1-5

t(5) = 10 → Exchange Rate Server

10 × 0.88 = 8.80€

Consistent Replay

+4-2+5+7-4+9-3-6+2+1+1-5

t(5) = 10 → Exchange Rate Server

10 × 0.87 = 8.70€

+4-2+5+7-4

+9-3-6+2+1+1-5

0.88 Exchange Rate Server

Consistent Replay
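One way to make replay deterministic, as the slide recording 0.88 into the stream suggests, is to capture the exchange rate inside the event at write time, so a later replay never consults the live rate server. An illustrative sketch (the record and field names are mine, not from the talk):

```java
public class ConsistentReplay {
    // The rate observed at event time is stored in the event itself.
    record FareEvent(int amountUsd, double eurPerUsd) {}

    // Conversion always uses the captured rate, never a live lookup,
    // so every replay of this event produces the same result.
    static double toEur(FareEvent e) {
        return e.amountUsd() * e.eurPerUsd();
    }

    public static void main(String[] args) {
        FareEvent e = new FareEvent(10, 0.88); // rate captured at write time
        System.out.println(toEur(e)); // 8.8 on every replay
    }
}
```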

Where it Gets Trickier Still

External Updates

+4-2+5+7-4+9-3-6+2+1+1-5

Notification → Subscriber

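A replayed event must not re-trigger external side effects such as subscriber notifications. A toy sketch of that separation (method and field names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class ExternalUpdates {
    final List<Integer> log = new ArrayList<>();
    int notificationsSent = 0;

    // Appending a brand-new event publishes a notification exactly once.
    void append(int event) {
        log.add(event);
        publish(event);
    }

    void publish(int event) {
        notificationsSent++; // stand-in for notifying a real subscriber
    }

    // Replay rebuilds state from history, deliberately without publishing.
    int replay() {
        int state = 0;
        for (int e : log) {
            state += e;
        }
        return state;
    }

    public static void main(String[] args) {
        ExternalUpdates eu = new ExternalUpdates();
        eu.append(4);
        eu.append(-2);
        eu.append(5);
        System.out.println(eu.replay());          // 7
        System.out.println(eu.notificationsSent); // still 3, not 6
    }
}
```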

Gaining Insight

MongoDB + Spark

Level Setting

TROUGH OF DISILLUSIONMENT

Interactive Shell | Easy(-er) | Caching

[The Hadoop / Spark ecosystem diagram, built up layer by layer:]

Distributed Data: HDFS
Distributed Resources: Stand Alone | YARN | Mesos
Distributed Processing: MapReduce | Spark
Domain Specific Languages: Hive | Pig | SparkSQL | Spark Shell | Spark Streaming

[Diagram: a Master coordinating Worker Nodes, each running an executor]

Java Driver

Hadoop Connector

Driver Application

Parallelization

Parallelize = x

Transformations

Parallelize = x,  t(x) = x',  t(x') = x''

Transformations: filter(func) | map(func) | union(otherRDD) | intersection(otherRDD) | distinct()

Action

f(x'') = y,  Parallelize = x,  t(x) = x',  t(x') = x''

Actions: collect() | count() | first() | take(n) | reduce(func)
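The transformation/action split works like lazy intermediate operations versus a terminal operation in java.util.stream. A local analogy in plain Java (not Spark itself), using the slides' event values:

```java
import java.util.List;

public class TransformAction {
    // Transformations (like an RDD's filter/map) are lazy and only build
    // the pipeline; the action (reduce) at the end forces evaluation.
    static int sumOfCredits(List<Integer> events) {
        return events.stream()
            .filter(e -> e > 0)       // transformation: lazy
            .reduce(0, Integer::sum); // action: triggers computation
    }

    public static void main(String[] args) {
        List<Integer> events =
            List.of(4, -2, 5, 7, -4, 9, -3, -6, 2, 1, 1, -5);
        System.out.println(sumOfCredits(events)); // 4+5+7+9+2+1+1 = 29
    }
}
```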

Lineage

f(x'') = y,  Parallelize = x,  t(x) = x',  t(x') = x''

Parallelize → Transform → Transform → Action

Lineage

[Diagram: five parallel pipelines, each Parallelize → Transform → Transform → Action]

http://www.blimpyacht.com/2016/02/03/a-visual-guide-to-the-spark-hadoop-ecosystem/

https://github.com/mongodb/mongo-hadoop

Spark Configuration

Configuration conf = new Configuration();
conf.set("mongo.job.input.format",
         "com.mongodb.hadoop.MongoInputFormat");
conf.set("mongo.input.uri",
         "mongodb://localhost:27017/db.collection");

Spark Context

JavaPairRDD<Object, BSONObject> documents = context.newAPIHadoopRDD(
    conf,
    MongoInputFormat.class,
    Object.class,
    BSONObject.class
);

Spark Submit

/usr/local/spark-1.5.1/bin/spark-submit \
    --class com.mongodb.spark.examples.DataframeExample \
    --master local \
    Examples-1.0-SNAPSHOT.jar


JavaRDD<Message> messages = documents.map(
    new Function<Tuple2<Object, BSONObject>, Message>() {
        public Message call(Tuple2<Object, BSONObject> tuple) {
            BSONObject header = (BSONObject) tuple._2.get("headers");
            Message m = new Message();
            m.setTo((String) header.get("To"));
            m.setX_From((String) header.get("From"));
            m.setMessage_ID((String) header.get("Message-ID"));
            m.setBody((String) tuple._2.get("body"));
            return m;
        }
    });

THE FUTURE, AND BEYOND THE INFINITE

Spark Connector

Aggregation Filters: $match | $project | $group

Data Locality mongos

THANKS! @blimpyacht