Modernizing with Microservices and Fast Data

Presented by Patrick Di Loreto Head of Engineering Site: https://developer.williamhill.com/ BLOG: http://patricknoir.blogspot.com Twitter: https://twitter.com/patricknoir Modernizing with Microservices and Fast Data

Upload: patrick-di-loreto

Post on 18-Jan-2017


Category:

Software



TRANSCRIPT

Page 1: Modernizing with microservices and fast data

Presented by Patrick Di Loreto
Head of Engineering

Site: https://developer.williamhill.com/
BLOG: http://patricknoir.blogspot.com
Twitter: https://twitter.com/patricknoir

Modernizing with Microservices and Fast Data

Page 2: Modernizing with microservices and fast data

Big Data in Numbers

By the end of 2016 there will be more than 25,000,000,000 devices connected to the internet

In 2013 we produced more data in 2 days than in the whole of human history before then

Page 3: Modernizing with microservices and fast data

What does it mean for us

- 160 TB of data flows through our systems every day

- We push more than 5 million price changes in real time

- On a busy day we have ½ million simultaneous customers on our platform

Page 4: Modernizing with microservices and fast data

The Challenge

Build a data platform suitable for the development of modern applications

Page 5: Modernizing with microservices and fast data

Requirements

- Be able to process large amounts of data in near real time

- Respect non-functional requirements such as:
  - FAULT TOLERANCE
  - HIGH AVAILABILITY
  - SCALABILITY

- Deal with existing/legacy systems

- Scale team delivery capability through adoption of a Microservices Architecture

Page 6: Modernizing with microservices and fast data

False Myth: Microservices Architecture 1/2

• Microservices are not exclusively STATELESS applications!

[Diagram: a MONOLITH next to services A, B and C]

Page 7: Modernizing with microservices and fast data

False Myth: Microservices Architecture 2/2

• Achieve great ISOLATION without using synchronous protocols

[Diagram labels: services A, B, C, D and E; Message Bus; Monolith; DB]

Page 8: Modernizing with microservices and fast data

Omnia – Distributed Data Management Platform

Respecting Reactive Principles
Based on a Lambda Architecture

• Chronos – Data Source
• Fates – Batch Layer
• NeoCortex – Speed Layer
• Hermes – Serving Layer
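As a rough illustration of the Lambda Architecture split listed on this slide, the sketch below answers a query by combining a precomputed batch view with a speed-layer view. The names `batch_view`, `speed_view`, `on_event` and `serve`, and the bet-count example, are assumptions made for illustration; they are not Omnia's actual API.

```python
from collections import Counter

# Batch layer (Fates in Omnia's naming): counts computed from the full
# history up to the last batch run. Values here are made up.
batch_view = Counter({"customer-123": 40, "customer-456": 7})

# Speed layer (NeoCortex): counts for incidents that arrived after that run.
speed_view = Counter()

def on_event(customer_id: str) -> None:
    """Speed-layer update, called once for every new incident."""
    speed_view[customer_id] += 1

def serve(customer_id: str) -> int:
    """Serving-layer query: batch result plus the real-time delta."""
    return batch_view[customer_id] + speed_view[customer_id]

on_event("customer-123")
on_event("customer-123")
on_event("customer-789")
```

Here `serve("customer-123")` adds the two recent events to the batch count, while a customer seen only by the speed layer is still answerable before the next batch run.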

Page 9: Modernizing with microservices and fast data

Omnia Chronos – Data Source

Page 10: Modernizing with microservices and fast data

Omnia Chronos

Chronos is in charge of collecting/intercepting data from different sources and making it available as streams of observable events.

Observable[ ]

• Social media
  • Facebook
  • Twitter

• Affiliates

• Page viewing
  • Articles read, following and followers, bets etc…

• Sports related
  • Tweets
  • News

• Gaming

• Web Analytics
  • Activities within our applications

Sources are labelled as Internal (Product Centric) or External (Customer Centric).

{
  "type": "bet",
  "version": "1.0",
  "time": "2015-06-03 08:00:31",
  "acquisitionTime": "...",
  "source": "WHBetSystem",
  "payload": { … any valid json }
}
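A minimal sketch of how a consumer might read the incident envelope shown on this slide. The `Incident` class and `parse_incident` function are hypothetical names, and the `acquisitionTime` and `payload` values in the sample are invented, since the slide elides them.

```python
import json
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Incident:
    """Envelope fields as they appear in the slide's JSON sample."""
    type: str
    version: str
    time: str
    acquisition_time: str
    source: str
    payload: Any  # "any valid json" according to the slide

def parse_incident(raw: str) -> Incident:
    """Parse a JSON string into an Incident; raises KeyError if a field is missing."""
    doc = json.loads(raw)
    return Incident(
        type=doc["type"],
        version=doc["version"],
        time=doc["time"],
        acquisition_time=doc["acquisitionTime"],
        source=doc["source"],
        payload=doc["payload"],
    )

raw = json.dumps({
    "type": "bet",
    "version": "1.0",
    "time": "2015-06-03 08:00:31",
    "acquisitionTime": "2015-06-03 08:00:32",  # hypothetical value
    "source": "WHBetSystem",
    "payload": {"stake": 10},                  # hypothetical payload
})
incident = parse_incident(raw)
```

Keeping the envelope fixed and the payload free-form lets every stream share the same routing fields (`type`, `source`, `time`) while each source keeps its own body.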

Page 11: Modernizing with microservices and fast data

Omnia Chronos

In Chronos you define streams that collect data, convert it and persist it as a stream of Observable[Incident].

[Diagram: Chronos hosting Stream 1, Stream 2 and Stream 3; each stream is made of an Adapter, a Converter and a PersistenceManager]
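The Adapter, Converter and PersistenceManager stages named on this slide can be pictured as a chain of generators. This is only a sketch under assumptions: the function names, the in-memory list standing in for a durable log, and the envelope fields are illustrative, and the real Chronos is described as being built on Akka and Scala Rx rather than Python.

```python
from typing import Iterable, Iterator

def adapter(raw_source: Iterable[str]) -> Iterator[str]:
    """Collects raw records from an external source (here an in-memory list)."""
    for record in raw_source:
        yield record

def converter(records: Iterable[str], source: str) -> Iterator[dict]:
    """Wraps each raw record in an incident-like envelope."""
    for record in records:
        yield {"type": "tweet", "source": source, "payload": {"text": record}}

def persistence_manager(incidents: Iterable[dict], store: list) -> Iterator[dict]:
    """Appends each incident to the store, then passes it downstream."""
    for incident in incidents:
        store.append(incident)
        yield incident

log = []  # stands in for a durable log
stream = persistence_manager(
    converter(adapter(["goal!", "penalty"]), source="Twitter"),
    log,
)
observed = [incident["payload"]["text"] for incident in stream]
```

Because each stage only pulls from the previous one, an incident is persisted before any downstream observer sees it.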

Page 12: Modernizing with microservices and fast data

Omnia Chronos - Clustering

[Diagram: a cluster of Chronos 1, Chronos 2 and Chronos 3, with Twitter as a source]

Distributed System Properties:
1. Concurrency
2. Distribution
3. Mobility

Page 13: Modernizing with microservices and fast data

Omnia Chronos

• Chronos is built on top of Akka to leverage:
  – Location transparency (Mobility)
  – Error Kernel Pattern (fail fast and in isolation)
  – Concurrency and Distribution for Horizontal and Vertical Scalability

• We use the Scala Rx API to promote non-blocking APIs and achieve Vertical Scalability

• Data is persisted in Kafka for durability:
  – Fast write operations with Zero Copy and Filesystem Cache
  – Compaction and Compression to optimise message consumption

Page 14: Modernizing with microservices and fast data

Vertical Scalability vs Horizontal Scalability

Horizontal – Distribute the load across different machines (Akka Cluster)

Vertical – Maximise local resource utilisation (Non blocking IO + Non blocking API)
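To make the vertical case concrete, here is a small sketch of non-blocking calls overlapping on a single thread. It uses Python's `asyncio` purely for illustration (the platform itself is described as using Akka and Scala Rx), and `fetch`, the stream names and the delays are invented.

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a non-blocking network or disk call.
    await asyncio.sleep(delay)
    return name

async def main() -> list:
    # The three waits overlap on one thread instead of running back to back;
    # gather returns the results in argument order, not completion order.
    return await asyncio.gather(
        fetch("prices", 0.03),
        fetch("bets", 0.01),
        fetch("sessions", 0.02),
    )

results = asyncio.run(main())
```

While one call is waiting, the thread is free to progress the others, which is the sense in which non-blocking IO maximises local resource utilisation.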

Page 15: Modernizing with microservices and fast data

Timing for Machine Operations

Instruction                              Time
Execute typical instruction              1/1,000,000,000 sec = 1 nanosec
Fetch from L1 cache memory               0.5 nanosec
Branch misprediction                     5 nanosec
Fetch from L2 cache memory               7 nanosec
Mutex lock/unlock                        25 nanosec
Fetch from main memory                   100 nanosec
Send 2K bytes over 1Gbps network         20,000 nanosec (20µs)
Read 1MB sequentially from memory        250,000 nanosec (250µs)
Fetch from new disk location (seek)      8,000,000 nanosec (8ms)
Read 1MB sequentially from disk          20,000,000 nanosec (20ms)
Send packet US to Europe and back        150,000,000 nanosec (150ms)

Page 16: Modernizing with microservices and fast data

Humanised Time (scaled so that 1 nanosec reads as 1 s)

Instruction                              Time
Execute typical instruction              1 s
Fetch from L1 cache memory               0.5 s
Branch misprediction                     5 s
Fetch from L2 cache memory               7 s
Mutex lock/unlock                        25 s
Fetch from main memory                   1 min 40 s
Send 2K bytes over 1Gbps network         5½ hours
Read 1MB sequentially from memory        3 days
Fetch from new disk location (seek)      13 weeks
Read 1MB sequentially from disk          7½ months
Send packet US to Europe and back        5 years
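The humanised figures follow from reading every nanosecond as one second. The sketch below redoes that arithmetic; the `humanise` helper, its unit thresholds (30-day months, 365-day years) and its one-decimal rounding are assumptions, so its output is close to, but not worded exactly like, the slide.

```python
# Latencies in nanoseconds, taken from the timing slide.
LATENCIES_NS = {
    "Mutex lock/unlock": 25,
    "Fetch from main memory": 100,
    "Send 2K bytes over 1Gbps network": 20_000,
    "Read 1MB sequentially from memory": 250_000,
    "Fetch from new disk location (seek)": 8_000_000,
    "Read 1MB sequentially from disk": 20_000_000,
    "Send packet US to Europe and back": 150_000_000,
}

# Unit sizes in seconds, largest first.
UNITS = [
    ("years", 365 * 86_400),
    ("months", 30 * 86_400),
    ("days", 86_400),
    ("hours", 3_600),
    ("min", 60),
]

def humanise(nanoseconds: float) -> str:
    """Treat 1 ns as 1 s and express the result in the largest fitting unit."""
    seconds = nanoseconds  # the scaling itself: 1 ns reads as 1 s
    for name, size in UNITS:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:g} s"

humanised = {name: humanise(ns) for name, ns in LATENCIES_NS.items()}
```

On this scale a main-memory fetch takes under two minutes while a transatlantic round trip takes years, which is the gap the slide is drawing attention to.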

Page 17: Modernizing with microservices and fast data

Omnia Fates

Page 18: Modernizing with microservices and fast data

Omnia Fates

Fates represents the long term memory of Omnia. It is in charge of organising all the incidents recorded by Chronos into timelines and of creating new information as views by using machine learning, logical reasoning and time series analysis.

• A timeline represents the history: the sequence of incidents performed by a specific entity over time. Timelines are organised by category. An example is the customer timeline, which might contain all the bets placed, deposit and withdrawal activities, tweets etc. performed by a specific customer. A timeline category is not limited to customers; it can be anything, for example a sport event such as a football match or a competition.

• Views are the result of jobs that process data from:
  – Timelines
  – Other Views

Page 19: Modernizing with microservices and fast data

Fates represents the long term memory of Omnia. It organizes the incidents that Chronos collected into timelines and also elaborates new information as views by using machine learning, logical reasoning and time series analysis.

Fates: Batch layer

Omnia: Distributed & Reactive platform for data management

[Diagram: Timelines & Views in the Fates Batch Layer. Timeline "Customer: 123": Login, Deposit, Bet placed, Logout. Timeline "Event: 78": Started, Fault, Penalty, Goal. Views: Bets, Deposits, Events, Session.]

Page 20: Modernizing with microservices and fast data

Omnia Fates

Timelines are created from timeline streams; each timeline stream reads data from a Chronos stream and feeds the right timeline.

[Diagram labels: Chronos, Fates]

Page 21: Modernizing with microservices and fast data

Omnia Fates - Cassandra

• Fates persists timelines of incidents.

• Column Family Name: <TimelineCategory>_tl

• Key Definition: ( (entityId, date), timestamp )

• The partition key (entityId, date) is a strong hash key: it gives a well balanced Cassandra cluster

• Composite key: incidents are ordered by timestamp under a specific entity within a day (date = yyyy-MM-dd)
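The key definition ( (entityId, date), timestamp ) can be pictured with a small in-memory model: the pair (entityId, date) selects a partition, and rows inside it stay ordered by timestamp. This is a sketch of the data layout only, not Cassandra client code; `timeline`, `append_incident` and `incidents_for_day` are hypothetical names.

```python
import bisect
from collections import defaultdict
from datetime import datetime

# (entity_id, "yyyy-MM-dd") -> rows kept sorted by timestamp
timeline = defaultdict(list)

def append_incident(entity_id: str, timestamp: datetime, incident: str) -> None:
    """Route the incident to its day partition and keep timestamp order."""
    partition_key = (entity_id, timestamp.strftime("%Y-%m-%d"))
    bisect.insort(timeline[partition_key], (timestamp, incident))

def incidents_for_day(entity_id: str, date: str) -> list:
    """Read one partition: the incidents of one entity on one day, in time order."""
    return [incident for _, incident in timeline.get((entity_id, date), [])]

append_incident("123", datetime(2015, 6, 3, 8, 0, 31), "Bet placed")
append_incident("123", datetime(2015, 6, 3, 7, 59, 0), "Deposit")
append_incident("123", datetime(2015, 6, 4, 9, 0, 0), "Logout")
```

Including the date in the partition key bounds the size of each partition to one day of activity per entity, and reading a day back is a single ordered scan.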

Page 22: Modernizing with microservices and fast data

Omnia Fates – Separation of Concerns

• Multi Data Center application for operation and analytics/reporting

• Online analysis versus ETL!

Page 23: Modernizing with microservices and fast data

Omnia Fates

• We build views with jobs able to do:

Logical Reasoning
• Deduction
• Induction
• Abduction

Time line analysis
• Trends
• Cycles
• Seasonality

Other ML
• Classification
• Clustering
• Predictions

Jobs are performed on top of NeoCortex

Page 24: Modernizing with microservices and fast data

Omnia Neo Cortex

Page 25: Modernizing with microservices and fast data

Omnia Neo Cortex

• NeoCortex is a runtime platform and a set of libraries to perform concurrent and distributed computations in a highly resilient way.

• It was initially designed as a library on top of Spark (Streaming), but it evolved into a platform for Reactive Microservices which allows applications to be built with:
  – SPARK STREAMING
  – AKKA STREAMS
  – WILLIAM HILL LAMBDAS

• Applications are deployed in NeoCortex as Docker-isolated microservices. They can interact with each other using Chronos streams and with client applications through Hermes.

Page 26: Modernizing with microservices and fast data

Omnia Neo Cortex – SPARK STREAMING

Page 27: Modernizing with microservices and fast data

Omnia Neo Cortex - Parallelism

[Diagram labels: chronos stream; Driver; Executors 1 to 4; Stage 1 (map); Stage 2 (reduceByKey); Hermes (Serving Layer); Fates timelines and views]
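The two stages named in the diagram, map and reduceByKey, can be imitated in plain Python to show what each one contributes. This is not Spark code: the incident fields and the stake totals are invented, and the grouping step stands in for the shuffle that a cluster would perform between stages.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical incidents arriving from a stream.
incidents = [
    {"type": "bet", "customer": "123", "stake": 10},
    {"type": "bet", "customer": "456", "stake": 5},
    {"type": "bet", "customer": "123", "stake": 20},
]

# Stage 1 (map): turn each incident into a (key, value) pair.
# Each record is handled independently, so this stage parallelises freely.
pairs = [(incident["customer"], incident["stake"]) for incident in incidents]

# Shuffle: bring all values for the same key together.
grouped = defaultdict(list)
for key, value in pairs:
    grouped[key].append(value)

# Stage 2 (reduceByKey): combine the values of each key with an
# associative function, here addition.
totals = {
    key: reduce(lambda a, b: a + b, values)
    for key, values in grouped.items()
}
```

Because the combining function is associative, partial sums can be computed on separate executors and merged afterwards without changing the result.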

Page 28: Modernizing with microservices and fast data

Neocortex - Hiding Complexity

Page 29: Modernizing with microservices and fast data

Omnia Hermes

Page 30: Modernizing with microservices and fast data

Omnia Hermes

Hermes is the layer in which data is presented for consumption: B2B and B2C. At its foundation, micro-services, notifications and data as API are key aspects of the design.

Scalable and simple full duplex communication for the web

Express the correlation between the entities of the model

Inspired by Falcor (Netflix) and GraphQL (Facebook)

Page 31: Modernizing with microservices and fast data

Omnia Hermes

[Diagram labels: Hermes Distributed Cache; Hermes Node; Local Cache; Subscription Manager; Client Manager; Authentication Handler; Dispatcher; HTTP; WS; TCP; Browser; Hermes JS; WH Apps; Chronos]
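The slides name a Subscription Manager and describe notifications as a key aspect of Hermes, without showing how it works. The sketch below is one plausible reading, in which clients subscribe to a path and receive pushed updates. The class shape, the method names and the path format are all assumptions, not the Hermes API.

```python
from collections import defaultdict
from typing import Callable

class SubscriptionManager:
    """Tracks which client callbacks want updates for which path."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, path: str, callback: Callable[[str, object], None]) -> None:
        """Register a callback to be invoked when the path changes."""
        self._subscribers[path].append(callback)

    def publish(self, path: str, value: object) -> None:
        """Push the new value to every client subscribed to this path."""
        for callback in self._subscribers.get(path, []):
            callback(path, value)

received = []
manager = SubscriptionManager()
manager.subscribe("events/78/score", lambda path, value: received.append((path, value)))
manager.publish("events/78/score", "1-0")
manager.publish("events/99/score", "0-0")  # no subscribers, so nothing is delivered
```

Pushing only to subscribed paths keeps traffic proportional to what each client is actually watching, which suits a full duplex channel such as a WebSocket.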

Page 32: Modernizing with microservices and fast data

Omnia Infrastructure – Mesos/Marathon/Docker

Page 33: Modernizing with microservices and fast data

Omnia Infrastructure

[Diagram: layered stack, top to bottom: Omnia, Docker, Marathon, Mesos, running on five Nodes]

Page 34: Modernizing with microservices and fast data

Questions
