Dataflow: The Concurrency/Parallelism Architecture You Need
Post on 15-Jan-2015
@russel_winder #devoxxuk #dataflowrules Copyright © 2014 Russel Winder
Dataflow: The Concurrency/Parallelism Architecture You Need
Russel Winder, @russel_winder, http://www.russel.org.uk, russel@winder.org.uk
What is Dataflow?
What are (in computing†):
Concurrency:
Structuring the solution and code so that multiple parts may execute independently, possibly even at the same time.
Parallelism:
Executing multiple parts of a system at the same time on different processors so as to get results faster.
†In natural language these words have very different meanings.
What is Dataflow?
An architecture comprising channels allowing data to flow from one operator to another, where each operator has multiple input channels and multiple output channels, and executes code only in response to the arrival of data on the inputs.
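A minimal sketch of one such operator, written in Python rather than anything shown in the talk. It assumes `queue.Queue` as the channel, one thread per operator, and `None` as an end-of-stream marker; the name `adder` is made up for illustration. The operator does nothing until data has arrived on both of its inputs.

```python
import queue
import threading

def adder(in_a, in_b, out):
    """Operator: wait for one value on each input channel, then emit their sum."""
    while True:
        a = in_a.get()  # blocks until data arrives on the first input
        b = in_b.get()  # blocks until data arrives on the second input
        if a is None or b is None:  # None is an assumed end-of-stream marker
            out.put(None)
            return
        out.put(a + b)

# Channels are thread-safe queues; the operator runs in its own thread.
left, right, result = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=adder, args=(left, right, result), daemon=True).start()

for a, b in [(1, 10), (2, 20), (3, 30)]:
    left.put(a)
    right.put(b)
left.put(None)
right.put(None)

# Read the output channel until the end-of-stream marker arrives.
sums = list(iter(result.get, None))
print(sums)
```

The code that feeds `left` and `right` never calls the operator directly: it only puts data on channels, and the arrival of that data is what triggers the computation.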
Historically
Dataflow computers:
– Values flowing between…
– …operators that calculate…
– …new values to pass to…
– …other operators.
Dataflow hardware didn't take off, but the architecture works at various scales.
J R Gurd, C C Kirkham, I Watson, The Manchester Prototype Dataflow Computer, CACM 28(1), 1985-01.
Dataflow diagrams have been an integral part of analysis and design of information systems since the 1970s.
T DeMarco, Structured Analysis and System Specification, Yourdon Press, NY, 1978.
Dataflow and Functional
Operators seem like they might be pure functions, but…
…they are not necessarily: operators may have internal state.
Operators may be referentially transparent, but they may not be.
Operators may even have side effects.
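The distinction can be illustrated with a small Python sketch. The names `double` and `RunningTotal` are invented for this example and do not come from the talk: the first operator is a pure function, the second keeps internal state and so gives different outputs for the same input.

```python
def double(x):
    """Pure operator: the output depends only on the current input."""
    return 2 * x

class RunningTotal:
    """Stateful operator: the output depends on every value seen so far."""

    def __init__(self):
        self.total = 0  # internal state, private to this operator

    def __call__(self, x):
        self.total += x
        return self.total

running_total = RunningTotal()
pure_outputs = [double(3), double(3)]                    # same input, same output
stateful_outputs = [running_total(3), running_total(3)]  # same input, different output
print(pure_outputs, stateful_outputs)
```

The state lives inside the operator and is touched only by that operator, which is why it does not conflict with the no-shared-memory rule.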
Dataflow is an event-based architecture.
Dataflow systems are (possibly) reactive systems.
Which would make them exceedingly trendy even if the idea is very old.
Dataflow systems have no† shared memory.
† or at least should have none.
[Diagram labels: operator, channel]
Dataflow systems are message passing systems.
Each operator must† be single threaded.
† or at least should be.
Dataflow Frameworks
Scala:
– Future
Akka:
– Dataflow variables, aka Promise
– Deprecated in favour of Async
Java:
– Pre-8, Future
– 8+, CompletableFuture, aka Promise
Architectural Issue
Each of the aforementioned frameworks assumes that each operator creates a single value. Communication is by dataflow variables: each dataflow variable is a thread-safe single assignment variable.
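As a rough illustration of a single-assignment variable, the sketch below uses Python's `concurrent.futures.Future` as a stand-in for a dataflow variable (an assumption of this example, not something from the talk). The Python documentation intends `set_result` mainly for executors and tests, so read this as a demonstration of the semantics only; it needs Python 3.8 or later for `InvalidStateError`.

```python
import threading
from concurrent.futures import Future, InvalidStateError

# Three dataflow variables: each may be bound exactly once.
x, y, total = Future(), Future(), Future()

def add_when_ready():
    # result() blocks until the variable has been bound.
    total.set_result(x.result() + y.result())

threading.Thread(target=add_when_ready, daemon=True).start()

x.set_result(10)
y.set_result(5)
print(total.result(timeout=5))

# A second assignment to an already bound variable is rejected.
try:
    x.set_result(99)
    rebound = True
except InvalidStateError:
    rebound = False
print(rebound)
```

Each variable carries exactly one value from its producer to its consumers, which is the limitation this slide points at: there is no way to send a second value down the same connection.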
GPars…
Has dataflow variables (promises) and tasks and so can do everything Akka and Java can offer.
Has DataflowQueue, and so can create real dataflow networks.
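To show what a queue-based channel adds over a single-assignment variable, here is a hedged Python sketch in which `queue.Queue` plays the role of the channel. This is not the GPars API; the `operator` helper and the `END` marker are assumptions made for the example. Each operator handles a whole stream of values, one message at a time.

```python
import queue
import threading

END = object()  # assumed end-of-stream marker

def operator(fn, inbox, outbox):
    """Apply fn to each value arriving on inbox and send the result to outbox."""
    for value in iter(inbox.get, END):
        outbox.put(fn(value))
    outbox.put(END)  # pass the end marker downstream

numbers, squares, shifted = queue.Queue(), queue.Queue(), queue.Queue()

# Two operators chained by channels, each in its own thread.
threading.Thread(target=operator, args=(lambda v: v * v, numbers, squares), daemon=True).start()
threading.Thread(target=operator, args=(lambda v: v + 1, squares, shifted), daemon=True).start()

for n in range(5):
    numbers.put(n)
numbers.put(END)

output = list(iter(shifted.get, END))
print(output)
```

Both operators can be busy at the same time on different values, so the pipeline is concurrent by construction and may run in parallel where the runtime allows it.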
One does like to code…
…doesn't one.
We need a problem…
A Problem
Calculate mean and standard deviation of a data sample.
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Amend the Problem
$$s = \sqrt{\frac{1}{n-1} \left( \left( \sum_{i=1}^{n} x_i^2 \right) - n \bar{x} \bar{x} \right)}$$
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
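The slides do not spell out why the amended form is equivalent. A short derivation, with every sum taken over the n sample values and using the fact that the sum of the x_i equals n times the mean:

$$
\begin{aligned}
\sum_i (x_i - \bar{x})^2
  &= \sum_i x_i^2 - 2\bar{x} \sum_i x_i + n\bar{x}^2 \\
  &= \sum_i x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \left( \sum_i x_i^2 \right) - n\bar{x}\bar{x}
\end{aligned}
$$

The amended form needs only the count, the sum, and the sum of squares, and each of these can be accumulated independently in a single pass over the data. That is presumably why it suits a dataflow network.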
Code
Switch to using an IDE for this.
Code Example
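The live-coded example is not part of this transcript. What follows is one possible reconstruction of the idea as a Python sketch, not the GPars code shown in the talk. It assumes `queue.Queue` channels, one thread per operator, and an `END` marker; every function and channel name is hypothetical. Three accumulating operators compute the count, the sum, and the sum of squares from copies of the input stream, and two further operators combine those results using the amended formula.

```python
import math
import queue
import threading

END = object()  # assumed end-of-stream marker, not part of the talk

def spawn(target, *args):
    """Run an operator in its own thread."""
    threading.Thread(target=target, args=args, daemon=True).start()

def fan_out(inbox, *outboxes):
    """Copy every incoming value, and the end marker, to each output channel."""
    while True:
        value = inbox.get()
        for outbox in outboxes:
            outbox.put(value)
        if value is END:
            return

def fold(fn, initial, inbox, *outboxes):
    """Accumulate the stream with fn and emit one result when the stream ends."""
    accumulator = initial
    for value in iter(inbox.get, END):
        accumulator = fn(accumulator, value)
    for outbox in outboxes:
        outbox.put(accumulator)

def mean_op(sum_in, count_in, *outboxes):
    """Fire once both the sum and the count have arrived."""
    mean = sum_in.get() / count_in.get()
    for outbox in outboxes:
        outbox.put(mean)

def std_dev_op(sum_sq_in, count_in, mean_in, outbox):
    """Fire once the sum of squares, the count, and the mean have arrived."""
    sum_sq, n, mean = sum_sq_in.get(), count_in.get(), mean_in.get()
    outbox.put(math.sqrt((sum_sq - n * mean * mean) / (n - 1)))

def mean_and_std_dev(data):
    q = queue.Queue
    source, to_count, to_sum, to_sum_sq = q(), q(), q(), q()
    n_for_mean, n_for_std, total, total_sq = q(), q(), q(), q()
    mean_for_std, mean_out, std_out = q(), q(), q()

    # Wire up the network: channels connect operators, nothing is shared.
    spawn(fan_out, source, to_count, to_sum, to_sum_sq)
    spawn(fold, lambda acc, x: acc + 1, 0, to_count, n_for_mean, n_for_std)
    spawn(fold, lambda acc, x: acc + x, 0.0, to_sum, total)
    spawn(fold, lambda acc, x: acc + x * x, 0.0, to_sum_sq, total_sq)
    spawn(mean_op, total, n_for_mean, mean_for_std, mean_out)
    spawn(std_dev_op, total_sq, n_for_std, mean_for_std, std_out)

    # Feed the sample into the network, then read the two results.
    for x in data:
        source.put(x)
    source.put(END)
    return mean_out.get(), std_out.get()

mean, std_dev = mean_and_std_dev([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, std_dev)
```

The sketch requires at least two data values, since the formula divides by n − 1. The single-pass formula can also lose precision when the mean is large relative to the spread, so this is a demonstration of the architecture rather than a recommended numerical method.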
Summary
Dataflow is an architecture:
Event-driven, single-threaded operators communicating by message passing using channels.
Dataflow is an easement:
Synchronization is inherent in the model, and there is no shared memory, so all deadlocks are trivial.
Dataflow is a way of harnessing concurrency and parallelism in easy-to-program ways.
GPars is usable from Java as well as Groovy.
Testing is really Groovy with Spock.
Dataflow is an architecture of code you need to know.
Q & A