the single most important decision in designing your distributed system

Your systems. Working as one.
The Single Most Important Decision in Designing Your Distributed System
Stan Schneider, Ph.D., CEO

Upload: real-time-innovations-rti

Post on 24-Jun-2015


Category:

Technology



DESCRIPTION

Watch on-demand here: http://ecast.opensystemsmedia.com/339

Distributed systems work by sending information between otherwise independent applications. Traditionally, that communication is done by passing messages between the various nodes. This "message-centric" approach takes many forms, from simple direct transmissions to more complex message queue and transactional systems. All share a common premise: the unit of information exchange is the message itself. The infrastructure's role is to ensure that messages get to their intended recipients.

More recently, another paradigm has become popular. In this approach, the distributed infrastructure takes more responsibility; it offers the distributed system a single version of "truth." The fundamental unit of communication is a data-object value; the infrastructure has done its job not when a message is delivered, but when all nodes have the correct understanding of that value. Because the focus is on the data itself, this is termed "data-centric" infrastructure.

While both types of middleware serve to connect distributed systems, the approaches are quite different. The single most important decision you make when designing your distributed system, whether to go with the message-centric or the data-centric approach, will result in different system capabilities, strengths, and weaknesses. In this webinar, we will examine both, discuss the differences, and clarify which application use cases are best served by data-centric and message-centric designs.

TRANSCRIPT

Page 1: The Single Most Important Decision in Designing Your Distributed System

Your systems. Working as one.

The Single Most Important Decision in Designing Your Distributed System

Stan Schneider, Ph.D., CEO

Page 2: The Single Most Important Decision in Designing Your Distributed System

Distributed Systems

Page 3: The Single Most Important Decision in Designing Your Distributed System

Driver: Scale

• More things producing and consuming data
• Greater volume of data
• System of systems integration

Page 4: The Single Most Important Decision in Designing Your Distributed System

Distributed Infrastructure Functions

• Communicate between concurrent apps
• Share and manage state
• Discover participants and data
• Add & delete participants
• Enforce rules of proper operation
• Deal with & recover from errors
• And on and on and on…

Page 5: The Single Most Important Decision in Designing Your Distributed System

Communications Paradigms

What do applications exchange?

Page 6: The Single Most Important Decision in Designing Your Distributed System

Message Centric Approach

• Traditional middleware exchanges messages
• Infrastructure is unaware of the content
• Developers write applications that send messages between participants

Popular standards: JMS API; AMQP wire spec

Page 7: The Single Most Important Decision in Designing Your Distributed System

Data Centric Approach

• Data-centric middleware maintains state
• Infrastructure manages the content
• Developers write applications that read and update a virtual global data space

Persistence Service

Recording Service

Source (Key)  Power  Phase
WPT1  37.4  122.0  -12.20
WPT2  10.7  74.0  -12.23
WPTN  50.2  150.07  -11.98

Popular standards: DDS API, wire spec

Page 8: The Single Most Important Decision in Designing Your Distributed System

Middleware Evolution

Point-to-Point Client/Server Publish/Subscribe

Brokered ESB

Daemon

Pub/Sub Messaging

Data-Centric Publish/Subscribe (DCPS)

Data-Centric

Page 9: The Single Most Important Decision in Designing Your Distributed System

A Simple Distributed Thermometer

• Message-centric “verbs”
  – Individual messages: “set sensor B3 to 27”
  – Application processed
• Data-centric “nouns”
  – Update space: B3 -> 27
• Late joiner?
  – Message: send everything again
  – Data: just read last value

Alarm Monitor

Process Control

Quality Logging

Boiler B3 Temp
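The late-joiner difference on this slide can be sketched in a few lines. This is a minimal illustration, not any real middleware API: `MessageBus` and `DataSpace` are hypothetical classes invented here, and the only values taken from the slide are the sensor name B3 and the reading 27.

```python
class MessageBus:
    """Message-centric sketch: forwards each message to current subscribers and keeps nothing."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def send(self, message):
        for deliver in self.subscribers:
            deliver(message)


class DataSpace:
    """Data-centric sketch: stores the last value per key and replays it to late joiners."""

    def __init__(self):
        self.values = {}
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)
        # A late joiner immediately receives the current state.
        for key, value in self.values.items():
            callback(key, value)

    def update(self, key, value):
        self.values[key] = value
        for deliver in self.subscribers:
            deliver(key, value)


# Message-centric: the message is sent before anyone is listening.
bus = MessageBus()
bus.send("set sensor B3 to 27")
late_messages = []
bus.subscribe(late_messages.append)

# Data-centric: the value is written before anyone is listening.
space = DataSpace()
space.update("B3", 27)
late_state = {}
space.subscribe(lambda key, value: late_state.update({key: value}))

print(late_messages)  # the late subscriber received nothing
print(late_state)     # the late subscriber sees the last value of B3
```

In the message-centric case the sender would have to resend everything for the late joiner to catch up; in the data-centric case the joiner simply reads the last value.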

Page 10: The Single Most Important Decision in Designing Your Distributed System

Sharing and Managing State

Who manages truth?

Page 11: The Single Most Important Decision in Designing Your Distributed System

Managing State

• Message centric
  – Infrastructure doesn’t know state
  – MW controls message behavior. Delivers all (indistinguishable) messages equally.
  – Applications manage state
• Data centric
  – Infrastructure knows schemas, behavior
  – MW controls data behavior. Can deliver messages based on their content and properties.
  – Infrastructure disseminates and manages state

Page 12: The Single Most Important Decision in Designing Your Distributed System

What is truth?

• Data-centric architecture is responsible for managing “truth”
  – How reliable is the information?
  – How old is it?
  – How many past versions are saved?
  – What delivery guarantees are there?
  – What happens if a producer fails?
• Called “Quality of Service” by DDS
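Two of the questions above ("How old is it?" and "How many past versions are saved?") can be sketched as settings on a stored data object. This is only an illustration under assumptions: `Qos`, `Instance`, `history_depth`, and `lifespan_s` are hypothetical names made up for this sketch, not the DDS API, and the timestamps are arbitrary.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Qos:
    history_depth: int = 1    # how many past versions are saved
    lifespan_s: float = 60.0  # how old a sample may be before it is ignored


@dataclass
class Instance:
    """One data object whose stored samples are governed by its Qos."""

    qos: Qos
    samples: deque = field(init=False)

    def __post_init__(self):
        # A bounded deque drops the oldest sample once history_depth is exceeded.
        self.samples = deque(maxlen=self.qos.history_depth)

    def write(self, value, timestamp):
        self.samples.append((timestamp, value))

    def read(self, now):
        # Return only the samples that are still within their lifespan.
        return [value for (stamp, value) in self.samples
                if now - stamp <= self.qos.lifespan_s]


boiler = Instance(Qos(history_depth=2, lifespan_s=10.0))
boiler.write(25.0, timestamp=0.0)
boiler.write(26.0, timestamp=5.0)
boiler.write(27.0, timestamp=12.0)  # pushes out the oldest sample

recent = boiler.read(now=14.0)  # both kept samples are still fresh
fresh = boiler.read(now=16.0)   # the sample written at t=5.0 has expired
print(recent, fresh)
```

The point of the sketch is where the policy lives: the reader asks for data and the store applies the age and history rules, rather than each application re-implementing them.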

Page 13: The Single Most Important Decision in Designing Your Distributed System

What’s the Temperature?

• Message centric
  – Messages: readings, changes, anything
  – Delivers all messages equally
  – Application determines truth
• Data centric
  – Knows temp is a float
  – Picks and chooses whom to deliver it to
  – MW delivers truth

Alarm Monitor

Process Control

Quality Logging

Boiler B3 Temp
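Because the infrastructure knows the temperature is a float, it can decide per subscriber whether a sample is worth delivering. The sketch below illustrates that idea with a hypothetical `TempTopic` class; the class, the `predicate` parameter, and the 100.0 alarm threshold are assumptions for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoilerTemp:
    boiler_id: str
    celsius: float


class TempTopic:
    """Typed topic sketch: checks the schema and filters delivery by content."""

    def __init__(self):
        self.readers = []  # list of (predicate, callback) pairs

    def subscribe(self, callback, predicate=lambda sample: True):
        self.readers.append((predicate, callback))

    def publish(self, sample):
        # The infrastructure knows the schema, so it can reject malformed data.
        if not isinstance(sample.celsius, float):
            raise TypeError("celsius must be a float")
        for predicate, callback in self.readers:
            if predicate(sample):
                callback(sample)


topic = TempTopic()
alarms, log = [], []

# The alarm monitor only wants readings above a (hypothetical) threshold.
topic.subscribe(alarms.append, predicate=lambda s: s.celsius > 100.0)
# Quality logging wants every reading.
topic.subscribe(log.append)

topic.publish(BoilerTemp("B3", 27.0))
topic.publish(BoilerTemp("B3", 120.5))

print([s.celsius for s in alarms])
print(len(log))
```

A message-centric bus could not apply the filter itself, since it does not look inside the messages; each receiving application would have to inspect and discard them.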

Page 14: The Single Most Important Decision in Designing Your Distributed System

Coupling

• Verb-based: applications interact with each other

• Noun-based: applications interact with data model

Set B3 to 27 B3

Page 15: The Single Most Important Decision in Designing Your Distributed System

Database Analogy

• Files: “message” centric
  – Application defined structure
  – Simpler
  – No common tools, consistency
• Database: data centric
  – Key value: source of consistent truth
  – Known schema
  – CRUD interface

• Data-centric middleware is data-centric infrastructure for moving data

Page 16: The Single Most Important Decision in Designing Your Distributed System

State & Scaling…an Example

Page 17: The Single Most Important Decision in Designing Your Distributed System

State & Scaling

Node/Service

Interface: Colors denote uniqueness

Data State: Colors denote uniqueness

Page 18: The Single Most Important Decision in Designing Your Distributed System

Application Centric Development

• Scales O(n^2) only if
  – each system invokes the specified interface of the system that it wishes to communicate with, and there is no application state
  – n is the number of interfaces

Page 19: The Single Most Important Decision in Designing Your Distributed System

Application Centric Development

• Scales O(n^3)
  – Each system must understand the interaction patterns and the remote data state…
  – n is the number of data states

Page 20: The Single Most Important Decision in Designing Your Distributed System

Application Centric Development

• O(n^3) scaling
  – ~8 times more complex

Page 21: The Single Most Important Decision in Designing Your Distributed System

Message Centric Development

• Delegate syntactic interoperability
  – Scales O(n^2)
  – State is still in apps

Page 22: The Single Most Important Decision in Designing Your Distributed System

Data-Centric Development

• Delegate state to middleware
  – Enables interoperability
  – Scales O(n)
  – State synced with middleware

Page 23: The Single Most Important Decision in Designing Your Distributed System

Data-Centric Development

• Delegate state to middleware
  – Enables interoperability
  – Scales O(n)
  – State synced with middleware
  – Can move state into the middleware and infrastructure
  – Data-Centric Pub/Sub
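To get a feel for the three scaling classes named on these slides, the short sketch below computes how much the cost grows when n doubles. The helper `growth_when_doubled` and the choice of n = 10 are assumptions made for illustration. Reading the slides' "~8 times more complex" figure as the effect of doubling n under O(n^3) scaling is one possible interpretation; the slides do not say how that number was derived.

```python
def growth_when_doubled(exponent, n=10):
    """Return cost(2n) / cost(n) for a cost that grows as n ** exponent."""
    return (2 * n) ** exponent / n ** exponent


for label, exponent in [("O(n)", 1), ("O(n^2)", 2), ("O(n^3)", 3)]:
    print(label, growth_when_doubled(exponent))
```

Doubling n doubles a linear cost, quadruples a quadratic one, and multiplies a cubic one by eight, which is why moving state out of the applications matters more as the system grows.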

Page 24: The Single Most Important Decision in Designing Your Distributed System

Example: Data-Centric ‘Track’

• Middleware knows what a “track” is
• Each is an identified object; CRUD interface
• System maintains the track list

Publish / Subscribe

New      45.6  78.9  “ID:UA23”
Update   56.7  89.0  “ID:UA23”
New      65.4  32.1  “ID:DL87”
Dispose  “ID:UA23”  X
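The New / Update / Dispose sequence in this example can be replayed against a keyed store to show what "the system maintains the track list" means. This is a sketch under assumptions: `TrackList` and its `apply` method are hypothetical, and the two numbers per track are treated as an opaque pair because the slide does not label them.

```python
class TrackList:
    """Keeps one entry per track ID, the way a keyed data space would."""

    def __init__(self):
        self.tracks = {}  # track ID -> latest pair of values

    def apply(self, action, track_id, values=None):
        if action == "New":
            if track_id in self.tracks:
                raise KeyError(f"{track_id} already exists")
            self.tracks[track_id] = values
        elif action == "Update":
            if track_id not in self.tracks:
                raise KeyError(f"{track_id} is unknown")
            self.tracks[track_id] = values
        elif action == "Dispose":
            del self.tracks[track_id]
        else:
            raise ValueError(f"unsupported action: {action}")


track_list = TrackList()
# The four operations shown on the slide, in order.
for action, track_id, values in [
    ("New", "ID:UA23", (45.6, 78.9)),
    ("Update", "ID:UA23", (56.7, 89.0)),
    ("New", "ID:DL87", (65.4, 32.1)),
    ("Dispose", "ID:UA23", None),
]:
    track_list.apply(action, track_id, values)

print(track_list.tracks)  # only DL87 remains after UA23 is disposed
```

Subscribers never have to reconstruct the list from a message history; they read the current set of tracks, keyed by ID.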

Page 25: The Single Most Important Decision in Designing Your Distributed System

The Single Most Important Decision?

Will the apps manage truth & the infrastructure exchange messages?

Or

Will the infrastructure manage truth & the apps exchange state?

Page 26: The Single Most Important Decision in Designing Your Distributed System

What to Consider?

Design Factors

Page 27: The Single Most Important Decision in Designing Your Distributed System

Not Black & White

Data Centric

Message Centric

Page 28: The Single Most Important Decision in Designing Your Distributed System

How Important is Controlled State?

• Unmanaged state leads quickly to inconsistency
• Data centric pros
  – Multiple consumers access consistent state
  – Clear data exchange rules
  – Failover, tools, selective delivery natural; simpler applications
• Data centric cons
  – Agree on a shared data model (!)
  – Upfront data model, property design
  – Less design flexibility in applications

Page 29: The Single Most Important Decision in Designing Your Distributed System

How Challenging is Integration?

• With data-centric, all apps can examine, match, and deliver the right data

• Components learn schema and thus interoperate

• Enables generic tools, reusable applications, selective delivery, and structured designs

Page 30: The Single Most Important Decision in Designing Your Distributed System

Distributing Work or Information?

• Message-centric best when distributing “things” and “processing” them
  – Very mature message-centric techniques
  – Transactional behavior
  – Information is naturally centralized
  – Load balancing that doles out single messages across many servers
• Data-centric systems excel at scalable information distribution
  – Efficient fanout
  – Selective control of bandwidth & delivery
  – Tight latency or real-time delivery constraints
  – Easier discovery, late-joiner updates
  – Integrating many different functions
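The contrast between distributing work and distributing information can be sketched as two delivery rules. Both functions below are hypothetical illustrations: round-robin is just one simple way to dole out single messages across servers, and the server and subscriber names are made up.

```python
from itertools import cycle


def distribute_work(messages, workers):
    """Work distribution: each message goes to exactly one worker (round-robin)."""
    assignment = {worker: [] for worker in workers}
    for message, worker in zip(messages, cycle(workers)):
        assignment[worker].append(message)
    return assignment


def distribute_information(updates, subscribers):
    """Information distribution: every subscriber receives every update (fanout)."""
    return {subscriber: list(updates) for subscriber in subscribers}


jobs = ["job1", "job2", "job3", "job4"]
work = distribute_work(jobs, ["server_a", "server_b"])
info = distribute_information([("B3", 27)], ["alarm", "control", "logging"])

print(work)  # each job appears under exactly one server
print(info)  # the same update appears under every subscriber
```

In the first case a message is a unit of work that must be handled once; in the second, an update is a piece of state that every interested party should end up knowing.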


Page 31: The Single Most Important Decision in Designing Your Distributed System

What’s the Project Lifecycle?

• Messaging is simpler for connecting point A to point B
  – Fielded in a few months
  – Limited to a small number of cooperating applications
  – Developed by a cohesive team
  – Short lifetime
• Data-centric systems build lasting architecture
  – Upfront data model, interface design
  – Less familiar concepts
  – Can add applications, scale, and evolve
  – Distributed teams, complex system integration, and long-term viability justify initial investment

Page 32: The Single Most Important Decision in Designing Your Distributed System

How to Choose?

Page 33: The Single Most Important Decision in Designing Your Distributed System

Clarify requirements & factors

• Develop requirements
  – Map out data content and exchange
  – Consider scenarios that expose coupling
  – Analyze performance and scaling
  – Plan for evolution
• Consider key factors
  – Controlled state
  – Integration
  – Work or information
  – Lifecycle

Page 34: The Single Most Important Decision in Designing Your Distributed System

Choose the best fit

• Data centric and message centric are both proven technologies

• Both can solve many (any?) problems

• Choose the approach that reduces coupling, delivers performance, and supports your long-term architecture

Page 35: The Single Most Important Decision in Designing Your Distributed System

Download Connext Free Trial NOW

www.rti.com/downloads

Page 36: The Single Most Important Decision in Designing Your Distributed System

Your Systems. Working as One (SM)