The Single Most Important Decision in Designing Your Distributed System
DESCRIPTION
Watch on-demand here: http://ecast.opensystemsmedia.com/339
Distributed systems work by sending information between otherwise independent applications. Traditionally, that communication is done by passing messages between the various nodes. This "message-centric" approach takes many forms, from simple direct transmissions to more complex message queue and transactional systems. All have a common premise: the unit of information exchange is the message itself. The infrastructure's role is to ensure that messages get to their intended recipients.
Recently, another paradigm has become popular. In this approach, the distributed infrastructure takes more responsibility; it offers the distributed system a single version of "truth." The fundamental unit of communication is a data-object value; the infrastructure has done its job not when a message is delivered, but when all nodes have the correct understanding of that value. Because the focus is on the data itself, this is termed "data-centric" infrastructure.
While both types of middleware serve to connect distributed systems, the approaches are quite different. The single most important decision you make when designing your distributed system – whether to go with the message-centric or data-centric approach – will result in different system capabilities, strengths, and weaknesses. In this webinar, we will examine both, discuss the differences, and clarify which application use cases are best served by data-centric and message-centric designs.
TRANSCRIPT
Your systems. Working as one.
The Single Most Important Decision in Designing Your Distributed System
Stan Schneider, Ph.D., CEO
Distributed Systems
Driver: Scale
• More things producing and consuming data
• Greater volume of data
• System of systems integration
Distributed Infrastructure Functions
• Communicate between concurrent apps
• Share and manage state
• Discover participants and data
• Add & delete participants
• Enforce rules of proper operation
• Deal with & recover from errors
• And on and on and on…
Communications Paradigms
What do applications exchange?
Message Centric Approach
• Traditional middleware exchanges messages
• Infrastructure is unaware of the content
• Developers write applications that send messages between participants
Popular standards: JMS API; AMQP wire spec
Data Centric Approach
• Data-centric middleware maintains state
• Infrastructure manages the content
• Developers write applications that read and update a virtual global data space
Persistence Service
Recording Service
Source(Key) Power Phase
WPT1 37.4 122.0 -12.20
WPT2 10.7 74.0 -12.23
WPTN 50.2 150.07 -11.98
Popular standards: DDS API, wire spec
Middleware Evolution
Point-to-Point Client/Server Publish/Subscribe
Brokered ESB
Daemon
Pub/Sub Messaging
Data-Centric Publish/Subscribe (DCPS)
Data-Centric
A Simple Distributed Thermometer
• Message-centric “verbs”
– Individual messages: “set sensor B3 to 27”
– Application processed
• Data-centric “nouns”
– Update space: B3 -> 27
• Late joiner?
– Message: send everything again
– Data: just read last value
Alarm Monitor
Process Control
Quality Logging
Boiler B3 Temp
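The late-joiner difference on this slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any real middleware API: `MessageBus`, `DataSpace`, and their methods are hypothetical names made up for illustration.

```python
# Toy contrast between message-centric and data-centric delivery.
# All class and method names are hypothetical, not a real middleware API.

class MessageBus:
    """Message-centric: forwards opaque messages and keeps no state."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def send(self, message):
        for callback in self.subscribers:
            callback(message)


class DataSpace:
    """Data-centric: stores the latest value per key and notifies readers."""

    def __init__(self):
        self.values = {}
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)
        # A late joiner immediately receives the current state.
        for key, value in self.values.items():
            callback(key, value)

    def update(self, key, value):
        self.values[key] = value
        for callback in self.subscribers:
            callback(key, value)


# Message-centric: the late joiner misses the earlier command.
bus = MessageBus()
bus.send("set sensor B3 to 27")
late_messages = []
bus.subscribe(late_messages.append)

# Data-centric: the late joiner reads the last value.
space = DataSpace()
space.update("B3", 27)
late_values = {}
space.subscribe(lambda key, value: late_values.update({key: value}))

print(late_messages)  # []
print(late_values)    # {'B3': 27}
```

In the message-centric case the sender (or the application) would have to resend everything for the late joiner; in the data-centric case the infrastructure already holds the last value.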
Sharing and Managing State
Who manages truth?
Managing State
• Message centric
– Infrastructure doesn’t know state
– Middleware controls message behavior. Delivers all messages equally, because they are indistinguishable
– Applications manage state
• Data centric
– Infrastructure knows schemas, behavior
– Middleware controls data behavior. Can deliver messages based on their content and properties
– Infrastructure disseminates and manages state
What is truth?
• Data-centric architecture is responsible for managing “truth”
– How reliable is the information?
– How old is it?
– How many past versions are saved?
– What delivery guarantees are there?
– What happens if a producer fails?
• Called “Quality of Service” by DDS
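Two of the questions above ("How old is it?" and "How many past versions are saved?") can be illustrated with a small sketch. The `Topic` class and its `history_depth` and `max_age` parameters are hypothetical stand-ins chosen for this example; they are not the DDS API, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque


class Topic:
    """Hypothetical topic with two QoS-like settings: history depth and max age."""

    def __init__(self, history_depth=1, max_age=None):
        self.samples = deque(maxlen=history_depth)  # keeps only the newest N samples
        self.max_age = max_age                      # seconds; None means never stale

    def write(self, value, timestamp):
        self.samples.append((timestamp, value))

    def read(self, now):
        """Return the saved values that are still fresh, oldest first."""
        return [value for (timestamp, value) in self.samples
                if self.max_age is None or now - timestamp <= self.max_age]


temp = Topic(history_depth=2, max_age=10.0)
temp.write(25.0, timestamp=0.0)
temp.write(26.0, timestamp=5.0)
temp.write(27.0, timestamp=9.0)  # pushes the 25.0 sample out of the history

print(temp.read(now=12.0))  # [26.0, 27.0]
print(temp.read(now=16.0))  # [27.0]
```

The point of the sketch is only that these policies live in the infrastructure, so every reader gets the same answer without implementing the rules itself.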
What’s the Temperature?
• Message centric
– Messages: readings, changes, anything
– Delivers all messages equally
– Application determines truth
• Data centric
– Knows temp is a float
– Picks and chooses who to deliver it to
– Middleware delivers truth
Alarm Monitor
Process Control
Quality Logging
Boiler B3 Temp
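A minimal sketch of "knows temp is a float" and "picks and chooses who to deliver it to" might look like the following. `TemperatureTopic`, the `predicate` parameter, and the 100.0 alarm threshold are assumptions made for illustration only.

```python
class TemperatureTopic:
    """Hypothetical typed topic: the infrastructure knows the value is a float."""

    def __init__(self):
        self.readers = []  # list of (predicate, callback) pairs

    def subscribe(self, callback, predicate=lambda temp: True):
        self.readers.append((predicate, callback))

    def publish(self, temp):
        # The infrastructure can reject data that does not match the schema.
        if not isinstance(temp, float):
            raise TypeError("temperature must be a float")
        # Deliver only to readers whose filter matches the content.
        for predicate, callback in self.readers:
            if predicate(temp):
                callback(temp)


alarms, log = [], []
b3 = TemperatureTopic()
b3.subscribe(alarms.append, predicate=lambda temp: temp > 100.0)  # Alarm Monitor
b3.subscribe(log.append)                                          # Quality Logging

for reading in (27.0, 105.5, 98.2):
    b3.publish(reading)

print(alarms)  # [105.5]
print(log)     # [27.0, 105.5, 98.2]
```

A message-centric bus could not do this filtering itself, because it does not know what is inside the messages it carries.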
Coupling
• Verb-based: applications interact with each other
• Noun-based: applications interact with data model
Verb: “Set B3 to 27”. Noun: B3.
Database Analogy
• Files: “message” centric
– Application defined structure
– Simpler
– No common tools, consistency
• Database: data centric
– Key value: source of consistent truth
– Known schema
– CRUD interface
• Data-centric middleware is data-centric infrastructure for moving data
State & Scaling…an Example
State & Scaling
Node/Service
Interface: Colors denote uniqueness
Data State: Colors denote uniqueness
Application Centric Development
• Scales O(n^2) only if
– each system invokes the specified interface of the system that it wishes to communicate with, and there is no application state
– n is the number of interfaces
Application Centric Development
• Scales O(n^3)
– Each system must understand the interaction patterns and the remote data state
– n is the number of data states
Application Centric Development
• O(n^3) scaling
– ~8 times more complex
Message Centric Development
• Delegate syntactic interoperability
– Scales O(n^2)
– State is still in apps
Data-Centric Development
• Delegate state to middleware
– Enables interoperability
– Scales O(n)
– State synced with middleware
– Can move state into the middleware and infrastructure
– Data-Centric Pub/Sub
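The scaling orders quoted on these slides can be made concrete with a little arithmetic. The sketch below makes two assumptions that are not in the slides: it counts one directed interface per ordered pair of applications for the point-to-point case, and it reads "~8 times more complex" as what happens to a cubic cost when n doubles. All function names are hypothetical.

```python
def growth_factor(exponent, scale=2):
    """How much an O(n**exponent) cost grows when n is multiplied by `scale`."""
    return scale ** exponent


def point_to_point_links(n):
    """Directed app-to-app interfaces when every app talks to every other app."""
    return n * (n - 1)


def data_space_links(n):
    """Connections when each app talks only to the shared data space."""
    return n


for n in (5, 10, 20):
    print(n, point_to_point_links(n), data_space_links(n))
# 5 20 5
# 10 90 10
# 20 380 20

# Doubling n: linear cost doubles, quadratic quadruples, cubic grows 8x.
print(growth_factor(1), growth_factor(2), growth_factor(3))  # 2 4 8
```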
Example: Data-Centric ‘Track’
• Middleware knows what a “track” is
• Each is an identified object; CRUD interface
• System maintains the track list
Publish / Subscribe
New: “ID:UA23” 45.6 78.9
Update: “ID:UA23” 56.7 89.0
New: “ID:DL87” 65.4 32.1
Dispose: “ID:UA23” (X)
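The track sequence above (new, update, new, dispose) can be replayed against a toy keyed data space. `TrackSpace` and its methods are hypothetical; the slide does not say what the two numbers per track mean, so the sketch simply calls them `a` and `b`.

```python
class TrackSpace:
    """Hypothetical data space that maintains a keyed list of tracks."""

    def __init__(self):
        self.tracks = {}  # track ID -> latest (a, b) values

    def write(self, track_id, a, b):
        """Create the track if it is new, otherwise update it in place."""
        event = "Update" if track_id in self.tracks else "New"
        self.tracks[track_id] = (a, b)
        return event

    def dispose(self, track_id):
        """Remove the track from the list."""
        del self.tracks[track_id]
        return "Dispose"


space = TrackSpace()
events = [
    space.write("ID:UA23", 45.6, 78.9),
    space.write("ID:UA23", 56.7, 89.0),
    space.write("ID:DL87", 65.4, 32.1),
    space.dispose("ID:UA23"),
]

print(events)        # ['New', 'Update', 'New', 'Dispose']
print(space.tracks)  # {'ID:DL87': (65.4, 32.1)}
```

Because each track is an identified object, the publisher only writes values; the data space works out whether a write is a create or an update and keeps the current track list for any reader.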
The Single Most Important Decision?
Will the apps manage truth and the infrastructure exchange messages?
Or
Will the infrastructure manage truth and the apps exchange state?
What to Consider?
Design Factors
Not Black & White
Data Centric
Message Centric
How Important is Controlled State?
• Unmanaged state leads quickly to inconsistency
• Data centric pros
– Multiple consumers access consistent state
– Clear data exchange rules
– Failover, tools, selective delivery natural; simpler applications
• Data centric cons
– Agree on a shared data model (!)
– Upfront data model, property design
– Less design flexibility in applications
How Challenging is Integration?
• With data-centric, all apps can examine, match, and deliver the right data
• Components learn schema and thus interoperate
• Enables generic tools, reusable applications, selective delivery, and structured designs
Distributing Work or Information?
• Message-centric best when distributing “things” and “processing” them
– Very mature message-centric techniques
– Transactional behavior
– Information is naturally centralized
– Load balancing that doles out single messages across many servers
• Data-centric systems excel at scalable information distribution
– Efficient fanout
– Selective control of bandwidth & delivery
– Tight latency or real-time delivery constraints
– Easier discovery, late-joiner updates
– Integrating many different functions
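The difference between distributing work and distributing information can be sketched as two small functions. Both are illustrative assumptions: the function names are hypothetical, and round-robin is just one simple load-balancing policy chosen for the example.

```python
from itertools import cycle


def distribute_work(messages, workers):
    """Message-centric load balancing: each message goes to exactly one worker."""
    assignment = {worker: [] for worker in workers}
    # Round-robin is an assumed policy; real brokers offer others.
    for message, worker in zip(messages, cycle(workers)):
        assignment[worker].append(message)
    return assignment


def fan_out(update, subscribers):
    """Data-centric distribution: every subscriber sees the same update."""
    return {subscriber: update for subscriber in subscribers}


jobs = ["job1", "job2", "job3", "job4", "job5"]
print(distribute_work(jobs, ["server_a", "server_b"]))
# {'server_a': ['job1', 'job3', 'job5'], 'server_b': ['job2', 'job4']}

print(fan_out(("B3", 27), ["alarm", "control", "logging"]))
# {'alarm': ('B3', 27), 'control': ('B3', 27), 'logging': ('B3', 27)}
```

In the first case a message is consumed once and then gone; in the second the same piece of state is made visible to everyone who cares about it.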
What’s the Project Lifecycle?
• Messaging is simpler for connecting point A to point B
– Fielded in a few months
– Limited to a small number of cooperating applications
– Developed by a cohesive team
– Short lifetime
• Data-centric systems build lasting architecture
– Upfront data model, interface design
– Less familiar concepts
– Can add applications, scale, and evolve
– Distributed teams, complex system integration, and long-term viability justify initial investment
How to Choose?
Clarify requirements & factors
• Develop requirements
– Map out data content and exchange
– Consider scenarios that expose coupling
– Analyze performance and scaling
– Plan for evolution
• Consider key factors
– Controlled state
– Integration
– Work or information
– Lifecycle
Choose the best fit
• Data centric and message centric are both proven technologies
• Both can solve many (any?) problems
• Choose the approach that reduces coupling, delivers performance, and supports your long-term architecture
Download Connext Free Trial NOW
www.rti.com/downloads