building reliable systems from unreliable components
DESCRIPTION
Reliability and availability in SOATRANSCRIPT
Building reliable systems from unreliable components
Arnon Rotem-Gal-Oz
Arnon Rotem-Gal-Oz
What’s in a 9
0.99 reliability
We have a nice little legacy business component
0.99
0.99
0.99 0.99
0.99
0.99 0.99
0.99
0.99
And we move it to SOA
Failsafe hardware
Status Technologies FT Server
Or try to detect failure , handle it and minimize its effect on the business service
© Rosendahl
BEFORE WE BEGIN
SOA
Service
describes
End Point Exposes
Messages Sends/Receives
Contracts
Binds to
Service Consumer
implements
Policy governed by
Sends/Receives
Adheres to
Component
Relation
Key
Understands
Serves
SOA is derived from other styles
Pipes and Filters
Client Server
Distributed Agents
Layered System
Stateless Comm.
SOA
SOA vs. REST
Pipes and Filters
Client Server
Uniform Interface
Virtual Machine
Distributed Agents
Layered System
Replicated Repository
Code On Demand
Stateless Comm.
Cacheable
REST SOA
3G Video Calls
MMS Dedicated
Client
Mobile Integration
Applications
Acquisition Interactions branding
System
Ad Management
Interactions Reference
Data Links
Resources
Monitoring
Targeted Advertizing
Campaign Mgmt.
Usage Datmart
Reporting
Billing
Data mining & Statistics
Reports
Link Managment
Publishing tools
integration
Interaction Designer
Web Front-end
Data Interfaces
3rd parties
Load balancer
Web Server
(IIS/Apache
App Server
App Server
App Server
App Server
App Server
App Server
App Server
Web Server
(IIS/Apache
MMS Gateway
3G Gateway
3G Gateway
Load balancer
Firewall
Firewall
BI & Reporting
DB Links
Datamart usage
DB References
Links
Registeration
Sync. Server
DMZ
Operational
Backend
Advertizing clients
Fire
wal
l Web Server
(IIS/Apache
NMS Paper Editor
Smart phones
DMZ
Camera Phones
Admin Console
CHALLENGE – SERVICE AVAILABILITY
What’s the effect of a failure - Server
E1 line = 30 concurrent video
calls
Call Flow Service
request
reaction
Edge Service Instance
Dispatcher
Distribute
End point
Service Business logic
Service Instance
What’s the effect of a failure - Channel
E1 line = 30 concurrent video
calls
Call Flow Channel
Call Flow Channel Call Flow
Channel Call Flow Channel Call Flow
Channel Call Flow Channel Call Flow
Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel
Call Flow Channel Call Flow
Channel
Service Instance with NLB
Service Instance
NLB Driver
Cluster Host
NIC Driver
TCP/IP
Windows Kernel
NIC
NLB Driver
Cluster Host
NIC Driver
TCP/IP
Windows Kernel
NIC
Virtual IP : 1.1.1.1
Real IP : 1.1.1.2 Real IP : 1.1.1.3
Service Instance Edge
Windows Host
NIC Driver
TCP/IP
Windows Kernel
NIC
Real IP : 1.1.1.4
Virtual Endpoint
Request/Reply
Service
EndPoint
Synchronous processing
1. Request 2.
3. Reply
Service Consumer
Things look Cool & Simple ™
Session negations
Image extraction
identification Translation
to links
Render resutls
3G Call
IVP (RV) 3G GW
(RV)
3G VAS (Cestel)
WS
Resource Manager
SIP Listner
RTP Image Extractor
Alg. Engine
Dispatcher
WebConnector
3G Builder (Cestel)
WebRenderer
Turn out Complicated & Ugly ™
Relation
Key
SOA Component Pattern Component
Concern/attribute
Edge pipeline
Perform Task
EndPoint
Service
Request
Reaction
EndPoint
pipeline
Perform Task
EndPoint
pipeline
Perform Task
EndPoint
Queue
Request 2
Request 1
Parallel Pipelines
Inversion of Communications
Consumer view
var sendMmsEvent = new SendMmsEvent()
{
FromNumber = simpleMessageDetails.DialedNumber,
Subject = mmsContents.Subject,
ToNumber = simpleMessageDetails.Sender,
ImageExtension = mmsContents.ImageExtension,
ImageAsByteArray = mmsContents.Image,
TextAsByteArray = mmsContents.Text
};
eventBroker.RaiseEvent(sendMmsEvent);
http://www.flickr.com/photos/crimson_wolf/2851737125/sizes/l/
Service view
public interface ImPostOffice : ImContract, IHandleSendCoupon,
IHandleSendSms,IHandleStatus,IHandleAdminStatus,IHandleWapLink,
IHandleSendMms
{
}
[ServiceContract]
public interface IHandleSendMms
{
[OperationContract]
int SendMms(SendMmsEvent eventOccured);
}
[ServiceContract]
[DataContract]
public class SendMmsEvent : ImEvent
{
/// <summary>
/// end user's number. should be in international format: +[country-code]number. Example: +491737692260
/// </summary>
[DataMember]
public string ToNumber { get; set; }
/// <summary>
/// service's number, usually a short-code. Example: 84343
/// </summary>
[DataMember]
public string FromNumber { get; set; }
/// <summary>
/// Text, as byte array. Use Encoding classes to do it.
/// </summary>
[DataMember]
public byte[] TextAsByteArray { get; set; }
/// <summary>
/// Image, as byte array. Can be: jpg, gif, png, bmp. (jpg rulez!!)
/// </summary>
[DataMember]
public byte[] ImageAsByteArray { get; set; }
/// <summary>
/// Remeber <c>ImageAsByteArray</c>? - This is where you manaually tell us what's the extension. Yes, we can inspect the signature, but why?
/// </summary>
[DataMember]
public string ImageExtension { get; set; }
/// <summary>
/// the mms message should have a subject. just put something there.
/// </summary>
[DataMember]
public string Subject { get; set; }
Edge translates external structures to internal ones
public int SendMms(SendMmsEvent eventOccured)
{
var eventContext = eventOccured.ToString();
if (log.IsDebugEnabled)
log.Debug("inside 'SendMms', event context = [" + eventContext + "]");
var fromNumber = eventOccured.FromNumber;
var sender = mmsSenderFactory.Get(fromNumber);
if (null == sender)
{
if (log.IsWarnEnabled)
log.Warn("cannot get mms sender derived from '" + (fromNumber ?? "null") + "'");
return 0;
}
IMmsSubmitResponse response;
try
{
var mmsMessageDetails = new MmsMessageDetails(eventOccured.ToNumber,
eventOccured.TextAsByteArray,
eventOccured.ImageAsByteArray,
eventOccured.ImageExtension,
eventOccured.Subject);
response = sender.Submit(mmsMessageDetails);
}
catch (Exception ex)
{
log.Error("cannot send mms message, context = [" + eventContext + "]", ex);
return 0;
}
if (log.IsInfoEnabled)
{
var responseMessage = (null == response) ? "null" : response.ToString();
log.Info("sent mms with event context = [" + eventContext + "], response = [" +
responseMessage + "]");
}
Sagas tie instances together for conversations
SIP
Image
Extractor
Identification
Call Recovery
Alternative : Orchestration
request
Workflow Engine
Workflow instance
Manage Process
route request
Host Workflows
Schedule
Orchestration platform
Service Service
reaction
Auxiliary tools
Coordinator
Protocol
Offline designer
monitor
Be Wary of Nano-Services
CHALLENGE - MANAGEMENT
Edge
Watchdog Edge
Report
EndPoint
Service
Request
Monitor
EndPoint
Watchdog Agent
Monitor
Heal
Log
Reports
Monitor
Monitor
Monitor
Blogjecting Watchdog
Blogjects concept is about collaborating objects
WatchDog
Service A Service B WDWatcher
SIP
Image
Extractor
Identification
Call Recovery
WatchDog
Status
Service Monitor Edge/Service
In Commands
Metrics collection
Policy governance
Security monitoring
Fault Monitoring Reporting &
Dashboarding
Control
Edge/Service
Status
Monitor Act
Collect
Notify
Service Monitor
3 2 1 root
/
Sessions/
Abcde/
Efgh/
Resources/ Dispatchers/ Xyz/
RESTful resource management
http://devrig:52141/RM/Sessions/abc/
• ATOMPUB
– Session details
• URI (ID)
• State (start/end/status etc.)
• Resources – Knows status
– URI for the Resource representation on the RM
– URI for the Resource itself
Keep the BIT
http://www.flickr.com/photos/37643027@N00/2050024263/sizes/o/
SIP
Image
Extractor
Identification X
Call Recovery
WatchDog
WatchDog
Liveliness Monitor
3G Call
CHALLENGE – MULTI-TENNANCY (LIES, I TELL YOU, ALL LIES)
Same event different subscribers
Call Flow
Player (interaction Renderer)
Call Flow Bridge to 3rd
Party
Play Movie Event
Routing [ServiceContract]
[Participate("3G")]
public interface ImPlayer : ImContract, IHandleCallStarted, IHandleCallEnded,
IHandlePlayMovie,IHandleCallAborted
{
}
[ServiceContract]
[Participate("3GPartner")]
public interface ImXsightsGateWay : ImContract,
IHandleCallAborted,
IHandlePlayMovie,
IHandleReadyForSearch,
IHandleSearchStarted,
IHandleJoinThirdParty
{
}
Initiator A
Initiator B
Participant B Capacity : 1
Raise a saga initiating event
Now what ?!
Participant A Capacity : 2
Reservation Pattern
Event Broker
Service Host
Resource Allocator
Business Logic
Control Edge
Service Host
Service Host
Business Logic
Event Broker
Service Host
Resource Allocator
Control Edge
Service Instance #1
Service Host
Business Logic
Event Broker
Service Host
Resource Allocator
Control Edge
Service Instance #2
Service Host
Business Logic
Event Broker
Service Host
Resource Allocator
Control Edge Reservation
Good old 2PC to the
rescue
Takeaways
• Things break • Decouple • Fail Fast • Monitor & Detect • Compensate
• Throw state unto others
Arnon Rotem-Gal-Oz
@arnonrgo http://arnon.me
http://www.cloudvalue.com