building reliable systems from unreliable components

52
Building reliable systems from unreliable components Arnon Rotem-Gal-Oz

Upload: arnon-rotem-gal-oz

Post on 10-May-2015

4.588 views

Category:

Technology


2 download

DESCRIPTION

Reliability and availability in SOA

TRANSCRIPT

Page 1: Building reliable systems from unreliable components

Building reliable systems from unreliable components

Arnon Rotem-Gal-Oz

Page 2: Building reliable systems from unreliable components

Arnon Rotem-Gal-Oz

Page 3: Building reliable systems from unreliable components

What’s in a 9

Page 4: Building reliable systems from unreliable components

0.99 reliability

We have a nice little legacy business component

Page 5: Building reliable systems from unreliable components

0.99

0.99

0.99 0.99

0.99

0.99 0.99

0.99

0.99

And we move it to SOA

Page 6: Building reliable systems from unreliable components

Failsafe hardware

Status Technologies FT Server

Page 7: Building reliable systems from unreliable components

Or try to detect failure , handle it and minimize its effect on the business service

© Rosendahl

Page 8: Building reliable systems from unreliable components

BEFORE WE BEGIN

Page 9: Building reliable systems from unreliable components

SOA

Service

describes

End Point Exposes

Messages Sends/Receives

Contracts

Binds to

Service Consumer

implements

Policy governed by

Sends/Receives

Adheres to

Component

Relation

Key

Understands

Serves

Page 10: Building reliable systems from unreliable components

SOA is derived from other styles

Pipes and Filters

Client Server

Distributed Agents

Layered System

Stateless Comm.

SOA

Page 11: Building reliable systems from unreliable components

SOA vs. REST

Pipes and Filters

Client Server

Uniform Interface

Virtual Machine

Distributed Agents

Layered System

Replicated Repository

Code On Demand

Stateless Comm.

Cacheable

REST SOA

Page 12: Building reliable systems from unreliable components
Page 13: Building reliable systems from unreliable components

3G Video Calls

MMS Dedicated

Client

Mobile Integration

Applications

Acquisition Interactions branding

System

Ad Management

Interactions Reference

Data Links

Resources

Monitoring

Targeted Advertizing

Campaign Mgmt.

Usage Datmart

Reporting

Billing

Data mining & Statistics

Reports

Link Managment

Publishing tools

integration

Interaction Designer

Web Front-end

Data Interfaces

3rd parties

Page 14: Building reliable systems from unreliable components

Load balancer

Web Server

(IIS/Apache

App Server

App Server

App Server

App Server

App Server

App Server

App Server

Web Server

(IIS/Apache

MMS Gateway

3G Gateway

3G Gateway

Load balancer

Firewall

Firewall

BI & Reporting

DB Links

Datamart usage

DB References

Links

Registeration

Sync. Server

DMZ

Operational

Backend

Advertizing clients

Fire

wal

l Web Server

(IIS/Apache

NMS Paper Editor

Smart phones

DMZ

Camera Phones

Admin Console

Page 15: Building reliable systems from unreliable components

CHALLENGE – SERVICE AVAILABILITY

Page 16: Building reliable systems from unreliable components

What’s the effect of a failure - Server

E1 line = 30 concurrent video

calls

Call Flow Service

Page 17: Building reliable systems from unreliable components

request

reaction

Edge Service Instance

Dispatcher

Distribute

End point

Service Business logic

Service Instance

Page 18: Building reliable systems from unreliable components

What’s the effect of a failure - Channel

E1 line = 30 concurrent video

calls

Call Flow Channel

Call Flow Channel Call Flow

Channel Call Flow Channel Call Flow

Channel Call Flow Channel Call Flow

Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel

Call Flow Channel Call Flow

Channel

Page 19: Building reliable systems from unreliable components

Service Instance with NLB

Service Instance

NLB Driver

Cluster Host

NIC Driver

TCP/IP

Windows Kernel

NIC

NLB Driver

Cluster Host

NIC Driver

TCP/IP

Windows Kernel

NIC

Virtual IP : 1.1.1.1

Real IP : 1.1.1.2 Real IP : 1.1.1.3

Service Instance Edge

Windows Host

NIC Driver

TCP/IP

Windows Kernel

NIC

Real IP : 1.1.1.4

Page 20: Building reliable systems from unreliable components

Virtual Endpoint

Page 21: Building reliable systems from unreliable components

Request/Reply

Service

EndPoint

Synchronous processing

1. Request 2.

3. Reply

Service Consumer

Page 22: Building reliable systems from unreliable components

Things look Cool & Simple ™

Session negations

Image extraction

identification Translation

to links

Render resutls

3G Call

Page 23: Building reliable systems from unreliable components

IVP (RV) 3G GW

(RV)

3G VAS (Cestel)

WS

Resource Manager

SIP Listner

RTP Image Extractor

Alg. Engine

Dispatcher

WebConnector

3G Builder (Cestel)

WebRenderer

Turn out Complicated & Ugly ™

Page 24: Building reliable systems from unreliable components

Relation

Key

SOA Component Pattern Component

Concern/attribute

Edge pipeline

Perform Task

EndPoint

Service

Request

Reaction

EndPoint

pipeline

Perform Task

EndPoint

pipeline

Perform Task

EndPoint

Queue

Request 2

Request 1

Parallel Pipelines

Page 25: Building reliable systems from unreliable components

Inversion of Communications

Page 26: Building reliable systems from unreliable components

Consumer view

var sendMmsEvent = new SendMmsEvent()

{

FromNumber = simpleMessageDetails.DialedNumber,

Subject = mmsContents.Subject,

ToNumber = simpleMessageDetails.Sender,

ImageExtension = mmsContents.ImageExtension,

ImageAsByteArray = mmsContents.Image,

TextAsByteArray = mmsContents.Text

};

eventBroker.RaiseEvent(sendMmsEvent);

http://www.flickr.com/photos/crimson_wolf/2851737125/sizes/l/

Page 27: Building reliable systems from unreliable components

Service view

public interface ImPostOffice : ImContract, IHandleSendCoupon,

IHandleSendSms,IHandleStatus,IHandleAdminStatus,IHandleWapLink,

IHandleSendMms

{

}

[ServiceContract]

public interface IHandleSendMms

{

[OperationContract]

int SendMms(SendMmsEvent eventOccured);

}

[ServiceContract]

[DataContract]

public class SendMmsEvent : ImEvent

{

/// <summary>

/// end user's number. should be in international format: +[country-code]number. Example: +491737692260

/// </summary>

[DataMember]

public string ToNumber { get; set; }

/// <summary>

/// service's number, usually a short-code. Example: 84343

/// </summary>

[DataMember]

public string FromNumber { get; set; }

/// <summary>

/// Text, as byte array. Use Encoding classes to do it.

/// </summary>

[DataMember]

public byte[] TextAsByteArray { get; set; }

/// <summary>

/// Image, as byte array. Can be: jpg, gif, png, bmp. (jpg rulez!!)

/// </summary>

[DataMember]

public byte[] ImageAsByteArray { get; set; }

/// <summary>

/// Remeber <c>ImageAsByteArray</c>? - This is where you manaually tell us what's the extension. Yes, we can inspect the signature, but why?

/// </summary>

[DataMember]

public string ImageExtension { get; set; }

/// <summary>

/// the mms message should have a subject. just put something there.

/// </summary>

[DataMember]

public string Subject { get; set; }

Page 28: Building reliable systems from unreliable components

Edge translates external structures to internal ones

public int SendMms(SendMmsEvent eventOccured)

{

var eventContext = eventOccured.ToString();

if (log.IsDebugEnabled)

log.Debug("inside 'SendMms', event context = [" + eventContext + "]");

var fromNumber = eventOccured.FromNumber;

var sender = mmsSenderFactory.Get(fromNumber);

if (null == sender)

{

if (log.IsWarnEnabled)

log.Warn("cannot get mms sender derived from '" + (fromNumber ?? "null") + "'");

return 0;

}

IMmsSubmitResponse response;

try

{

var mmsMessageDetails = new MmsMessageDetails(eventOccured.ToNumber,

eventOccured.TextAsByteArray,

eventOccured.ImageAsByteArray,

eventOccured.ImageExtension,

eventOccured.Subject);

response = sender.Submit(mmsMessageDetails);

}

catch (Exception ex)

{

log.Error("cannot send mms message, context = [" + eventContext + "]", ex);

return 0;

}

if (log.IsInfoEnabled)

{

var responseMessage = (null == response) ? "null" : response.ToString();

log.Info("sent mms with event context = [" + eventContext + "], response = [" +

responseMessage + "]");

}

Page 29: Building reliable systems from unreliable components

Sagas tie instances together for conversations

Page 30: Building reliable systems from unreliable components

SIP

Image

Extractor

Identification

Call Recovery

Page 31: Building reliable systems from unreliable components

Alternative : Orchestration

request

Workflow Engine

Workflow instance

Manage Process

route request

Host Workflows

Schedule

Orchestration platform

Service Service

reaction

Auxiliary tools

Coordinator

Protocol

Offline designer

monitor

Page 32: Building reliable systems from unreliable components

Be Wary of Nano-Services

Page 33: Building reliable systems from unreliable components

CHALLENGE - MANAGEMENT

Page 34: Building reliable systems from unreliable components

Edge

Watchdog Edge

Report

EndPoint

Service

Request

Monitor

EndPoint

Watchdog Agent

Monitor

Heal

Log

Reports

Monitor

Monitor

Monitor

Blogjecting Watchdog

Page 35: Building reliable systems from unreliable components

Blogjects concept is about collaborating objects

Page 36: Building reliable systems from unreliable components

WatchDog

Service A Service B WDWatcher

Page 37: Building reliable systems from unreliable components

SIP

Image

Extractor

Identification

Call Recovery

WatchDog

Page 38: Building reliable systems from unreliable components
Page 39: Building reliable systems from unreliable components

Status

Service Monitor Edge/Service

In Commands

Metrics collection

Policy governance

Security monitoring

Fault Monitoring Reporting &

Dashboarding

Control

Edge/Service

Status

Monitor Act

Collect

Notify

Service Monitor

Page 40: Building reliable systems from unreliable components

3 2 1 root

/

Sessions/

Abcde/

Efgh/

Resources/ Dispatchers/ Xyz/

RESTful resource management

Page 41: Building reliable systems from unreliable components

http://devrig:52141/RM/Sessions/abc/

• ATOMPUB

– Session details

• URI (ID)

• State (start/end/status etc.)

• Resources – Knows status

– URI for the Resource representation on the RM

– URI for the Resource itself

Page 43: Building reliable systems from unreliable components

SIP

Image

Extractor

Identification X

Call Recovery

WatchDog

WatchDog

Liveliness Monitor

3G Call

Page 44: Building reliable systems from unreliable components

CHALLENGE – MULTI-TENNANCY (LIES, I TELL YOU, ALL LIES)

Page 45: Building reliable systems from unreliable components

Same event different subscribers

Call Flow

Player (interaction Renderer)

Call Flow Bridge to 3rd

Party

Play Movie Event

Page 46: Building reliable systems from unreliable components

Routing [ServiceContract]

[Participate("3G")]

public interface ImPlayer : ImContract, IHandleCallStarted, IHandleCallEnded,

IHandlePlayMovie,IHandleCallAborted

{

}

[ServiceContract]

[Participate("3GPartner")]

public interface ImXsightsGateWay : ImContract,

IHandleCallAborted,

IHandlePlayMovie,

IHandleReadyForSearch,

IHandleSearchStarted,

IHandleJoinThirdParty

{

}

Page 47: Building reliable systems from unreliable components

Initiator A

Initiator B

Participant B Capacity : 1

Raise a saga initiating event

Now what ?!

Participant A Capacity : 2

Page 48: Building reliable systems from unreliable components

Reservation Pattern

Page 49: Building reliable systems from unreliable components

Event Broker

Service Host

Resource Allocator

Business Logic

Control Edge

Service Host

Service Host

Business Logic

Event Broker

Service Host

Resource Allocator

Control Edge

Service Instance #1

Service Host

Business Logic

Event Broker

Service Host

Resource Allocator

Control Edge

Service Instance #2

Service Host

Business Logic

Event Broker

Service Host

Resource Allocator

Control Edge Reservation

Page 50: Building reliable systems from unreliable components

Good old 2PC to the

rescue

Page 51: Building reliable systems from unreliable components

Takeaways

• Things break • Decouple • Fail Fast • Monitor & Detect • Compensate

• Throw state unto others

Page 52: Building reliable systems from unreliable components

Arnon Rotem-Gal-Oz

@arnonrgo http://arnon.me

http://www.cloudvalue.com