A Logical Look at Recent Messaging Systems and Standards, and a Look Beyond*

Steven O. Kimbrough
565 JMHH
kimbrough (a) wharton.upenn.edu
215-898-5133

29 November 2006, Amsterdam, Free University

Acknowledgements: Andrew J.I. Jones, David Eyers, and previously, Yao-Hua Tan.

$Id: freeunl-29112006-foils.tex,v 1.5 2006/11/28 21:37:15 sok Exp $

*File: freeunl-29112006-foils.tex/pdf.


Abstract

Messaging systems have come of age. Conceptual and technical advances have led to the maturing of message oriented middleware (MOM) systems and to a leap into the next generation of EDI (electronic data interchange), based upon improved standards and conversion to XML. This talk briefly reviews the state of play and considers the prospective convergence of MOM and EDI. Recent messaging standards are examined through the example of ISO 20022 efforts for financial EDI. These efforts have been impressively productive in detailing requirements and specifying usable XML representations. Abstracting the results with logic, however, reveals both limitations in the present approach and opportunities for improvements. The talk emphasizes one such aspect: the support a logical point of view gives for the problem of the dynamic aspects of message protocols, the fact that they will continue to require changes. In this regard, the talk connects the design of message protocols with recent thinking on systems analysis and design, which emphasizes IID, incremental and iterative development.

Current (Best) Practice

Two principles:

1. Proposals for change are (usually) more secure and rightly credible to the extent that they show us how to improve on current practice, especially current best practice.

2. New concepts and ideas are very often realized by abstracting and generalizing current practice.

It is in this way and with this spirit that the work described in this talk is undertaken. Here, focus on recent messaging standards in EDI, specifically financial services messages, led by SWIFT. These represent the best of current practice and are important achievements on their own.

Two Developments

• MOM (Message Oriented Middleware) comes of age (beginning in the 1980s)

Vendors/products include: Microsoft BizTalk, IBM WebSphere MQ, Sun, JMS (Java Message Service), TIBCO, WebMethods, SeeBeyond, Vitria, etc. (NB: JMS is universal except for Microsoft.)

• A new generation of messaging standards, particularly for EDI (broadly construed)

Focus here: ISO, in particular ISO 20022 for financial services

MOM: Message Oriented Middleware

Useful, but not definitive:

http://en.wikipedia.org/wiki/Message_oriented_middleware

Thorough:

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions (2004) by Gregor Hohpe, Bobby Woolf et al.

Relevant and useful:

The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems (2002) by David Luckham

MOM (from the Wikipedia article)

“Message-oriented middleware comprises a category of inter-application communication software that generally relies on asynchronous message-passing as opposed to a request/response metaphor.

“Most message-oriented middleware (MOM) depends on a message queue system, although some implementations rely on broadcast or on multicast messaging systems. . . .

“Middleware arrived on the computing landscape comparatively late. It emerged in the 1980s as a solution to the problem of how to link new applications to older legacy systems. It also facilitated distributed processing: the connection of multiple applications together to create a larger application, usually over a network. . . .

“The primary advantage of a message-based communications protocol lies in its ability to store, route or transform messages in the process of delivery.”

MOM (from Enterprise Integration Patterns)

“Messaging is a technology that enables high-speed, asynchronous, program-to-program communication with reliable delivery. Programs communicate by sending packets of data called messages to each other. Channels, also known as queues, are logical pathways that connect the programs and convey messages. . . .

“The message itself is simply some sort of data structure, such as a string, a byte array, a record, or an object. It can be interpreted simply as data, as the description of a command to be invoked on the receiver, or as the description of an event that occurred in the sender.” [page xxxi]

“A messaging system is needed to move messages from one computer to another because computers and the networks that connect them are inherently unreliable.” [xxxii]

Message semantics (and other little details) are (properly) outside the scope of MOMs.

MOM (from Enterprise Integration Patterns)

• Emphasize: Messaging systems are for asynchronous communication

• Aka: “send and forget”

NB. (not quite, but. . . user can track but does not wait)

• NB. MOMs do not track at the application level for the users

E.g., dialog management is outside the scope of MOMs
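
A minimal sketch of the "send and forget" idea, using an in-process queue as a stand-in for a MOM channel. This is only an illustration under stated assumptions: `run_demo`, the message strings, and the `None` shutdown sentinel are hypothetical, and a real MOM product would add reliable delivery across machines.

```python
import queue
import threading


def run_demo():
    """Send two messages over a channel without waiting for any reply."""
    channel = queue.Queue()   # the channel (queue) connecting the two programs
    received = []

    def receiver():
        # The receiving program takes messages off the channel at its own pace.
        while True:
            message = channel.get()   # blocks until a message is available
            if message is None:       # hypothetical end-of-demo sentinel
                break
            received.append(message)

    worker = threading.Thread(target=receiver)
    worker.start()

    # The sender puts messages on the channel and moves on ("send and forget").
    for text in ("payment-1", "payment-2"):
        channel.put(text)
    channel.put(None)

    worker.join()   # joined only so the demo can inspect what arrived
    return received
```

Note that nothing here tracks a dialog between sender and receiver; as the slide says, that is left to the application.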


MOM (from Enterprise Integration Patterns)

On design tools and methods:

“To our knowledge, there is no widely used, comprehensive notation that is geared toward the description of all aspects of an integration solution. The Unified Modeling Language (UML) does a fine job of describing object-oriented systems with class and interaction diagrams, but it does not contain semantics to describe messaging solutions. The UML Profile for EAI . . . enriches the semantics of collaboration diagrams to describe message flows between components. This notation is very useful as a precise visual specification that can serve as the basis for code generation as part of a model-driven architecture (MDA). We decided not to adopt this notation . . . ” [page xliv]

Remember: This pertains to MOMs and messaging systems, NOT (even!) to the semantics or constitution of the message protocols/contents themselves. There is much that needs doing.

Our Interest

• Is focused on the design of the messages themselves and on their processing (creation, interpretation, dialogs).

• MOMs add value in practice: they enable and facilitate (potentially at least) COAs (communication oriented applications ;) ).

• EDI standardization is an instance par excellence.

• We’ll look in particular at ISO 20022, UNIFI, “UNIversal Financial Industry message scheme”

http://www.iso20022.org/ and especially

http://www.iso20022.org/index.cfm?item_id=42790

But first. . .

• “Introduction to ISO 20022 - UNIversal Financial Industry message scheme” at www.iso20022.org,

also Scripted-wwwiso20022org-20061106.ppt

Comments

• Focus, without loss of generality (wlog), on “Payments Standards – Clearing and Settlement”

http://www.iso20022.org/index.cfm?item_id=60053

• Key elements: diagramming of business contexts and processes; in the message protocol, (i) fixing reference, (ii) message structure (header + body), (iii) constraints on header-body combinations.

Critique

This in many ways represents solid progress and the codification of much careful thought. So what’s not to like?

Answer:

1. Change is cumbersome: “the dynamic point”

2. Limited scope of constraints: “the constraint generality point”

3. Need for a more robust and extensive account of message meaning: “the semantics and inference point”

The dynamic point, part 1: fixing reference

See: “On Representing Special Languages with FLBC: Message Markers and Reference Fixing in SeaSpeak,” by Steven O. Kimbrough, Yinghui (Catherine) Yang, FMEC book, pp. 297f.

The 567-page specification document, “Payments Standards – Clearing and Settlement,” devotes about 55 pages to specifying how to make reference to various sorts of things.

. . . And that’s not all.

Ways of referring

1. With names, whose meaning is conventionally assigned; proper names.

2. With data structures, whose components are combined. E.g., surname, given name; dates and time stamps

3. By ostension, pointing. See http://en.wikipedia.org/wiki/Ostensive_definition

4. With demonstratives: this, that, these, those

5. By description; Russell and definite description

ISO 20022 uses only (1) and (2). NB. Importance of events and naming them.

ISODate, p. 447

An unexceptional case:

2.1 ISODate

Definition: Date within a particular calendar year represented by YYYY-MM-DD (ISO 8601).

Example: 2002-02-25

Note:

• Use of an exogenous convention

• Implicit agreement [apparently] in ISO 20022 to use that convention, even if its contents change
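
To make the ISODate definition concrete, here is a small sketch of a check for the YYYY-MM-DD form. It is an illustration only: the function name `parse_iso_date` is hypothetical, and the sketch covers just the calendar-date form quoted above, not the rest of ISO 8601.

```python
import re
from datetime import date

# YYYY-MM-DD with exactly four, two, and two ASCII digits.
ISO_DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_iso_date(text):
    """Return a datetime.date for a YYYY-MM-DD string, or None if it is invalid."""
    match = ISO_DATE_PATTERN.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)   # rejects impossible dates such as 30 February
    except ValueError:
        return None
```

The point for the talk: the check leans entirely on a convention defined outside ISO 20022.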

BICs, page 463: 2.1.3 BIC <BIC>

Presence: [1..1]

This message item is part of choice 2.1.2 FinancialInstitutionIdentification.

Definition: Bank Identifier Code. Code allocated to financial institutions by the Registration Authority, under an international identification scheme, as described in the latest version of the standard ISO 9362 Banking (Banking telecommunication messages, Bank Identifier Codes).

Data Type: BICIdentifier

Format: [A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]([A-Z0-9]{3,3}){0,1}

Rule(s): BIC

Valid BICs are registered with the ISO 9362 Registration Authority, and consist of eight (8) or eleven (11) contiguous characters comprising the first three or all four of the following components: BANK CODE, COUNTRY CODE, LOCATION CODE, BRANCH CODE. The bank code, country code and location code are mandatory, while the branch code is optional.
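
The Format line above is already a regular expression, so it can be tried directly. A minimal sketch follows; the helper names `has_bic_format` and `split_bic` are hypothetical, the 4/2/2/3 split of the components is an assumption read off the eight-or-eleven-character rule, and the sketch checks only the format, not registration with the ISO 9362 Registration Authority.

```python
import re

# The pattern is copied verbatim from the Format line of the specification.
BIC_FORMAT = re.compile(r"[A-Z]{6,6}[A-Z2-9][A-NP-Z0-9]([A-Z0-9]{3,3}){0,1}")


def has_bic_format(text):
    """True if the whole string matches the BICIdentifier format."""
    return BIC_FORMAT.fullmatch(text) is not None


def split_bic(text):
    """Split a well-formed BIC into its components, or return None."""
    if not has_bic_format(text):
        return None
    return {
        "bank": text[0:4],             # assumed: first four characters
        "country": text[4:6],          # assumed: next two characters
        "location": text[6:8],         # assumed: next two characters
        "branch": text[8:11] or None,  # optional three-character branch code
    }
```

A string can pass this test and still name no registered institution, which is exactly the reference-fixing issue under discussion.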

Comments on BIC

Note:

• Use of an exogenous convention

• Implicit agreement [apparently] in ISO 20022 to use that convention, even if its contents change

So, this is also good. But, what if there is a need to communicate with or about a financial institution that doesn’t have, or doesn’t yet have, or can never have a BIC?

FinancialInstitutionIdentification, page 463

Definition: Unique and unambiguous identifier of a financial institution, as assigned under an internationally recognised or proprietary identification scheme.

Type: This message item is composed of one of the following FinancialInstitutionIdentification5Choice element(s):

Ref      Or    Message Item                        <XML Tag>       Mult.    Represent./Type
2.1.3    {Or   BIC                                 <BIC>           [1..1]   Identifier
2.1.4    Or    ClearingSystemMemberIdentification  <ClrSysMmbId>   [1..1]
2.1.7    Or    NameAndAddress                      <NmAndAdr>      [1..1]
2.1.18   Or    ProprietaryIdentification           <PrtryId>       [1..1]
2.1.21   Or}   CombinedIdentification              <CmbndId>       [1..1]
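
The choice above is a closed disjunction: exactly one of the five alternatives must appear. A rough sketch of that check follows. The wrapper element name `Identification` and the function name are hypothetical, and only the five XML tags come from the table.

```python
import xml.etree.ElementTree as ET

# XML tags of the five alternatives listed in the table above.
CHOICE_TAGS = ("BIC", "ClrSysMmbId", "NmAndAdr", "PrtryId", "CmbndId")


def chosen_identification(xml_text):
    """Return the tag of the single alternative present, else raise ValueError."""
    root = ET.fromstring(xml_text)
    present = [child.tag for child in root if child.tag in CHOICE_TAGS]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {CHOICE_TAGS}, found {present}")
    return present[0]
```

Because `CHOICE_TAGS` is fixed in the code, just as the disjuncts are fixed in the standard, any new kind of identifier means changing the list itself.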

ProprietaryIdentification, page 466

2.1.18 ProprietaryIdentification <PrtryId>

Presence: [1..1]

This message item is part of choice 2.1.2 FinancialInstitutionIdentification.

Definition: Unique and unambiguous identifier, as assigned to a financial institution using a proprietary identification scheme.

Type: This message item is composed of the following GenericIdentification3 element(s):

Ref      Or    Message Item     <XML Tag>   Mult.    Represent./Type
2.1.19         Identification   <Id>        [1..1]   Text
2.1.20         Issuer           <Issr>      [0..1]   Text

Comments

• GenericIdentification3???!!! (Such numbering appears throughout the standards.)

• See also: Page 464, 2.1.6 Proprietary.

• Back to FinancialInstitutionIdentification. Notice it’s a disjunction. Why insist on identifying all of the disjuncts? Why insist on either (i) going back to the standards process or (ii) using a proprietary code (and all the mess that invites)?

But, what’s the alternative?

Abstract and extend the concept of a protocol

• What’s needed is a standard for extending the standard. Concept:

protocol (or convention) = kernel protocol + rules of extension

• How might this work? Lots of ways. Whenever two communicants want to add a new ID, there is identified a standard way of publishing it, say on the Web, and accessing it and its definition, e.g., with a URI. (Hello, XML!)

What would it take to make this work? (discuss)
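
One way to picture "kernel protocol + rules of extension" is sketched below. Everything in it is an assumption made for illustration: the class name `Protocol`, the two extension rules (an extension must be named by an http(s) URI, and a published definition may not be silently redefined), and the example.org URI. The talk leaves the actual rules open for discussion.

```python
# Kernel: the identification kinds fixed by the standard (tags from the choice table).
KERNEL_IDENTIFIER_KINDS = {"BIC", "ClrSysMmbId", "NmAndAdr", "PrtryId", "CmbndId"}


class Protocol:
    """A kernel plus extensions admitted under explicit rules (hypothetical)."""

    def __init__(self, kernel):
        self.kernel = frozenset(kernel)
        self.extensions = {}   # URI -> published definition

    def extend(self, uri, definition):
        # Assumed rule 1: an extension is named by a URI where it is published.
        if not uri.startswith(("http://", "https://")):
            raise ValueError("extension must be identified by an http(s) URI")
        # Assumed rule 2: once published, a definition is not silently changed.
        if uri in self.extensions and self.extensions[uri] != definition:
            raise ValueError("extension already published with another definition")
        self.extensions[uri] = definition

    def accepts(self, kind):
        """True for kernel kinds and for any extension already published."""
        return kind in self.kernel or kind in self.extensions
```

With something like this, two communicants could add a new kind of ID by calling `extend` rather than by going back to the standards process.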

Comments (cont.)

• There are lots of examples like this in the naming portion of the ISO 20022 standards (and even in the current document). Implementing the abstraction (kernel + rules of extension) for naming would have extensive effects.

• Now let’s look elsewhere in the standards, at the message formation rules.

Without loss of generality, I’ll focus on just one message type: FIToFICustomerCreditTransferV01.

The dynamic point, part 2: changing structure

FIToFICustomerCreditTransferV01, pp. 42f.:

“The FinancialInstitutionToFinancialInstitutionCustomerCreditTransfer message is sent by the debtor agent to the creditor agent, directly or through other agents and/or a payment clearing and settlement system. It is used to move funds from a debtor account to a creditor.”

Structure of a FIToFICustomerCreditTransferV01 message

Outline

The FIToFICustomerCreditTransfer message is composed of 2 building blocks:

A. Group Header

This building block is mandatory and present once. It contains elements such as MessageIdentification, CreationDateAndTime.

B. Credit Transfer Transaction Information

This building block is mandatory and [may be] repetitive. It contains elements related to the debit and credit side of the transaction such as Creditor, CreditorAgent, Debtor, DebtorAgent.

Sizes of the 2 building blocks, as specified

• A. Group Header. 31 “message items”. Many optional. Some complex (containing multiple message items).

• B. Credit Transfer Transaction Information. 95 “message items”. Many optional. Some complex (containing multiple message items).

Known problems, based on past experience: (i) Even this many items is unlikely to prove complete. (ii) To avoid the delay and burden of going back to the standards bodies, users will typically use a message item (say, one containing free text) for other than its intended purpose. XML doesn’t address this issue, and can’t.

There are naming issues as above, e.g. ClearingSystem, pp. 50f.

1.11 ClearingSystem <ClrSys>

Presence: [0..1]

Definition: Specification of a pre-agreed offering between clearing agents or the channel through which the payment instruction is processed.

Type: This message item is composed of one of the following ClearingSystemIdentification1Choice element(s):

Ref    Or    Message Item                   <XML Tag>    Mult.    Represent./Type
1.12   {Or   ClearingSystemIdentification   <ClrSysId>   [1..1]   Code
1.13   Or}   Proprietary                    <Prtry>      [1..1]   Text

OK, ClearingSystemIdentification? Pp. 51f.

1.12 ClearingSystemIdentification <ClrSysId>

Presence: [1..1]

This message item is part of choice 1.11 ClearingSystem.

Definition: Infrastructure through which the payment instruction is processed.

Data Type: Code

One of the following CashClearingSystem3Code values must be used:

[And there follows a 3 page table of country-related codes. Absences include Iraq, Lebanon, Israel. . . And if there are additions?]

Examples: InstructionPriority

InstructionPriority, p. 56. Why just 2? See 1.22–1.24 for repetition of the mistake.

1.21 InstructionPriority <InstrPrty>

Presence: [0..1]

Definition: Indicator of the urgency or order of importance that the instructing party would like the instructed party to apply to the processing of the instruction.

Data Type: Code

When this message item is present, one of the following Priority2Code values must be used:

Code   Name     Definition
HIGH   High     Priority level is high.
NORM   Normal   Priority level is normal.

What can be done?

• Much as before with fixing reference: abstract and define a broader concept of a protocol.

Abstract and extend the concept of a protocol

• What’s needed is a standard for extending the standard. Concept:

protocol (or convention) = kernel protocol + rules of extension

• How might this work? Lots of ways. Whenever two communicants want to add a new message item [name], there is identified a standard way of publishing it, say on the Web, and accessing it and its definition, e.g., with a URI. (Hello, XML!)

What would it take to make this work? (discuss)

Smallish practical point: No longer can we rely on strict positioning of message items. (Why?) This is hardly an insurmountable problem (although it would make heavier processing demands). One approach is to use “manifest types” or something like that: a message item outside the kernel could be presented with its defining URI preceding it. (Hello, XML and namespaces!)
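
The "manifest types" idea can be sketched with XML namespaces: each item carries the URI that defines it, so a receiver can sort kernel items from extension items wherever they sit in the message. This is an illustration under assumptions: both namespace URIs, the `Message` wrapper, and the `Urgency` item are hypothetical. Only the `InstrPrty` tag comes from the specification quoted earlier.

```python
import xml.etree.ElementTree as ET

KERNEL_NS = "urn:example:kernel"   # hypothetical namespace for kernel items

# The extension item comes first: position no longer carries meaning.
MESSAGE = """
<k:Message xmlns:k="urn:example:kernel"
           xmlns:x="http://example.org/ext/priority">
  <x:Urgency>VERY_HIGH</x:Urgency>
  <k:InstrPrty>HIGH</k:InstrPrty>
</k:Message>
"""


def split_items(xml_text):
    """Separate kernel items from extension items by namespace URI."""
    root = ET.fromstring(xml_text.strip())
    kernel, extensions = {}, {}
    for child in root:
        # ElementTree writes qualified tags as "{namespace-uri}local-name".
        if child.tag.startswith("{"):
            uri, _, local = child.tag[1:].partition("}")
        else:
            uri, local = "", child.tag
        if uri == KERNEL_NS:
            kernel[local] = child.text
        else:
            extensions[(uri, local)] = child.text
    return kernel, extensions
```

The heavier processing demand mentioned above shows up here as the per-item namespace lookup, which a strictly positional format would not need.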

The constraint generality point (our 2nd of 3)

The FIToFICustomerCreditTransferV01 message definition comes with “rules” that define syntactic constraints (pp. 42f.). Examples:

InstructedAgentRule

If GroupHeader/InstructedAgent is present, then CreditTransferTransactionInformation/InstructedAgent is not allowed.

InstructingAgentRule

If GroupHeader/InstructingAgent is present, then CreditTransferTransactionInformation/InstructingAgent is not allowed.

(There are dozens of these.)
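
Both example rules have the same shape, so they can be written once and instantiated twice. The sketch below assumes a hypothetical in-memory form of the message: a dict with a `GroupHeader` dict and a `CreditTransferTransactionInformation` list of dicts. The function names are likewise illustrative.

```python
def header_excludes_transaction_item(item):
    """Build a rule: if GroupHeader/<item> is present, no transaction may carry <item>."""

    def rule(message):
        in_header = item in message.get("GroupHeader", {})
        in_transactions = any(
            item in transaction
            for transaction in message.get("CreditTransferTransactionInformation", [])
        )
        return not (in_header and in_transactions)   # True means the rule is satisfied

    return rule


instructed_agent_rule = header_excludes_transaction_item("InstructedAgent")
instructing_agent_rule = header_excludes_transaction_item("InstructingAgent")
```

Seen this way, the two rules are instances of one logical schema, which is the kind of abstraction the talk argues for.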

Briefly. . . (this is a huge topic)

• All of the constraints are hard; exceptions are all “not allowed”. Why be so limited? Why not soft constraints, warnings that can accumulate?

• All of the constraints are syntactically oriented. Why not flag business rules, diagnostics, etc.?

• All of the constraints are within-message only. Why be so limited?

• In short, why not provide for a general purpose constraint-expressing and constraint-processing facility?
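
A very small sketch of what such a facility might look like, with hard and soft constraints evaluated together. It is illustrative only: the `Constraint` class, the severity labels, the `Amount` field, and the review threshold are all assumptions, and the large-amount business rule is hypothetical rather than taken from ISO 20022.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    name: str
    severity: str                   # "error" (hard) or "warning" (soft)
    holds: Callable[[dict], bool]   # True when the message satisfies the constraint


def group_header_present(message):
    # Hard: the specification says the Group Header block is mandatory.
    return "GroupHeader" in message


def amounts_below_review_threshold(message):
    # Soft and hypothetical: a business rule that flags large amounts.
    transactions = message.get("CreditTransferTransactionInformation", [])
    return all(tx.get("Amount", 0) <= 1_000_000 for tx in transactions)


CONSTRAINTS = [
    Constraint("GroupHeaderPresent", "error", group_header_present),
    Constraint("LargeAmountReview", "warning", amounts_below_review_threshold),
]


def evaluate(message, constraints):
    """Return (accepted, findings); only violated hard constraints reject."""
    findings = [(c.severity, c.name) for c in constraints if not c.holds(message)]
    accepted = all(severity != "error" for severity, _ in findings)
    return accepted, findings
```

Warnings accumulate in `findings` without blocking the message, and constraints that span several messages could in principle be added by passing a dialog history instead of a single dict.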

Finally and briefly, the semantics and inference point (3rd of 3)

• The protocol provides explicit semantic (meaning, interpretation) information primarily through its efforts at reference fixing.

• The basic message structure is:

$H_1, H_2, \ldots, H_m\,[B_{1,1}, B_{1,2}, \ldots, B_{1,B_1};\ \ldots;\ B_{m,1}, B_{m,2}, \ldots, B_{m,B_m}]$

• Implicitly, this is our old friend F(C) from speech act theory.

Points arising: (i) FLBC should be able to represent these messages easily; (ii) having in hand a logical representation (as in FLBC), articulating a powerful and general way of expressing constraints is a natural next step.