Tools for Automated Verification of Web Services
Tevfik Bultan
Department of Computer Science
University of California, Santa Barbara
[email protected]
http://www.cs.ucsb.edu/~bultan
Joint work with
Xiang Fu, Georgia Southwestern State University
Jianwen Su, University of California, Santa Barbara
Modeling Interactions of Web Software
Analyzing Conversations of Web Services
Going to Lunch at UCSB
• Before Xiang graduated from UCSB, Xiang, Jianwen and I were using the following protocol for going to lunch:
– Sometime around noon one of us would call another one by phone and tell him where and when we would meet for lunch.
– The receiver of this first call would call the remaining peer and pass the information.
• Let’s call this protocol the First Caller Decides (FCD) protocol.
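Implementation of the FCD Protocol
[Figure: one finite state machine per peer, with the transition labels listed below.]
Tevfik: !tj1, ?xt1, !tx1, ?jt1, !tx2, !tj2, ?jt2, ?xt2
Xiang: !xj1, ?tx1, !xt1, ?jx1, !xt2, !xj2, ?jx2, ?tx2
Jianwen: !jt1, ?xj1, !jx1, ?tj1, !jx2, !jt2, ?tj2, ?xj2
Message labels: ! send, ? receive. The label tx1 denotes the 1st message from Tevfik to Xiang.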
FCD Protocol does not Work with Voicemail
• When the university installed a voicemail system the FCD protocol started causing problems
– We were showing up at different restaurants at different times!
• Example scenario: tx1, jx1, xj2
The messages jx1 and xj2 are not consumed
– Note that this scenario is not possible without voicemail!
A Different Lunch Protocol
• Jianwen suggested that we change our lunch protocol as follows:
– As the most senior researcher among us Jianwen would make the first call to either Xiang or Tevfik and tell when and where we would meet for lunch.
– Then, the receiver of this call would pass the information to the other peer.
– Let’s call this protocol the Jianwen Decides (JD) protocol
Implementation of the JD Protocol
[Figure: one finite state machine per peer, with the transition labels listed below.]
Tevfik: ?xt, ?jt, !tx
Xiang: ?tx, ?jx, !xt
Jianwen: !jt, !jx
• JD protocol works fine with voicemail!
Conversation Protocols
• The FCD and JD protocols specify a set of conversations
• The implementations I showed are supposed to generate the set of conversations specified by these protocols
• We can specify the set of conversations without showing how the peers implement them; we call such a specification a conversation protocol
FCD and JD Conversation Protocols
[Figure: the FCD and JD protocols drawn as automata whose transitions are labeled with messages.]
FCD Protocol conversation set: {(tx1, xj2), (tj1, jx2), (xt1, tj2), (xj1, jt2), (jt1, tx2), (jx1, xt2)}
JD Protocol conversation set: {(jt, tx), (jx, xt)}
Observations & Questions
• The implementation of the FCD protocol behaves differently with synchronous and asynchronous communication whereas the implementation of the JD protocol behaves the same.
– Can we find a way to identify such implementations?
• The implementation of the FCD protocol does not obey the FCD protocol if asynchronous communication is used whereas the implementation of the JD protocol obeys the JD protocol even if asynchronous communication is used.
– Given a conversation protocol can we figure out if there is an implementation which generates the same conversation set?
Synchronizability and Realizability Analyses
• We formalized these observations and questions using synchronizability and realizability analyses
– The implementation of the JD protocol is synchronizable but the implementation of the FCD protocol is not synchronizable
– The JD protocol is realizable but the FCD protocol is not realizable
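The difference between the two lunch protocols can be checked by brute force. The sketch below is only an illustration, not the analysis used in this work: the peer state machines are my reading of the implementation figures, a conversation is taken to be the sequence of sends in an execution that cannot be extended, and the function names are made up for this example.

```python
def fcd_peer(me, a, b):
    """Assumed FCD peer: make the first call, forward a first call, or get a forwarded call."""
    return {
        "init": [("!", me + a + "1", "done"), ("!", me + b + "1", "done"),
                 ("?", a + me + "1", "fwd_" + b), ("?", b + me + "1", "fwd_" + a),
                 ("?", a + me + "2", "done"), ("?", b + me + "2", "done")],
        "fwd_" + a: [("!", me + a + "2", "done")],
        "fwd_" + b: [("!", me + b + "2", "done")],
        "done": [],
    }

FCD = {"t": fcd_peer("t", "x", "j"), "x": fcd_peer("x", "t", "j"), "j": fcd_peer("j", "t", "x")}

JD = {
    "t": {"init": [("?", "xt", "done"), ("?", "jt", "fwd")], "fwd": [("!", "tx", "done")], "done": []},
    "x": {"init": [("?", "tx", "done"), ("?", "jx", "fwd")], "fwd": [("!", "xt", "done")], "done": []},
    "j": {"init": [("!", "jt", "done"), ("!", "jx", "done")], "done": []},
}

def conversations(peers, synchronous):
    """Send sequences of all maximal executions (second letter of a message = receiver)."""
    names = sorted(peers)
    idx = {n: i for i, n in enumerate(names)}
    start = (("init",) * len(names), ((),) * len(names), ())
    seen, stack, result = {start}, [start], set()
    while stack:
        states, queues, conv = stack.pop()
        successors = []
        for i, name in enumerate(names):
            for action, msg, target in peers[name][states[i]]:
                r = idx[msg[1]]
                if action == "!" and synchronous:
                    # A send needs a receiver that can take the message right now.
                    for action2, msg2, target2 in peers[names[r]][states[r]]:
                        if action2 == "?" and msg2 == msg:
                            s = list(states)
                            s[i], s[r] = target, target2
                            successors.append((tuple(s), queues, conv + (msg,)))
                elif action == "!":
                    # Asynchronous send: append to the receiver's FIFO queue.
                    s, q = list(states), list(queues)
                    s[i], q[r] = target, q[r] + (msg,)
                    successors.append((tuple(s), tuple(q), conv + (msg,)))
                elif not synchronous and queues[i][:1] == (msg,):
                    # Asynchronous receive: consume the head of the own queue.
                    s, q = list(states), list(queues)
                    s[i], q[i] = target, q[i][1:]
                    successors.append((tuple(s), tuple(q), conv))
        if not successors:
            result.add(conv)
        for cfg in successors:
            if cfg not in seen:
                seen.add(cfg)
                stack.append(cfg)
    return result
```

Under these assumptions `conversations(FCD, False)` contains the voicemail scenario `("tx1", "jx1", "xj2")`, which is absent from `conversations(FCD, True)`, while the two sets coincide for `JD`. The search terminates here only because each peer sends at most one message; it is not a general procedure for unbounded queues.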
Outline
• Web Service Composition Model
• Capturing Global Behaviors
– Conversations
• Top-Down vs. Bottom-Up Specification and Verification
– Realizability vs. Synchronizability
• XML messaging
– MSL, XPath
– Translation to Promela
• Web Service Analysis Tool
• Conclusions and Future Work
Characteristics of Web Services
• Loosely coupled, interaction through standardized interfaces
• Standardized data transmission via XML
• Asynchronous messaging
• Platform independent (.NET, J2EE)
Web Service Standards
[Figure: stack of web service standards. Data: XML; Type: XML Schema; Message: SOAP; Interface: WSDL; Behavior: BPEL4WS; Interaction: WS-CDL. Implementation platforms: Microsoft .Net, Sun J2EE.]
Challenges in Verification of Web Services
• Distributed nature, no central control
– How do we model the global behavior?
– How do we specify the global properties?
• Asynchronous messaging introduces undecidability in analysis
– How do we check the global behavior?
– How do we enforce the global behavior?
• XML data manipulation
– How do we specify the XML messages?
– How do we verify properties related to data?
A Model for Composite Web Services
[Figure: Peers T, X and J connected by the message classes jt, jx, tx and xt.]
• A composite web service consists of
– a finite set of peers
• Lunch example: T, X, J
– and a finite set of message classes
• Lunch example (JD protocol): jt, tx, jx, xt
Communication Model
• We assume that the messages among the peers are exchanged using reliable and asynchronous messaging
– FIFO and unbounded message queues
• This model is similar to industry efforts such as
– JMS (Java Message Service)
– MSMQ (Microsoft Message Queuing Service)
[Figure: Peer T and Peer X exchanging the message tx through a message queue.]
Conversations
• A virtual watcher records the messages as they are sent
• A conversation is a sequence of messages the watcher sees during an execution
[Bultan, Fu, Hull, Su WWW’03]
[Figure: a watcher observing Peers T, J and X and recording the messages jt and tx.]
Effects of Asynchronous Communication
• Question: Given a composite web service, is the set of conversations a regular set?
• Even when messages do not have any content and the peers are finite state machines the conversation set may not be regular
• Reason: asynchronous communication with unbounded queues
• With bounded queues or synchronous communication the conversation set is always regular
Properties of Conversations
• The notion of conversation enables us to reason about temporal properties of the composite web services
• LTL framework extends naturally to conversations
– LTL temporal operators: X (neXt), U (Until), G (Globally), F (Future)
– Atomic properties: predicates on message classes (or contents)
Example: G ( payment ⇒ F receipt )
• Model checking problem: Given an LTL property, does the conversation set satisfy the property?
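As a small illustration of the example property, the sketch below checks G ( payment ⇒ F receipt ) on a single finite conversation. Reading the property on a finite sequence is an assumption made for this example (LTL is usually interpreted over infinite sequences), and the function name and the extra message name `order` are hypothetical.

```python
def always_eventually(conversation, trigger, response):
    """Check G(trigger => F response) on one finite conversation.

    Finite-trace reading (an assumption of this sketch): every occurrence
    of `trigger` must be followed, at the same or a later position, by an
    occurrence of `response`.
    """
    pending = False  # True while some trigger still waits for a response
    for message in conversation:
        if message == trigger:
            pending = True
        if message == response:
            pending = False
    return not pending


# Every payment is followed by a receipt.
ok = always_eventually(["order", "payment", "receipt"], "payment", "receipt")
# The last payment is never acknowledged.
bad = always_eventually(["payment", "receipt", "payment"], "payment", "receipt")
```

A model checker answers the same question for every conversation in the conversation set at once, rather than for one recorded sequence.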
Bottom-Up vs. Top-Down
Bottom-up approach
• Specify the behavior of each peer
• The global communication behavior (conversation set) is implicitly defined based on the composed behavior of the peers
• Global communication behavior is hard to understand and analyze
Top-down approach
• Specify the global communication behavior (conversation set) explicitly as a protocol
• Ensure that the conversations generated by the peers obey the protocol
[Figure: top-down, a conversation protocol over the messages jt, jx, tx and xt is checked against an LTL property over tx and xt. Bottom-up, Peers T, X and J with their input queues form a conversation schema, and the conversations recorded by the virtual watcher are checked against the same LTL property.]
Conversation Protocols
• Conversation Protocol:
– An automaton that accepts the desired conversation set
• A conversation protocol is a contract agreed by all peers
– Each peer must act according to the protocol
• For reactive protocols with infinite message sequences we use:
– Büchi automata which accept infinite strings
• For specifying message contents, we use:
– Guarded automata
– Guards are constraints on the message contents
Synthesize Peer Implementations
• Conversation protocol specifies the global communication behavior
– How do we implement the peers?
• How do we obtain the contracts that peers have to obey from the global contract specified by the conversation protocol?
• Project the global protocol to each peer
– By dropping unrelated messages for each peer
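Projection can be sketched on a single conversation. The representation below (a conversation as a list of sender, receiver, message triples) and the function name are assumptions made for this illustration; the actual projection is applied to the protocol automaton, not to one sequence.

```python
def project(conversation, peer):
    """Drop the messages unrelated to `peer`; mark the rest as send (!) or receive (?)."""
    local = []
    for sender, receiver, message in conversation:
        if sender == peer:
            local.append("!" + message)
        elif receiver == peer:
            local.append("?" + message)
    return local


# One conversation of the JD protocol: Jianwen calls Tevfik, Tevfik calls Xiang.
jd_conversation = [("J", "T", "jt"), ("T", "X", "tx")]
view_t = project(jd_conversation, "T")
view_x = project(jd_conversation, "X")
view_j = project(jd_conversation, "J")
```

The three local views correspond to paths in the JD peer implementations: Tevfik receives jt and then sends tx, Xiang only receives tx, and Jianwen only sends jt.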
Interesting Question
Conversations generated by the projected services =? Conversations specified by the conversation protocol
If this equality holds the conversation protocol is realizable
Are there conditions which ensure the equivalence?
Realizability Problem
• Not all conversation protocols are realizable!
Conversation protocol: A→B: m1, followed by C→D: m2
Projection of the conversation protocol to the peers: Peer A: !m1, Peer B: ?m1, Peer C: !m2, Peer D: ?m2
Conversation “m2 m1” will be generated by all peer implementations which follow the protocol
Another Non-Realizable Protocol
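[Figure: a conversation protocol over Peers A, B and C with the transitions A→B: m1, B→A: m2 and A→C: m3, shown together with the peers' message queues and the watcher. The message sequence m2 m1 m3 appears in the figure next to the label "Generated conversation".]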
Realizability Conditions
Three sufficient conditions for realizability (no message content) [Fu, Bultan, Su, CIAA’03, TCS’04]
• Lossless join
– Conversation set should be equivalent to the join of its projections to each peer
• Synchronous compatible
– When the projections are composed synchronously, there should not be a state where a peer is ready to send a message while the corresponding receiver is not ready to receive
• Autonomous
– At any state, each peer should be able to do only one of the following: send, receive or terminate (a peer can still choose among multiple messages)
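The lossless join condition can be illustrated on a finite conversation set. The sketch below enumerates candidate sequences by brute force, which is an approach chosen for this example and not the automata-based check; the dictionary `channels` (message to sender and receiver) and all function names are assumptions.

```python
from itertools import product


def project(word, peer, channels):
    """Keep the messages of `word` that `peer` sends or receives."""
    return tuple(m for m in word if peer in channels[m])


def join_of_projections(protocol, channels):
    """All words whose projection to every peer is a projection of some protocol word."""
    peers = sorted({p for pair in channels.values() for p in pair})
    projected = {p: {project(w, p, channels) for w in protocol} for p in peers}
    # Every message shows up in its sender's projection, so no joined
    # word can be longer than the sum of the longest local projections.
    bound = sum(max(len(w) for w in projected[p]) for p in peers)
    joined = set()
    for length in range(bound + 1):
        for word in product(sorted(channels), repeat=length):
            if all(project(word, p, channels) in projected[p] for p in peers):
                joined.add(word)
    return joined


def is_lossless_join(protocol, channels):
    return join_of_projections(protocol, channels) == set(protocol)


# A sends m1 to B, then C sends m2 to D: no peer can enforce the order.
four_peers = {"m1": ("A", "B"), "m2": ("C", "D")}
# A sends m1 to B, then B sends m2 to C: B sees both messages.
chain = {"m1": ("A", "B"), "m2": ("B", "C")}
protocol = {("m1", "m2")}
```

For `four_peers` the join also contains `("m2", "m1")`, so the protocol is not lossless join; for `chain` the join is exactly the protocol.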
Realizability Conditions
• Following protocols fail one of the three conditions but satisfy the other two
– Not lossless join: A→B: m1, C→D: m2
– Not autonomous: the protocol over A→B: m1, B→A: m2 and A→C: m3
– Not synchronous compatible: A→B: m1, C→A: m2
Bottom-Up Approach
• We know that analyzing conversations of composite web services is difficult due to asynchronous communication
– Model checking for conversation properties is undecidable even for finite state peers
• The question is:
– Can we identify the composite web services where asynchronous communication does not create a problem?
Three Examples, Example 1
[Figure: requester and server state machines. Requester transitions: !r1, ?a1, !r2, ?a2, !e. Server transitions: ?r1, !a1, ?r2, !a2, ?e. Messages: r1, r2, a1, a2, e.]
• Conversation set is regular: (r1 a1 | r2 a2)* e
• During all executions the message queues are bounded
Example 2
[Figure: requester and server state machines. Requester transitions: !r1, ?a1, !r2, ?a2, !e. Server transitions: ?r1, !a1, ?r2, !a2, ?e. Messages: r1, r2, a1, a2, e.]
• Conversation set is not regular
• Queues are not bounded
Example 3
[Figure: requester and server state machines. Requester transitions: !r1, !r2, !r, ?a, !e. Server transitions: ?r1, ?r2, ?r, !a, ?e.]
• Conversation set is regular: (r1 | r2 | ra)* e
• Queues are not bounded
State Spaces of the Three Examples
[Figure: number of states (in thousands, axis from 0 to 1600) as a function of queue length (1 to 13) for Examples 1, 2 and 3.]
• Verification of Examples 2 and 3 is difficult even if we bound the queue length
• How can we distinguish Examples 1 and 3 (with regular conversation sets) from 2?
– Synchronizability Analysis
Synchronizability Analysis
• A composite web service is synchronizable if its conversation set does not change when asynchronous communication is replaced with synchronous communication
• If a composite web service is synchronizable we can check the properties about its conversations using synchronous communication semantics
– For finite state peers this is a finite state model checking problem
Synchronizability Analysis
• A composite web service is synchronizable, if it satisfies the synchronous compatible and autonomous conditions
[Fu, Bultan, Su WWW’04, TSE]
• Connection between realizability and synchronizability:
– A conversation protocol is realizable if its projections to peers are synchronizable and the protocol itself satisfies the lossless join condition
Are These Conditions Too Restrictive?
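Problem Set (Source, Name), Size (#msg, #states, #trans.), Pass?
Source                     Name      #msg  #states  #trans.  Pass?
ISSTA’04                   SAS       9     12       15       yes
IBM Conv. Support Project  CvSetup   4     4        4        yes
IBM Conv. Support Project  MetaConv  4     4        6        no
IBM Conv. Support Project  Chat      2     4        5        yes
IBM Conv. Support Project  Buy       5     5        6        yes
IBM Conv. Support Project  Haggle    8     5        8        no
IBM Conv. Support Project  AMAB      8     10       15       yes
BPEL spec                  shipping  2     3        3        yes
BPEL spec                  Loan      6     6        6        yes
BPEL spec                  Auction   9     9        10       yes
Collaxa.com                StarLoan  6     7        7        yes
Collaxa.com                Cauction  5     7        6        yes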
Web Service Analysis Tool (WSAT)
[Fu, Bultan, Su CAV’04]
http://www.cs.ucsb.edu/~su/WSAT/
[Figure: WSAT architecture. Front end: BPEL web services are translated to guarded automata (BPEL to GFSA) and conversation protocols are read by a GFSA parser, both into an intermediate representation. Analysis back end: synchronizability analysis (bottom-up) and realizability analysis (top-down), each with success, fail and skip outcomes. Verification languages: GFSA to Promela with bounded queues, GFSA to Promela with synchronous communication, and GFSA to Promela as a single process with no communication.]
Guarded Automata Model
• Uses XML messages
• Uses MSL for declaring message types
– MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
• Uses XPath expressions for guards
– XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
The Guarded Automata Model
// type declaration
request [ id [int] ]
// message declaration
r2: request
// local variable declaration
last: request
Guard { a2/id = last/id => r2/id := last/id + 1, last/id := last/id + 1 }
[Figure: requester automaton with the transitions !r1, ?a1, !r2, ?a2, !e.]
XML (eXtensible Markup Language)
• XML is a markup language like HTML
• Similar to HTML, XML tags are written as
<tag> followed by </tag>
• HTML vs. XML
– In HTML, tags are used to describe the appearance of the data
<b> </b> <i> </i> <br> <p> ...
– In XML, tags are used to describe the content of the data rather than the appearance
<date> </date> <address> </address>
An XML Document and Its Tree
<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
[Figure: the same document as a tree. Register has the children investorID (VIP01), requestList (two stockID nodes, 0001 and 0002) and payment (accountNum, 0425).]
• XML documents can be modeled as trees where each internal node corresponds to a tag and leaf nodes correspond to basic types
XML Schema
• XML provides a standard way to exchange data over the Internet.
• However, the parties which exchange XML documents still have to agree on the type of the data
– What are the tags that will appear in the document, in what order, etc.
• XML Schema is a language for defining XML data types
• MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
MSL (Model Schema Language)
• Basic MSL syntax
g → ε | b | t [ g ] | g { m , n }
  | g , g | g & g | g | g
g is an XML type (i.e., an MSL type expression)
ε is the empty sequence
b is a basic type such as string, boolean, int, etc.
t is a tag
m and n are positive integers
[ ] { } & , | are MSL type constructors
MSL Semantics
• t [ g ] denotes a type with root node labeled t with children of type g
• g { m , n } denotes a sequence of size at least m and at most n where each member is of type g
• g1 , g2 denotes an ordered sequence where the first member is of type g1 and the second member is of type g2
• g1 & g2 denotes an unordered sequence where one member is of type g1 and the other member is of type g2
• g1 | g2 denotes a choice between type g1 and type g2, i.e., either type g1 or type g2, but not both
An MSL Type Declaration and an Instance
Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]
<Register>
  <investorID>VIP01</investorID>
  <requestList>
    <stockID>0001</stockID>
    <stockID>0002</stockID>
  </requestList>
  <payment>
    <accountNum>0425</accountNum>
  </payment>
</Register>
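To make the MSL semantics concrete, here is a minimal sketch of a conformance check for the Register type. It is not part of WSAT: the tuple encoding of types, the function names, and the rule that an int is a string of digits are all assumptions of this example, and the unordered constructor & is left out.

```python
import xml.etree.ElementTree as ET

# Assumed checks for basic types: any text is a string, digits form an int.
BASIC = {"string": lambda s: True, "int": lambda s: s.isdigit()}


def ends(g, nodes, i):
    """Positions at which a match of type g that starts at nodes[i] can end."""
    kind = g[0]
    if kind == "basic":                       # ("basic", name)
        ok = i < len(nodes) and isinstance(nodes[i], str) and BASIC[g[1]](nodes[i])
        return {i + 1} if ok else set()
    if kind == "tag":                         # ("tag", t, g): t [ g ]
        if i < len(nodes) and isinstance(nodes[i], tuple) and nodes[i][0] == g[1]:
            kids = nodes[i][1]
            return {i + 1} if len(kids) in ends(g[2], kids, 0) else set()
        return set()
    if kind == "rep":                         # ("rep", g, m, n): g { m , n }
        current, out = {i}, set()
        for count in range(g[3] + 1):
            if count >= g[2]:
                out |= current
            current = {e for j in current for e in ends(g[1], nodes, j)}
        return out
    if kind == "seq":                         # ("seq", g1, g2): g1 , g2
        return {e for j in ends(g[1], nodes, i) for e in ends(g[2], nodes, j)}
    if kind == "alt":                         # ("alt", g1, g2): g1 | g2
        return ends(g[1], nodes, i) | ends(g[2], nodes, i)
    raise ValueError(kind)


def to_tree(elem):
    """Element -> (tag, children); a leaf gets its text as its only child."""
    kids = [to_tree(child) for child in elem]
    return (elem.tag, kids if kids else [elem.text or ""])


def conforms(xml_text, g):
    return 1 in ends(g, [to_tree(ET.fromstring(xml_text))], 0)


REGISTER = ("tag", "Register", ("seq", ("tag", "investorID", ("basic", "string")),
            ("seq", ("tag", "requestList", ("rep", ("tag", "stockID", ("basic", "int")), 1, 3)),
             ("tag", "payment", ("alt", ("tag", "creditCardNum", ("basic", "int")),
                                 ("tag", "accountNum", ("basic", "int")))))))
DOC = ("<Register><investorID>VIP01</investorID><requestList><stockID>0001</stockID>"
       "<stockID>0002</stockID></requestList><payment><accountNum>0425</accountNum>"
       "</payment></Register>")
```

With these definitions `conforms(DOC, REGISTER)` holds for the instance above, while a document with four stockID elements, or with both payment alternatives, is rejected.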
Translating Guarded Automata to Promela
• We used the SPIN model checker to verify the properties of conversations
• SPIN is a finite state model checker– we restricted XML message contents to finite domains
• We translate guarded automata models to Promela (input language of the SPIN model checker)
– First, translate MSL type declarations to Promela type declarations
– Then, translate XPath expressions to Promela code
Mapping MSL types to Promela
• Basic types
– integer and boolean types are mapped to Promela basic types int and bool
– We only allow constant string values and strings are mapped to enumerated type (mtype) in Promela
• Other type constructors are handled using
– structured types (declared using typedef) in Promela
– or arrays
Mapping MSL type constructors to Promela
• t [ g ] is translated to a typedef declaration
• g { m , n } is translated to an array declaration
• g1 , g2 is translated to a sequence of type declarations
• g1 | g2 is translated to a sequence of type declarations and an enumerated variable which is used to record which type is chosen
• g1 & g2 is not handled! We do not handle unordered type sequence (it can cause state-space explosion)
Example
Register[ investorID[string] , requestList[ stockID[int]{1,3} ] , payment[ creditCardNum[int] | accountNum[int] ]]
typedef t1_investorID { mtype stringvalue; }
typedef t2_stockID { int intvalue; }
typedef t3_requestList {
  t2_stockID stockID [3];
  int stockID_occ;
}
typedef t4_accountNum { int intvalue; }
typedef t5_creditCard { int intvalue; }
mtype { m_accountNum, m_creditCard }
typedef t6_payment {
  t4_accountNum accountNum;
  t5_creditCard creditCard;
  mtype choice;
}
typedef Register {
  t1_investorID investorID;
  t3_requestList requestList;
  t6_payment payment;
}
XPath
• In order to write specifications or programs that manipulate XML documents we need:
– an expression language to access values and nodes in XML documents
• XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
• An XPath query defines a function which
– takes an XML tree and a context node (in the same tree) as input and
– returns a set of nodes (in the same tree) as output
XPath Syntax
Basic XPath syntax:
q → . | .. | b | t | *
  | /q | //q | q / q | q // q
  | q [ q ] | q [ exp ]
q is an XPath query
exp denotes a predicate on basic types, i.e., on the leaf nodes of the XML tree
b denotes a basic type such as string, boolean, int, etc.
t denotes a tag
XPath Semantics
Given an XML tree and a node n as a context node
. returns n
.. returns the parent of n
Given an XML tree and a set of nodes
* returns all the nodes
b returns the nodes that are of basic type b
t returns the nodes which are labeled with tag t
XPath Semantics Contd.
Starting at the context node
• /q returns the nodes that match q
• //q returns the nodes that match q starting at any descendant
• q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1
• q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1
• q1 [ q2 ] applies q2 to the children of the nodes which match q1
• q [ exp ] returns the nodes that match q and for children of which the expression exp evaluates to true
Examples
//payment/* returns the node labeled accountNum
/Register/requestList/stockID/int returns the nodes labeled 0001 and 0002
//stockID[int > 1]/int returns the node labeled 0002
[Figure: the tree of the Register document shown earlier, with investorID (VIP01), requestList (stockID 0001, stockID 0002) and payment (accountNum 0425).]
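The three example queries can be tried with Python's standard library. This is only an approximation of the XPath fragment used here: `xml.etree.ElementTree` supports a limited XPath subset, evaluates paths relative to the root element, has no `int` node test, and does not support the numeric predicate `[int > 1]`, so the leaf values are read from `.text` and the predicate is written as a Python filter.

```python
import xml.etree.ElementTree as ET

DOC = ("<Register><investorID>VIP01</investorID><requestList><stockID>0001</stockID>"
       "<stockID>0002</stockID></requestList><payment><accountNum>0425</accountNum>"
       "</payment></Register>")
root = ET.fromstring(DOC)  # the context node is the Register element

# //payment/* : all children of any payment node
payment_children = [e.tag for e in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : the values under each stockID
stock_values = [e.text for e in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int : values of the stockID nodes whose integer is above 1
large_values = [e.text for e in root.iter("stockID") if int(e.text) > 1]
```

The results are the accountNum node, the values 0001 and 0002, and the value 0002, matching the three examples.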
XPath to Promela
• Generate code that evaluates the XPath expression
[Fu, Bultan, Su ISSTA’04]
• Traverse the XPath expression from left to right
– Code generated in each step is inserted into the BLANK spaces left in the code from the previous step
– A tree representation of the MSL type is used to keep track of the context of the generated code
• Uses two data structures
– Type tree shows the structure of the corresponding MSL type
– Abstract statements which are mapped to Promela code
Statement    Promela Code
IF(v)        if :: v -> BLANK :: else -> skip fi
FOR(v,l,h)   v = l - 1 do :: v < h -> BLANK v++ :: else -> break od
EMPTY        BLANK
INC(v)       v++
SET(v,a)     v = a
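The BLANK mechanism can be sketched as plain template filling. This is a guess at the mechanics for illustration, not the generator from the cited work: the templates follow the abstract statements IF, FOR, EMPTY, INC and SET, the separators `;` are added here to keep the output well-formed, and the function names are hypothetical.

```python
# Promela text for each abstract statement; BLANK marks the hole that
# the next generated statement is inserted into.
TEMPLATES = {
    "IF": "if\n:: {0} -> BLANK\n:: else -> skip\nfi",
    "FOR": "{0} = {1} - 1;\ndo\n:: {0} < {2} -> BLANK; {0}++\n:: else -> break\nod",
    "EMPTY": "BLANK",
    "INC": "{0}++",
    "SET": "{0} = {1}",
}


def emit(name, *args):
    """Instantiate one abstract statement."""
    return TEMPLATES[name].format(*args)


def nest(statements):
    """Insert each statement into the first BLANK left by the previous ones."""
    code = "BLANK"
    for name, args in statements:
        code = code.replace("BLANK", emit(name, *args), 1)
    return code


# Loop over the stockID array and record whether the predicate holds.
promela = nest([("FOR", ("i1", 1, 3)), ("IF", ("cond",)), ("SET", ("bRes1", 1))])
```

Each step fills exactly one hole, so the nesting of the output mirrors the left-to-right traversal of the XPath expression; an EMPTY statement leaves the hole open for the next step.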
Type Tree
Register[ investorID[string] & requestList[ stockID[int]{1,3} ] & payment[ creditCardNum[int] | accountNum[int] ]]
[Figure: type tree of the Register type with its nodes numbered 1 to 11: Register, investorID (string), requestList, stockID (idx: i1, int), payment, creditCard (int) and accountNum (int).]
Generated Statements
$register // stockID / [int()>5] / [position() = last()] / int()
cond: v_register.requestlist.stockID[i1] > 5
[Figure: the sequence of abstract statements inserted for this expression, each annotated with a type tree node: FOR (i1,1,3), EMPTY, IF (cond), SET (bRes1,0), SET (bRes1,1), IF (bRes1), IF (i2==i3), SET (bRes2,0), SET (bRes2,1), IF (bRes2), EMPTY.]
$request//stockID=$register//stockID[int()>5][position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;

i2=1;
/* pre-calculate the value of last(), store in i3 */
i4=0; i5=1; i3=0;
do
:: i4 < v_register.requestList.stockID_occ ->
   /* compute first predicate */
   bRes3 = false;
   if
   :: v_register.requestList.stockID[i4].intvalue>5 -> bRes3 = true
   :: else -> skip
   fi;
   if
   :: bRes3 -> i5++; i3++;
   :: else -> skip
   fi;
   i4++;
:: else -> break;
od;

i1=0;
do
:: i1 < v_register.requestList.stockID_occ ->
   bRes1 = false;
   if
   :: v_register.requestList.stockID[i1].intvalue>5 -> bRes1 = true
   :: else -> skip
   fi;
   if
   :: bRes1 ->
      bRes2 = false;
      if
      :: (i2 == i3) -> bRes2 = true;
      :: else -> skip
      fi;
      if
      :: bRes2 ->
         if
         :: (v_request.stockID.intvalue == v_register.requestList.stockID[i1].intvalue) -> bResult = true;
         :: else -> skip
         fi
      :: else -> skip
      fi;
      i2++;
   :: else -> skip
   fi;
   i1++;
:: else -> break;
od;
Model Checking Using Promela
• Found subtle errors in an example
– SAS: Stock Analysis Service [Fu, Bultan, Su ISSTA’04]
– 3 peers: Investor, Broker, ResearchDept.
– Investor → Broker: a registerList of stockIDs
– Broker → ResearchDept.:
• relay request (1 stockID per request)
• find the stockID in the latest request, send its subsequent stockID in registerList
– Repeating stockID will cause error.
– Only discoverable by analysis of XPath expressions
Related Work
• Conversation specification
– IBM Conversation support project http://www.research.ibm.com/convsupport/
– Conversation support for business process integration [Hanson, Nandi, Kumaran EDOC’02]
– Orchestrating computations on the world-wide web [Choi, Garg, Rai, Misra, Vin EuroPar’02]
• Realizability problem
– Realizability of Message Sequence Charts (MSC) [Alur, Etessami, Yannakakis ICSE’00, ICALP’01]
Related Work
• Verification of web services
– Simulation, verification, composition of web services using a Petri net model [Narayanan, McIlraith WWW’02]
– BPEL verification using a process algebra model and Concurrency Workbench [Koshkina, van Breugel TAV-WEB’03]
– Using MSC to model BPEL web services which are translated to labeled transition systems and verified using model checking [Foster, Uchitel, Magee, Kramer ASE’03]
– Model checking Web Service Flow Language specifications using SPIN [Nakajima ICWE’04]
Current and Future Work
• Extending the source and target languages
• Symbolic analysis
[Fu, Bultan, Su ICWS’04, JWSR]
• Abstraction
• Design for verification for web services
[Betin-Can, Bultan WWW’05, ICWS’05]
[Figure: planned WSAT architecture. Front end: translators for bottom-up and top-down specifications take web service specification languages (BPEL, DAML-S, WS-CDL, ...) into an intermediate representation of guarded automata and conversation protocols. Analysis back end: synchronizability analysis, realizability analysis (each with success, fail and skip outcomes) and automated abstraction. Translations with bounded queue, with synchronous communication, or as a single process with no communication target the verification languages Promela, SMV, Action Language, ...]
THE END