Caching XML Web Services to Support Disconnected Operation
Venugopalan Ramasubramanian, Cornell University
Doug Terry, Microsoft Research, Silicon Valley
Web Services
• method of providing and accessing services on the Internet
  – consumer services
    • hotmail, orbitz, mapquest, ebay, …
  – B to B services
    • supply chain management
• request-response paradigm
  – RPCs on the internet
XML Web Services
• W3C (World Wide Web Consortium) standards
  – Microsoft, IBM, HP, …
  – Microsoft .Net web services (HailStorm)
    • mycontacts, myprofile, myfavoritewebsites
  – TerraServer, CoolRooster
• SOAP (simple object access protocol)
  – standard representation of web service requests/responses (SOAP-RPC)
• WSDL (web services description language)
  – description of web services
Availability of Web Services
GOAL: make web services available despite frequent disconnections and limited bandwidth!
• web service clients reside on all kinds of devices
  – desktop, laptop, PDA, smart phone
• network outages (especially wireless)
• bandwidth restriction
Governing Principles
• cannot modify web services
• cannot modify access protocols
• can perhaps modify client
  – must also comply with existing clients
• can interpose storage and computation
client-side caching is a solution to improve availability!
XML Standards: SOAP
• SOAP-RPC standard
  – encoding definitions for data types
  – success, failure definitions
• SOAP-Envelope
  – outer-most element
• SOAP-Body
  – obligatory
  – request operation: name, parameters
  – response status: return value, failure
• SOAP-Header
  – optional, multiple header blocks
  – supplementary information: kerberos ticket
• HTTP binding
  – HTTP request and response messages
example: soap request
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m="http://schemas.microsoft.com/hs/2001/10/myContacts"
    xmlns:c="http://schemas.microsoft.com/hs/2001/10/core"
    xmlns:mp="http://schemas.microsoft.com/hs/2001/10/myProfile">
  <s:Header>
    <licenses xmlns="http://schemas.xmlsoap.org/soap/security/2000-12">
      <c:identity> <c:kerberos>3240</c:kerberos> </c:identity>
    </licenses>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#request</action>
      <to>http://terry.microsoft.com</to>
      <fwd><via /></fwd>
      <rev><via /></rev>
      <id>b55528a4-5d63-49f1-87a2-5fab8d76f658</id>
    </path>
    <c:request service="myContacts" document="content" method="insert" genResponse="always">
      <key puid="3240" instance="1" cluster="1" />
    </c:request>
  </s:Header>
  <s:Body>
    <c:insertRequest select="/m:myContacts/m:contact[mp:name/mp:givenName = 'Terry']/mp:emailAddress">
      <mp:email>[email protected]</mp:email>
    </c:insertRequest>
  </s:Body>
</s:Envelope>
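A cache sitting between client and service has to pull a few facts out of a message like the one above before it can do anything else. The following is a minimal sketch, not the prototype's code: it uses Python's standard `xml.etree.ElementTree`, a trimmed-down copy of the request, and a hypothetical helper name `describe_request`.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
CORE_NS = "http://schemas.microsoft.com/hs/2001/10/core"

# Trimmed-down version of the request above (illustrative only).
REQUEST = f"""<s:Envelope xmlns:s="{SOAP_NS}" xmlns:c="{CORE_NS}">
  <s:Header>
    <c:request service="myContacts" document="content" method="insert" genResponse="always"/>
  </s:Header>
  <s:Body>
    <c:insertRequest select="/m:myContacts/m:contact"/>
  </s:Body>
</s:Envelope>"""


def describe_request(xml_text):
    """Return (service, method, local name of the Body's first child)."""
    ns = {"s": SOAP_NS, "c": CORE_NS}
    root = ET.fromstring(xml_text)
    request = root.find("s:Header/c:request", ns)
    body = root.find("s:Body", ns)
    first = body[0]  # the Body's first child names the operation
    local_name = first.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
    return request.get("service"), request.get("method"), local_name


print(describe_request(REQUEST))
```

The service and method come from the header block, while the operation element itself lives in the body; a generic cache cannot know this layout in advance, which is what motivates the annotations introduced later.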
XML Standards: WSDL
• concrete definition of the web service
  – data structures
  – interface offered by the web service
    • operation names and parameters
  – message formats (components of a message)
  – protocol binding (SOAP)
• automatic generation of client-side stubs
  – Visual Studio .Net
Experiments with Web Cache
• experiment with existing clients and services (Microsoft .Net web services)
• check feasibility by building a cache to store HTTP requests/responses
[figure: MyContacts, MyServices, MyProfile, cache]
Issues in Caching
• web services are active
  – default HTTP cache directive is No Cache!
• web services are diverse
  – unlike files and databases, web services have custom interfaces
• fundamental questions
  – which requests are cacheable?
  – which operations have permanent side effects?
  – how to understand requests/responses?
    • services use different formats for requests/responses
Issues in Caching contd.
• consistency
  – later requests might invalidate responses cached earlier
    • read/write, write/write conflicts
  – how to specify consistency requirements for generic web services?
request 1: query request
<queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
<deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
More Issues…
• user experience
  – user unaware of web service cache
  – operations reportedly successful could fail!
• hoarding
  – keeping the cache hot
  – user controlled hoard requests
• security
  – enforce access control
Our Approach
• annotate WSDL description of web services to define cache properties
  – published by service providers or third party
  – no changes to server side code required
• transparent cache for web services
  – acts as a web proxy on the client machine
  – no modifications of the client program necessary
• custom cache managers for each web service
  – generated automatically from the annotated WSDL description
Architecture
[figure: Web Client 1, Web Client 2; Proxy Server with Cache, CCM1, CCM2, CCM3, WBQ; INTERNET; Web Service 1, Web Service 2, Web Service 3]
CCM1: Custom Cache Manager 1
WBQ: Write Back Queue
WSDL Annotations: for each Operation
• cacheable: the operation can be cached
• lifetime: the duration for which replies are cached
• play-back: the operation has side effects and must be played back when connection is restored
• default-response: a default response will be sent when connection is not available
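One way to read the four per-operation annotations is as inputs to a small decision procedure inside a cache manager. The sketch below is an assumption about how they might combine, not the prototype's logic: the action labels, the `decide` function, the choice to fail on a stale entry while disconnected, and seconds as the unit of `lifetime` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperationAnnotation:
    """Per-operation cache properties named on the slide."""
    cacheable: bool = False
    lifetime: float = 0.0          # seconds a cached reply stays valid (unit assumed)
    playback: bool = False
    default_response: bool = False


def decide(ann: OperationAnnotation, connected: bool,
           cached_age: Optional[float]) -> str:
    """Return a label for the action a cache manager might take.

    cached_age is the age in seconds of a stored reply, or None on a miss.
    """
    fresh_hit = (ann.cacheable and cached_age is not None
                 and cached_age <= ann.lifetime)
    if fresh_hit:
        return "reply-from-cache"
    if connected:
        return "forward-to-service"
    if ann.playback:
        # Side-effecting operation: remember it for later replay.
        return "queue-with-default-reply" if ann.default_response else "queue"
    return "report-unavailable"


insert = OperationAnnotation(playback=True, default_response=True)
query = OperationAnnotation(cacheable=True, lifetime=300.0)
print(decide(insert, connected=False, cached_age=None))
print(decide(query, connected=False, cached_age=120.0))
```

Serving a fresh hit even while connected reflects the limited-bandwidth goal; a manager tuned for stronger consistency could forward to the service first instead.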
WSDL Annotations: for each Service
• identify the operation (operationName)
  – xpath (xml query language) expression to extract the name of the operation
• extract the request message (identifier)
  – portions of the request message should be ignored while caching (date)
  – xpath expression to extract relevant parts of the message for identification
<binding name="myContactsBinding" type="tns:myContactsPort"
    operationName="substring-before(local-name(/senv:Envelope/senv:Body/*[1]), 'Request')"
    Identifier="/senv:Envelope/senv:Header/s0:licenses | /senv:Envelope/senv:Header/s1:request | /senv:Envelope/senv:Body">
  <s:binding transport="http://schemas.xmlsoap.org/soap/http" style="document" />
  <operation name="insert" cacheable="false" playback="true" defaultResponse="true" cacheHeader="true">
    <s:operation soapAction="http://schemas.microsoft.com/hs/2001/10/core#request" />
snippet from annotated myContacts.wsdl
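The two service-level expressions in the snippet can be mimicked in a few lines to show what they yield. This is a sketch under assumptions: Python's `ElementTree` does not evaluate XPath functions such as `substring-before`, so the operation name is computed by hand; the bindings of the `s0` and `s1` prefixes are inferred from the sample request; hashing the selected nodes with SHA-256 is an illustrative choice, and `make_request` builds a small hypothetical message.

```python
import hashlib
import xml.etree.ElementTree as ET

# Prefix-to-namespace bindings are assumptions inferred from the sample request.
NS = {
    "senv": "http://schemas.xmlsoap.org/soap/envelope/",
    "s0": "http://schemas.xmlsoap.org/soap/security/2000-12",
    "s1": "http://schemas.microsoft.com/hs/2001/10/core",
}
# The three branches of the Identifier union, relative to the Envelope.
IDENTIFIER_PATHS = ("senv:Header/s0:licenses", "senv:Header/s1:request", "senv:Body")


def operation_name(envelope):
    """Mimic substring-before(local-name(Body/*[1]), 'Request')."""
    first = envelope.find("senv:Body", NS)[0]
    local = first.tag.split("}", 1)[-1]
    head, sep, _ = local.partition("Request")
    return head if sep else ""


def cache_key(envelope):
    """Hash only the parts of the message selected by the identifier."""
    digest = hashlib.sha256()
    for path in IDENTIFIER_PATHS:
        for node in envelope.findall(path, NS):
            digest.update(ET.tostring(node))
    return digest.hexdigest()


def make_request(message_id, email):
    """Build a small hypothetical request for demonstration."""
    return ET.fromstring(
        f'<e:Envelope xmlns:e="{NS["senv"]}" xmlns:c="{NS["s1"]}">'
        f'<e:Header><licenses xmlns="{NS["s0"]}"><c:identity/></licenses>'
        f'<id>{message_id}</id><c:request service="myContacts" method="insert"/></e:Header>'
        f'<e:Body><c:insertRequest select="/contact"><email>{email}</email></c:insertRequest></e:Body>'
        f'</e:Envelope>'
    )


a = make_request("id-1", "a@example.com")
b = make_request("id-2", "a@example.com")  # differs only in the ignored message id
print(operation_name(a), cache_key(a) == cache_key(b))
```

Because the per-message `id` is outside the identifier, two otherwise identical requests map to the same cache entry, which is the point of ignoring volatile fields such as dates.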
Annotations for Consistency
• when does request 2 invalidate the response of an earlier request 1 in the cache?
  – an insert could invalidate an earlier query response
• consider requests to be functions with signatures
  req1: op1(param_{1,1}, param_{1,2}, …, param_{1,n})
  req2: op2(param_{2,1}, param_{2,2}, …, param_{2,m})
• invalidate condition is an expression of req1 and req2
  f(op1, op2, param_{1,1}, …, param_{2,1}, …)
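As an illustration of such an invalidate condition, here is a minimal sketch of f for the query/update case. It is hypothetical: the function names are invented, the only parameter inspected is `select`, and two select paths are treated as overlapping when one is a step-wise prefix of the other, which assumes predicates never contain a slash.

```python
UPDATES = {"insert", "delete", "replace"}


def select_overlaps(path1, path2):
    """True if one slash-separated path is a step-wise prefix of the other."""
    a = [step for step in path1.split("/") if step]
    b = [step for step in path2.split("/") if step]
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def invalidates(op1, params1, op2, params2):
    """f(op1, op2, params...): does request 2 invalidate the cached reply to request 1?"""
    if op1 != "query" or op2 not in UPDATES:
        return False
    return select_overlaps(params1["select"], params2["select"])


cached_query = {"select": "myContacts/contact[name='terry']"}
delete_cell = {"select": "myContacts/contact[name='terry']/phone[@cat='cell']"}
delete_other = {"select": "myContacts/contact[name='bob']/phone"}

print(invalidates("query", cached_query, "delete", delete_cell))
print(invalidates("query", cached_query, "delete", delete_other))
```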
Annotations for Consistency: XSL Transformations
• extensible stylesheet language (XSL)
  – transforms XML documents into HTML/text/XML
  – Turing-complete language
• cache transform: transforms a cached response
  – input: request1, reply1, request2, reply2
  – output: transformed reply1 (null if invalidated)
• more powerful than just specifying invalidations
  – can actually transform the old response
Cache Transform Example
request 1: query request
<queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
<deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
smart cache transform would delete the cell phone number from the cached query response
<xsl:template match="/">
  <xsl:variable name="service1" select="$req1/s:Header/c:request/@service"/>
  <xsl:variable name="service2" select="$req2/s:Header/c:request/@service"/>
  <xsl:variable name="opName1" select="substring-before(local-name($req1/s:Body/*[1]), 'Request')"/>
  <xsl:variable name="opName2" select="substring-before(local-name($req2/s:Body/*[1]), 'Request')"/>
  <xsl:choose>
    <xsl:when test="$service1 = $service2">
      <xsl:choose>
        <xsl:when test="$opName2 = 'query' and ($opName1 = 'insert' or $opName1 = 'delete' or $opName1 = 'replace')">
          <xsl:variable name="cleanQuery1">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery" select="substring-after($req1/s:Body/c:*/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:variable name="cleanQuery2">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery" select="substring-after($req2/s:Body/c:queryRequest/c:xpQuery/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:call-template name="CheckIntersection">
            <xsl:with-param name="xpQuery1" select="$cleanQuery1"/>
            <xsl:with-param name="xpQuery2" select="$cleanQuery2"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$rep2"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$rep2"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
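The "smart" transform described in the example, which edits the cached query response rather than discarding it, can be sketched outside XSL as well. The following Python version is illustrative only: the shape of the cached reply is made up, `apply_delete` is a hypothetical name, and it relies on the limited XPath subset that `xml.etree.ElementTree` supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical cached reply to the earlier query for contact "terry".
CACHED_REPLY = """<myContacts>
  <contact>
    <name>terry</name>
    <phone cat="cell">555-0100</phone>
    <phone cat="work">555-0199</phone>
  </contact>
</myContacts>"""


def apply_delete(cached_reply_xml, select):
    """Remove the nodes named by a delete request's select from a cached reply."""
    root = ET.fromstring(cached_reply_xml)
    root_name, _, rest = select.partition("/")
    if root.tag != root_name or not rest:
        return cached_reply_xml  # the delete does not address this document
    parent_path, _, leaf = rest.rpartition("/")
    parents = root.findall(parent_path) if parent_path else [root]
    for parent in parents:
        for victim in parent.findall(leaf):
            parent.remove(victim)
    return ET.tostring(root, encoding="unicode")


updated = apply_delete(
    CACHED_REPLY, "myContacts/contact[name='terry']/phone[@cat='cell']")
print(updated)
```

After the transform, the cached reply still answers the original query but no longer lists the cell phone number, so the entry stays usable while disconnected.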
Picking Level of Consistency
• user-freedom in choosing consistency guarantees
  – multiple consistency transforms
• strong consistency
  – less availability
  – better user experience
• weak consistency
  – user experience could deteriorate
    • operations reportedly successful could fail!
    • optional cache header
  – better availability
More Transforms
• response transform
  – response from the cache may have to be changed before returning to the client
  – adding time-stamp, unique identifiers etc.
• default response transform
  – generates a default response for a request
  – default responses are returned when disconnected but request is queued for play-back
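The interplay between default responses and the write back queue can be sketched as follows. This is an assumption-laden illustration, not the prototype: the class and method names are hypothetical, requests are plain strings, and `send` and `make_default` stand in for the network call and the default response transform.

```python
from collections import deque


class WriteBackQueue:
    """Holds side-effecting requests while the service is unreachable."""

    def __init__(self, send, make_default):
        self._send = send                  # callable: request -> real response
        self._make_default = make_default  # callable: request -> default response
        self._pending = deque()

    def submit(self, request, connected):
        if connected:
            return self._send(request)
        self._pending.append(request)
        return self._make_default(request)

    def play_back(self):
        """Replay queued requests in arrival order; return the real responses."""
        responses = []
        while self._pending:
            responses.append(self._send(self._pending.popleft()))
        return responses


sent = []


def fake_send(request):
    sent.append(request)
    return f"real:{request}"


wbq = WriteBackQueue(fake_send, lambda request: f"default:{request}")
first = wbq.submit("insert A", connected=False)
second = wbq.submit("insert B", connected=False)
replayed = wbq.play_back()
print(first, replayed)
```

The real responses obtained at play-back may differ from the default ones already shown to the client, which is the "operations reportedly successful could fail" risk noted earlier.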
Optional Cache Header
• cache provides information to the client using cache header
  – response from cache or server
  – age of cached response
  – request will be played back in the future
• no changes to the definition of WSDL
  – would not affect existing clients in any way
• cache aware clients can provide additional information to the user
example: default response and cache header
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:hs="http://schemas.microsoft.com/hs/2001/10/core">
  <s:Header>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#response</action>
      <rev />
      <from>http://terry.microsoft.com</from>
      <relatesTo>d978b559-aceb-4e9e-9747-b8a306234bc8</relatesTo>
    </path>
    <response xmlns="http://schemas.microsoft.com/hs/2001/10/core" />
    <cacheHeader defaultResponse="true" toPlayback="true"
        xmlns="http://localhost/wsdlannotation" />
  </s:Header>
  <s:Body>
    <hs:insertResponse status="success" selectedNodeCount="1" newChangeNumber="0" />
  </s:Body>
</s:Envelope>
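A cache-aware client could inspect the optional header along these lines. This is a minimal sketch using Python's standard library on a trimmed copy of the response above; `read_cache_header` is a hypothetical helper, and treating the attribute values as the strings "true"/"false" is an assumption based on the example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
CACHE_NS = "http://localhost/wsdlannotation"

# Trimmed copy of the default response above (illustrative only).
RESPONSE = f"""<s:Envelope xmlns:s="{SOAP_NS}">
  <s:Header>
    <cacheHeader xmlns="{CACHE_NS}" defaultResponse="true" toPlayback="true"/>
  </s:Header>
  <s:Body/>
</s:Envelope>"""


def read_cache_header(xml_text):
    """Return the cacheHeader attributes as booleans, or None if absent."""
    root = ET.fromstring(xml_text)
    header = root.find(f"{{{SOAP_NS}}}Header/{{{CACHE_NS}}}cacheHeader")
    if header is None:
        return None
    return {name: value == "true" for name, value in header.attrib.items()}


info = read_cache_header(RESPONSE)
if info and info.get("toPlayback"):
    print("shown as pending: the request will be played back later")
```

Clients that do not know about the header simply ignore the extra block, which is why existing clients are unaffected.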
Conclusion
• built a prototype web services cache
• experimented with HailStorm web services and clients
• annotated HailStorm WSDL files
• the prototype demonstrates custom cache managers in action for HailStorm
• couldn't give a demo
Work for the Future
• WSDL annotations for more web services
  – hard to find interesting web services with WSDL descriptions yet!
• hoarding to enhance availability
  – specify user controlled hoard queries
  – hoard transform to obtain response from cached hoard requests
• incorporate security constraints
• tune cache performance