1
Please switch to my laptop!!!!!!!!!!!!!!!!!
Don Box, Co-founder, DevelopMentor
http://www.develop.com/dbox
1-201
2
why xml?
• XML adds type and structure to information
– Information can be stored anywhere on the Internet
– Data from multiple sources can be aggregated into a single unit of information
– Each piece of information has XML-specific structure
– Each piece of information has application-specific type
3
what is xml?
[Diagram: layers of representation, from abstract (top) to concrete (bottom)]
Classes and Objects: Application-Specific
Types and Instances: XML Schemas
Structural Items: XML Information Set (Infoset)
Elements and Attributes: XML 1.0 + Namespaces
Entities and Documents: XML 1.0
Files and Packets: OS/Protocol Specific
Sectors and Bitstreams: Hardware Specific
4
what is xml?
• XML is a set of specifications from the World Wide Web Consortium (W3C)
– Publicly, freely available specs at http://www.w3.org/TR
– Universal industry support
– Anyone can contribute, comment and implement
[Diagram: W3C document stages, most mature first]
Recommendation: Final version
Proposed Recommendation: last chance to complain
Candidate Recommendation: Feature complete, time to implement
Working Draft in Last Call: Are we done yet?
Working Draft in Development: Snapshot of working group activity
Requirements: Seed for working group activity
Note: Here’s what I think…
5
what is xml?
Uniform Resource Identifiers
Extensible Markup Language 1.0
Namespaces in XML
XML Information Set (Infoset)
XML Schemas
XPointer
XPath
XBase
SOAP
XSLT
DOM L2
XML Query
6
schemas
• The Infoset defines the data types that make up an XML document
– Infoset types largely structural
– Infoset type system non-extensible
• XML Schemas are used to define an application-specific type system in terms of the XML Infoset
7
xml schema languages
• XML Schemas are not quite a W3C candidate recommendation
– Precursors include XML Data Reduced (XDR) and Document Type Definitions (DTD)
– Concepts similar in all schema languages
– Vendor support for XML Schemas growing at steady pace
8
xml schema concepts
• Schemas describe a set of types and the context for their use
– Type definitions define new application-domain types
– Element/attribute declarations bind types to a name in some context
– Local declarations apply only within the complex type in which they appear
– Global declarations apply when no local declaration is in effect
9
example
<schema xmlns='http://www.w3.org/1999/XMLSchema'
        targetNamespace='urn:bob:bob-types'
        xmlns:target='urn:bob:bob-types'>
  <complexType name='Robert_t' content='elementOnly'>
    <element name='count' type='double' />
    <element name='bobbyOK' type='boolean' />
  </complexType>
  <element name='bob' type='target:Robert_t' />
</schema>

<b:bob xmlns:b='urn:bob:bob-types'>
  <count>23.1</count>
  <bobbyOK>false</bobbyOK>
</b:bob>
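As a rough illustration of what the schema buys a consumer, the sketch below reads the instance document above with Python's standard xml.etree.ElementTree and converts each child using the type the schema declares for it. The FIELD_TYPES table and the parse_bob helper are hypothetical names written by hand for this example; a schema-aware processor would derive the mapping from the schema itself.

```python
import xml.etree.ElementTree as ET

INSTANCE = """<b:bob xmlns:b='urn:bob:bob-types'>
  <count>23.1</count>
  <bobbyOK>false</bobbyOK>
</b:bob>"""

# Hand-written stand-in for the Robert_t type definition (assumption:
# a real processor would read these bindings from the schema).
FIELD_TYPES = {
    "count": float,                                    # type='double'
    "bobbyOK": lambda s: s.strip() in ("true", "1"),   # type='boolean'
}

def parse_bob(text):
    """Parse a <bob> instance into a dict of typed Python values."""
    root = ET.fromstring(text)
    if root.tag != "{urn:bob:bob-types}bob":
        raise ValueError("unexpected root element: " + root.tag)
    # The local elements in the instance are unqualified, so child.tag
    # is the bare element name.
    return {child.tag: FIELD_TYPES[child.tag](child.text) for child in root}

bob = parse_bob(INSTANCE)
print(bob)
```

The point is only that the element names carry structure while the schema supplies the types; without the schema, 23.1 and false are just strings.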
10
type substitution
• XML Schemas support OO-style inheritance
– Derived type appends to content model of base type
– Derived type substitutable for base type
– Indicate substitution using xsi:type attribute
11
type substitution
<schema xmlns='http://www.w3.org/1999/XMLSchema'
        targetNamespace='[boburi]'
        xmlns:t='[boburi]'>
  <complexType name='Person' content='elementOnly'>
    <element name='name' type='string' />
  </complexType>
  <complexType name='Student' base='t:Person' derivedBy='extension'>
    <element name='grade' type='string' />
  </complexType>
  <element name='bob' type='t:Person' />
</schema>

<b:bob xmlns:b='[boburi]'
       xsi:type='b:Student'
       xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'>
  <name>Don</name>
  <grade>25</grade>
</b:bob>
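A receiver has to turn the xsi:type value, which is a prefixed name, into a namespace URI plus local name before it can pick the derived type. The sketch below shows one way to do that with the standard library, assuming the placeholder URI urn:example:boburi in place of [boburi]; runtime_type is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema-instance"

# 'urn:example:boburi' stands in for the [boburi] placeholder (assumption).
INSTANCE = b"""<b:bob xmlns:b='urn:example:boburi'
       xsi:type='b:Student'
       xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'>
  <name>Don</name>
  <grade>25</grade>
</b:bob>"""

def runtime_type(xml_bytes):
    """Return (namespace URI, local name) named by the root's xsi:type,
    or None when the attribute is absent."""
    prefixes = {}
    root = None
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            # Namespace declarations are reported before the element's start.
            prefix, uri = payload
            prefixes[prefix] = uri
        else:
            root = payload
            break
    qname = root.get("{%s}type" % XSI)
    if qname is None:
        return None
    prefix, _, local = qname.rpartition(":")
    return prefixes.get(prefix), local

print(runtime_type(INSTANCE))
```

Only the prefixes declared on the root element are tracked here, which is enough for the example but not for documents that redeclare prefixes deeper in the tree.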
12
xml messaging and soap
• SOAP is an XML messaging specification
– Type mapping based on XML Schemas
– Extensibility model inspired by HTTP-EF
– RPC model inspired by CORBA IIOP
– Submitted as a W3C note by MS, DM, IBM, Iona, Compaq, Ariba, CommerceOne, et al
13
why not DCOM or IIOP?
• DCOM is very connection-oriented
– Many packets exchanged to set up/maintain sessions
• DCOM is not available on all platforms
– MacOS, NT3.51, Vanilla Win95, WinCE2 – Never
– UNIX, MVS, VMS – $$$ for port of NT codebase
• IIOP typically requires a $$$ ORB
• DCOM/IIOP are sophisticated protocols that require runtime support to work properly
– Makes porting to “other” platforms hard
• DCOM/IIOP security requires administered environment to work reliably
• DCOM/IIOP typically not usable over firewalls
14
why HTTP?
• HTTP (hypertext transfer protocol) has become the de facto protocol of the Internet
• HTTP is available on all platforms – Period!
• HTTP is a simple protocol that requires little runtime support to work properly
• HTTP is barely connection-oriented
– Few/no packets exchanged to set up/maintain sessions
• HTTP security is simple yet effective
• HTTP typically the only thing usable over firewalls
15
http basics
• HTTP is a text-based request/response protocol
• First line of request contains 3 elements
– Verb: POST/GET/HEAD
– URI: /default.htm
– Protocol Version: HTTP/1.0 HTTP/1.1
• First line of response contains 3 elements
– Protocol Version: HTTP/1.0 HTTP/1.1
– Status Code: 200, 401
– Status Phrase: OK, Unauthorized
• Subsequent lines contain arbitrary headers
• “Content” follows blank header line
– Technically only for responses and POST requests
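The first-line layout described above can be sketched as two small parsers. The function names are hypothetical. On the wire the status line begins with the protocol version, so the response parser splits off three fields as well.

```python
def parse_request_line(line):
    """Split 'VERB URI VERSION' into its three parts."""
    verb, uri, version = line.split(" ", 2)
    return {"verb": verb, "uri": uri, "version": version}

def parse_status_line(line):
    """Split 'VERSION CODE PHRASE'; the phrase may itself contain spaces."""
    version, code, phrase = line.split(" ", 2)
    return {"version": version, "code": int(code), "phrase": phrase}

print(parse_request_line("GET /default.htm HTTP/1.1"))
print(parse_status_line("HTTP/1.1 401 Unauthorized"))
```

Splitting at most twice keeps multi-word status phrases such as "Not Found" intact.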
16
http basics
HTTP Request
GET /bar/foo.txt HTTP/1.1

or

POST /bar/foo.cgi HTTP/1.1
Content-Type: text/plain
Content-Length: 14

Goodbye, World

HTTP Response
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 12

Hello, World
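Content-Length counts the bytes of the body, which is easy to get wrong by hand. A minimal sketch of assembling such a message, with the length computed rather than typed in (build_message is a hypothetical helper, and only the two headers shown on the slide are emitted):

```python
def build_message(start_line, body, content_type="text/plain"):
    """Assemble an HTTP message: start line, headers, blank line, body."""
    payload = body.encode("utf-8")
    head = [
        start_line,
        "Content-Type: " + content_type,
        "Content-Length: %d" % len(payload),  # byte count, not character count
        "",   # blank line that ends the headers
        "",
    ]
    return "\r\n".join(head).encode("ascii") + payload

request = build_message("POST /bar/foo.cgi HTTP/1.1", "Goodbye, World")
response = build_message("HTTP/1.1 200 OK", "Hello, World")
print(request.decode("utf-8"))
```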
17
why XML?
• Text-based (human authorable and readable)
• Well-formed XML is as simple as it gets
– No overlapped elements
– Attributes must use quotes (dir="in")
– "<", ">", "&" must be escaped in strings (or use CDATA)
• XML widely adopted across platforms/vendors
• XML provides a great extensibility solution using namespaces and URIs
• W3C doesn’t mandate an API but does recommend one (the DOM)
– Others in use include SAX and strcat
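The well-formedness rules above are easy to check with any XML parser. The sketch below uses Python's standard library; is_well_formed is a hypothetical helper, while escape and quoteattr are the standard xml.sax.saxutils functions for the "strcat" style of producing XML safely.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested = is_well_formed("<a><b></b></a>")        # properly nested
overlapped = is_well_formed("<a><b></a></b>")    # overlapped elements
unquoted = is_well_formed("<p dir=in></p>")      # attribute without quotes

# Building XML by string concatenation: escape text, quote attributes.
fragment = "<note dir=%s>%s</note>" % (quoteattr("in"), escape("1 < 2 & 3 > 2"))
print(nested, overlapped, unquoted)
print(fragment)
```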
18
enter soap
• SOAP simply codifies existing practice of using XML and HTTP together
• SOAP is a minimal protocol for invoking methods on servers/services/components/objects
• Functionally, SOAP is closer to RPC/IIOP than to DCOM/RMI
– Truer to spirit of HTTP
• SOAP was designed to make porting to platforms and technologies simple
– 1st soap call should take less than an hour!!
19
soap philosophy
• “First invent no new technology”
• SOAP simply codifies existing practice of using HTTP+XML as an application protocol
• SOAP does not mandate an API or runtime
• SOAP does not mandate the use of an ORB
• SOAP does not mandate the use of a traditional web server (e.g., Apache, IIS)
• SOAP does not mandate a programming model
– Although several are implied
• SOAP much closer to IIOP/GIOP than DCOM
20
the soap tightrope
• SOAP tries to sit between a variety of worlds
• Balance between XML and object worlds
– Is arbitrary XML legal SOAP?
• Balance between loosely coupled and tightly coupled communications models
– What assumptions can we make about the other end of the wire?
• Balance between loosely coupled and tightly coupled type systems
– C++ programmer vs. Perl scripter
• No imposition of runtime (ORB == httpd == ???)
21
what is soap?
• SOAP is (at least) two things
– An XML-based serialization format
– An application of that format to HTTP
• Serialization format based on element-normal-form encoding of typed instances
• When applied to HTTP, request/response pair called a SOAP method
– Response payload is optional
– Faults communicated via HTTP faults or SOAP:Fault
22
the soap “onion”
SOAP over HTTP Mapping
Serialized Instance As Method Request
SOAP:Envelope
Element Normal Form (Section 5)
XML Schema Definition Language (opt)
XML 1.0 + Namespaces
23
three views of a soap call
• SOAP can be viewed as another ORPC protocol
– Request contains in and inout parameters
– Response contains inout and out parameters
• SOAP can be viewed as a “messaging” protocol
– Request contains a single serialized request object
– Response contains a single serialized response object
• SOAP ~= “XSLT with a longer wire”
– Request contains an XML document
– Server returns a transformed version
• None of these views is mandated by the protocol
24
three views of a soap call
obj.doit(arg1, arg2, arg3);

[Diagram: the call stack for the call above (return addr, this/obj, and the three arguments), shown next to the equivalent forms below]

class doit { T arg1; T arg2; T arg3; }

<type name='doit'>
  <element name='arg1' type='T' />
  <element name='arg2' type='T' />
  <element name='arg3' type='T' />
</type>

<soap:Envelope>
  <soap:Body>
    <doit xmlns='itfuri'>
      <arg1>val</arg1>
      <arg2>val</arg2>
      <arg3>val</arg3>
    </doit>
  </soap:Body>
</soap:Envelope>
25
soap in a nutshell
• HTTP payload consists of a soap:Envelope XML element
– Well-known SOAP NSURI recommended but optional
– Consists of optional Header + mandatory Body elems
• soap:Body element contains a single serialized instance of a named type
– Must be first child element of soap:Body
– This is the SOAP payload
• soap:Header element is a collection of serialized instances of named types
– Mandatory/optional extensions
26
soap in a nutshell
POST /path/foo.pl HTTP/1.1
Content-Type: text/xml
SOAPActor: interfaceURI#Add
Content-Length: nnnn

<soap:Envelope xmlns:soap='uri for soap'>
  <soap:Body>
    <Add xmlns='interfaceURI'>
      <arg1>24</arg1>
      <arg2>53.2</arg2>
    </Add>
  </soap:Body>
</soap:Envelope>

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: nnnn

<soap:Envelope xmlns:soap='uri for soap'>
  <soap:Body>
    <AddResponse xmlns='interfaceURI'>
      <sum>77.2</sum>
    </AddResponse>
  </soap:Body>
</soap:Envelope>
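To make the exchange above concrete, here is a minimal sketch that builds the Add request, handles it, and reads the AddResponse, all in one process and without the HTTP layer. The two namespace URIs are stand-ins for 'uri for soap' and 'interfaceURI' (assumptions), and build_envelope, read_envelope and handle are hypothetical helper names, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "urn:example:soap-envelope"   # stand-in for 'uri for soap' (assumption)
IFACE_NS = "urn:example:calculator"     # stand-in for 'interfaceURI' (assumption)

def q(ns, local):
    """ElementTree's {namespace}local spelling of a qualified name."""
    return "{%s}%s" % (ns, local)

def build_envelope(element_name, fields):
    """Wrap one element, with one child per field, in Envelope/Body."""
    envelope = ET.Element(q(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, q(SOAP_NS, "Body"))
    payload = ET.SubElement(body, q(IFACE_NS, element_name))
    for name, value in fields.items():
        ET.SubElement(payload, q(IFACE_NS, name)).text = str(value)
    return ET.tostring(envelope)

def read_envelope(data):
    """Return (local name, fields) for the first child of soap:Body."""
    body = ET.fromstring(data).find(q(SOAP_NS, "Body"))
    payload = body[0]
    name = payload.tag.rpartition("}")[2]
    fields = {child.tag.rpartition("}")[2]: child.text for child in payload}
    return name, fields

def handle(request):
    """Server side: decode the Add call and encode the AddResponse."""
    name, fields = read_envelope(request)
    if name != "Add":
        raise ValueError("unknown method: " + name)
    total = float(fields["arg1"]) + float(fields["arg2"])
    return build_envelope("AddResponse", {"sum": total})

request = build_envelope("Add", {"arg1": 24, "arg2": 53.2})
response = handle(request)
print(read_envelope(response))
```

In a real deployment the request bytes would travel as the body of the HTTP POST and the response bytes as the body of the reply.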
27
section 5
• SOAP Section 5 describes the default encoding rules for instances of types
• Section 5 based on element-normal-form encoding
– One element per field, element name is field name
• Arrays are encoded one child element per array element
• Aliased references (e.g., [ptr]) are serialized using href/id attributes
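The first two rules can be sketched as a short recursive encoder: a dict becomes one child element per field, a list becomes one child element per entry, and anything else becomes text. This is only an illustration of element-normal-form, not the Section 5 rules in full; to_enf is a hypothetical name, and the child element name "item" used for array entries is an assumption.

```python
import xml.etree.ElementTree as ET

def to_enf(name, value):
    """Encode a Python value in element-normal-form under the given name."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        # One child element per field; the element name is the field name.
        for field, field_value in value.items():
            elem.append(to_enf(field, field_value))
    elif isinstance(value, (list, tuple)):
        # One child element per array entry ("item" is an assumed name).
        for entry in value:
            elem.append(to_enf("item", entry))
    else:
        elem.text = str(value)
    return elem

call = {"origin": {"x": 23, "y": 34}, "corner": {"x": 23, "y": 34}}
encoded = ET.tostring(to_enf("calculateArea", call), encoding="unicode")
print(encoded)
```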
28
section 5 example
struct Point { double x; double y; };

<Point xmlns='interfaceURI'>
  <x>23</x><y>34</y>
</Point>

struct calculateArea { Point origin; Point corner; };

struct calculateArea { [ref] Point *origin; [unique] Point *corner; };

<calculateArea xmlns='interfaceURI'>
  <origin><x>23</x><y>34</y></origin>
  <corner><x>23</x><y>34</y></corner>
</calculateArea>
29
the soap type system
• The SOAP type system is compatible with and leverages XML Schema Definition language
• SOAP types can be described using XSD
• SOAP uses XSD conventions for associating instance with type
<foo xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'
     xsi:type='timeInstant'>1999-11-12T09:43</foo>
• Arrays and typed references are given special treatment beyond XSD
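A receiver can use the xsi:type attribute to choose a decoder for the element's text. The sketch below handles a handful of simple types; the DECODERS table and decode function are hypothetical, and the timeInstant decoder only accepts the exact layout shown in the example above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

XSI_TYPE = "{http://www.w3.org/1999/XMLSchema-instance}type"

# Hand-picked subset of simple types (assumption: a real implementation
# would cover the full set and resolve the type's namespace prefix).
DECODERS = {
    "string": str,
    "double": float,
    "boolean": lambda s: s in ("true", "1"),
    "timeInstant": lambda s: datetime.strptime(s, "%Y-%m-%dT%H:%M"),
}

def decode(elem):
    """Decode an element's text according to its xsi:type (default: string)."""
    type_name = elem.get(XSI_TYPE, "string").rpartition(":")[2]
    return DECODERS[type_name](elem.text)

foo = ET.fromstring(
    "<foo xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance' "
    "xsi:type='timeInstant'>1999-11-12T09:43</foo>"
)
when = decode(foo)
print(when)
```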
30
xsd example
<type name='Point'>
  <element name='x' type='double' />
  <element name='y' type='double' />
</type>
<element name='Point' type='Point' />

<Point xmlns='interfaceURI'>
  <x>23</x><y>34</y>
</Point>

<type name='calculateArea'>
  <element name='origin' type='Point' />
  <element name='corner' type='Point' nullable='true' />
</type>
<element name='calculateArea' type='calculateArea' />

<calculateArea xmlns='interfaceURI'>
  <origin><x>23</x><y>34</y></origin>
  <corner><x>23</x><y>34</y></corner>
</calculateArea>
31
embedded vs. independent values
• SOAP can retain identity across marshals in a schema-invariant manner
foo.doit(obj1, obj1);
• Values that can be referenced from multiple locations are encoded as independent elements
– Must appear as child of soap:Header or soap:Body
– Must have unique soap:id attribute
– (Typically) encoded as element with QName of type
• Elements that refer to values that can be shared are encoded using multi-ref accessors
– Uses fragment identifier in soap:href attribute
32
embedded vs. independent values
Embedded values:

<soap:Envelope xmlns:soap='uri for soap'>
  <soap:Body>
    <calculateArea xmlns='interfaceURI'>
      <origin><x>23</x><y>34</y></origin>
      <corner><x>23</x><y>34</y></corner>
    </calculateArea>
  </soap:Body>
</soap:Envelope>

Independent value with multi-ref accessors:

<soap:Envelope xmlns:soap='uri for soap'>
  <soap:Body>
    <calculateArea xmlns='interfaceURI'>
      <origin soap:href='#id1' />
      <corner soap:href='#id1' />
    </calculateArea>
    <Point soap:id='id1' xmlns='someURI'>
      <x>23</x><y>34</y>
    </Point>
  </soap:Body>
</soap:Envelope>
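The receiving side of the multi-ref form can be sketched in two passes: decode every independent element once, keyed by its id, then point each href accessor at the already decoded value so that identity is preserved. The SOAP namespace URI below is a stand-in for 'uri for soap' (assumption), and resolve and local are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "urn:example:soap-envelope"   # stand-in for 'uri for soap' (assumption)

MESSAGE = """<soap:Envelope xmlns:soap='urn:example:soap-envelope'>
  <soap:Body>
    <calculateArea xmlns='interfaceURI'>
      <origin soap:href='#id1' />
      <corner soap:href='#id1' />
    </calculateArea>
    <Point soap:id='id1' xmlns='someURI'>
      <x>23</x><y>34</y>
    </Point>
  </soap:Body>
</soap:Envelope>"""

def local(tag):
    """Strip ElementTree's {namespace} prefix from a tag."""
    return tag.rpartition("}")[2]

def decode_point(elem):
    return {local(child.tag): float(child.text) for child in elem}

def resolve(message):
    """Return (method name, args) with shared values decoded only once."""
    body = ET.fromstring(message).find("{%s}Body" % SOAP_NS)
    # Pass 1: decode each independent element, keyed by its soap:id.
    independent = {}
    for elem in body:
        elem_id = elem.get("{%s}id" % SOAP_NS)
        if elem_id is not None:
            independent[elem_id] = decode_point(elem)
    # Pass 2: the first child of Body is the call; follow href accessors.
    call = body[0]
    args = {}
    for accessor in call:
        href = accessor.get("{%s}href" % SOAP_NS)
        if href is not None:
            args[local(accessor.tag)] = independent[href[1:]]  # drop the '#'
        else:
            args[local(accessor.tag)] = decode_point(accessor)
    return local(call.tag), args

method, args = resolve(MESSAGE)
print(method, args, args["origin"] is args["corner"])
```

With the embedded form the two points would decode to equal but distinct objects; with the multi-ref form both accessors yield the same object, which is what foo.doit(obj1, obj1) needs.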
33
other soap stuff
• MIME-type for SOAP is text/xml
• All SOAP requests must be tagged with well-known HTTP header
SOAPActor: interfaceURI#methodname
• HTTP faults simply use HTTP infrastructure
• SOAP/app faults use distinguished SOAP PDU
– Standard content model for all faults
– Extensible to support UDTs for exceptions
• Values in SOAP payloads can be tagged as using an alternative encoding scheme
<foo soap:encodingStyle='myuri'>Here is some random XML!!</foo>
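Because the header value has the form interfaceURI#methodname, a server can route a request before it parses any XML. A small sketch, in which the HANDLERS table, its single entry, and the function names are all hypothetical:

```python
def split_method_header(value):
    """Split 'interfaceURI#methodname' at the last '#'."""
    interface_uri, sep, method = value.rpartition("#")
    if not sep or not interface_uri or not method:
        raise ValueError("expected interfaceURI#methodname, got %r" % value)
    return interface_uri, method

# Hypothetical dispatch table keyed by (interface URI, method name).
HANDLERS = {
    ("urn:example:calculator", "Add"): lambda a, b: a + b,
}

def route(header_value):
    """Look up the handler named by the header value."""
    return HANDLERS[split_method_header(header_value)]

print(split_method_header("interfaceURI#Add"))
print(route("urn:example:calculator#Add")(24, 53))
```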
34
soap vs. dcom vs. corba
• Functionality-wise, SOAP is a superset of CORBA’s GIOP/IIOP
– SOAP == GIOP + multi-interface objects
• Functionality-wise, SOAP is a subset of DCOM
– SOAP == DCOM – pinging/garbage collection
• SOAP takes a per-method performance hit for XML encoding scheme
– Using event-driven marshaler scheme helps now
– Using binary encoding (e.g., WBXML) helps later
• SOAP will dominate across organization boundaries first!
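An event-driven marshaler avoids building a tree for every call: it pulls the method name and arguments straight out of the parser's callbacks. The sketch below does this with Python's standard xml.sax in namespace mode; the SOAP namespace URI is a stand-in (assumption), and CallHandler and unmarshal are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

SOAP_NS = "urn:example:soap-envelope"   # stand-in for 'uri for soap' (assumption)

REQUEST = b"""<soap:Envelope xmlns:soap='urn:example:soap-envelope'>
  <soap:Body>
    <Add xmlns='interfaceURI'>
      <arg1>24</arg1>
      <arg2>53.2</arg2>
    </Add>
  </soap:Body>
</soap:Envelope>"""

class CallHandler(ContentHandler):
    """Collects the method name and string arguments as events arrive."""

    def __init__(self):
        super().__init__()
        self.path = []      # (namespace, local name) of each open element
        self.method = None
        self.args = {}
        self._text = []

    def startElementNS(self, name, qname, attrs):
        self.path.append(name)
        self._text = []
        # The first child of soap:Body names the method.
        in_body = len(self.path) >= 2 and self.path[1] == (SOAP_NS, "Body")
        if in_body and len(self.path) == 3 and self.method is None:
            self.method = name[1]

    def characters(self, content):
        self._text.append(content)   # text may arrive in several chunks

    def endElementNS(self, name, qname):
        # Children of the method element are its arguments.
        if len(self.path) == 4 and self.path[1] == (SOAP_NS, "Body"):
            self.args[name[1]] = "".join(self._text)
        self.path.pop()

def unmarshal(data):
    handler = CallHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.method, handler.args

print(unmarshal(REQUEST))
```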
35
soap and dcom requests compared
[Diagram: layout of a SOAP request next to a DCOM request]
SOAP request:
  POST /objectURI HTTP/1.1
  SOAPMethodName
  SOAP:Envelope
    SOAP:Header (Header1, Header1)
    SOAP:Body (CallElement)
DCOM request:
  version+flags+msgtype
  ndr_datarep
  frag_length/auth_length
  call_id
  alloc_hint
  p_cont_id op_num
  object uuid (COM Interface Pointer ID)
  ORPCTHIS (protocol extensions)
  Payload ([in] and [in, out] params)
Roles labeled in the diagram: Object Endpoint ID, Interface Identifier, Method Identifier, Extension Headers, Parameter Data
Note: DCE/DCOM Interface ID represented using a negotiated index (p_cont_id). Method ID represented using zero-based offset (op_num).
36
soap and dcom object references compared
[Diagram: a SOAP object reference (a URL) next to a DCOM object reference]
SOAP: http://209.110.197.2:80/endpointURI/MoreInfo
  IP Host Address: 209.110.197.2
  TCP Port No: 80
  Object Endpoint ID: /endpointURI/MoreInfo
  Interface Type: Interface URI passed explicitly with every call
DCOM:
  signature (MEOW)
  flags (0x1)
  iid (type of interface)
  std.flags
  std.cPublicRefs
  std.oxid (Logical Port Number)
  std.oid (Object ID)
  std.ipid (Interface Pointer ID)
  std.saResAddr (host names for OXID resolver + security)
37
soap implementation techniques
• SOAP can be buried in your ORB product
– Nouveau, Orbix 2000, Voyager, COM
• SOAP can be buried in your Web Server
– Apache, ASP/ISAPI, JSP/Servlets/WebSphere
• Can roll your own from components
– This is the approach used in DM reference implementation
– [De]Serializer most critical component
– Channel + Dispatcher usually pretty obvious
– Transparent proxy is point of diminishing return…
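The "roll your own" split into serializer, channel and dispatcher can be sketched in a few dozen lines. This is not the DM reference implementation, just an illustration of how the three pieces fit together; every class and function name is hypothetical, namespaces are omitted for brevity, and the channel is an in-process loopback standing in for HTTP.

```python
import xml.etree.ElementTree as ET

class Serializer:
    """[De]serializer: maps (name, fields) to and from an envelope."""

    def dumps(self, name, fields):
        envelope = ET.Element("Envelope")
        payload = ET.SubElement(ET.SubElement(envelope, "Body"), name)
        for key, value in fields.items():
            ET.SubElement(payload, key).text = str(value)
        return ET.tostring(envelope)

    def loads(self, data):
        payload = ET.fromstring(data).find("Body")[0]
        return payload.tag, {child.tag: child.text for child in payload}

class Dispatcher:
    """Server side: decodes a request, calls the handler, encodes the reply."""

    def __init__(self, serializer, handlers):
        self.serializer = serializer
        self.handlers = handlers

    def dispatch(self, request):
        name, fields = self.serializer.loads(request)
        result = self.handlers[name](**fields)
        return self.serializer.dumps(name + "Response", result)

class LoopbackChannel:
    """Client side: stands in for an HTTP POST to the server (assumption)."""

    def __init__(self, dispatcher):
        self.dispatcher = dispatcher

    def send(self, request):
        return self.dispatcher.dispatch(request)

def call(channel, serializer, name, **args):
    """Encode a call, send it over the channel, and decode the reply."""
    reply = channel.send(serializer.dumps(name, args))
    return serializer.loads(reply)[1]

serializer = Serializer()
handlers = {"Add": lambda arg1, arg2: {"sum": float(arg1) + float(arg2)}}
channel = LoopbackChannel(Dispatcher(serializer, handlers))
result = call(channel, serializer, "Add", arg1=2, arg2=3)
print(result)
```

Swapping LoopbackChannel for a class that performs an HTTP POST, or Dispatcher for a CGI or servlet entry point, leaves the serializer untouched, which is why it is the component worth the most care.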
38
soap implementation techniques
[Diagram: SOAP implementation components, keyed by implementation language]
Legend: C++, C++/Perl, Java, Perl
Channels: UrlConnectionChannel, libwwwChannel, WinInetChannel
Dispatchers: ServletDispatcher, ASPDispatcher, ISAPIDispatcher, CGIDispatcher, ApacheDispatcher
[De]serializers: Java, C++/COM, Perl
39
summary
• XML layers structure and type over information
• XML Schemas define application-specific types in terms of the XML Infoset
• SOAP applies schemas to messaging and RPC
• No API is mandated!