The Argument Against SOAP Encoding

 

Tim Ewald
Microsoft Corporation

October 2002

Applies to:
   Web Services Specifications (SOAP, WSDL)

Summary: This article explains why SOAP encoding, also known as "Section 5 encoding," is a shadow from SOAP's past that has no place in the future of Web services. (11 printed pages)

Contents

Introduction
The Evolution of SOAP and Web Services
The Heart of the Problem
A Concrete Example
Solving the Encoding Problem
The Future

Introduction

SOAP is a cornerstone of the basic Web services protocol stack. The SOAP specification formalizes the use of XML messages as a means to communicate. It defines an extensibility model, a way to represent protocol and application faults, rules for sending messages over HTTP, and guidelines for mapping RPC calls to SOAP messages. Having standard ways to do these things is very useful. Without them, every developer who wanted to send XML messages via HTTP would have to create an ad hoc solution to each of these problems, making interoperability quite difficult. While most of what the SOAP specification offers us is good, there is one thing that is not: SOAP encoding. SOAP encoding (sometimes called "Section 5 encoding," after the portion of the SOAP 1.1 specification where it is defined) is a shadow from SOAP's past that has no place in the future of Web services. This article explains why, starting with some history.

The Evolution of SOAP and Web Services

When the first SOAP specification was written, the concepts behind Web services were still in their infancy. People were planning to use SOAP as a way to better integrate distributed object technologies like DCOM, CORBA, and RMI with native Internet technologies such as XML and HTTP. The goal was to build plumbing that produced and consumed XML-based messages instead of the various binary message formats favored by each technology (NDR, CDR, and JRMP, respectively).

In order for the clients and servers in a distributed application to produce and consume messages, they need to know how those messages are supposed to look. Most distributed object systems rely on a combination of compiled proxy/stub/skeleton code and binary representations of metadata (such as COM Type Libraries, CORBA Interface Repositories, or Java .class files) to provide that information. SOAP didn't change this. The authors of the SOAP specification assumed that an application developer would ensure that clients and servers had whatever information they needed to process SOAP messages correctly.

However, the SOAP authors realized that if they were not going to define a common way to describe messages, they should at least provide some guidance for how to map common object-oriented programming constructs to XML. They couldn't use XML Schema (XSD) to solve this problem; it was still far from completion. So they defined a data model based on graphs of untyped structures. Then they wrote the SOAP encoding rules, which explain how to serialize an instance of the SOAP data model to a SOAP message. It was left to SOAP implementers to map their own technologies to the SOAP data model.

As SOAP has gained traction in the industry, a new requirement has emerged. Developers want to download descriptions of a SOAP server's message formats so they can build clients that speak to it. Since a server that wants to provide such a description can't make any assumptions about the technology being used to build a client, exposing message descriptions using an existing metadata format such as a Type Library won't work. The solution is to use a common metadata format that is as portable as SOAP itself, namely, WSDL. WSDL describes the behavior a Web service supports using portTypes. A portType is a collection of operations. Operations are defined in terms of messages. Messages are defined in terms of XML Schema. Today, SOAP and WSDL are inextricably linked in most people's minds; along with UDDI they define the basic Web services building blocks.

The Heart of the Problem

Realizing that XSD had an important role to play in describing SOAP messages, the authors of WSDL embraced it (though they left the door open to alternatives as well). Realizing that there were already toolkits that implemented the SOAP encoding scheme, the authors of the WSDL specification felt compelled to embrace it too. The solution they came up with was to define messages in terms of XML Schema constructs and then to allow a binding to apply SOAP encoding to them, if desired.

A binding defines the concrete details you need to know to invoke the operations defined by a portType. (This is not a new idea. For instance, COM classes often exposed their methods via a vtable binding and an IDispatch binding. Similarly, the methods of a CORBA class are typically available via static stubs or a dynamic invocation interface.) When you create a WSDL binding that maps a portType's operations to SOAP messages sent over HTTP, you have to specify whether the SOAP messages contain literal or encoded instances of the schema constructs the operations use. If you choose "literal," you are saying that the XML Schema constructs your WSDL definitions refer to are concrete specifications of what will appear in your SOAP message bodies. If you choose "encoded," you are saying that the XML Schema constructs your WSDL definitions refer to are abstract specifications of what will appear in your SOAP message bodies; these can be made concrete by applying the rules defined by SOAP encoding. (The WSDL specification allows other encoding schemes as well, but alternatives are rarely if ever used.)
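The literal-versus-encoded choice is visible in a WSDL document as the use attribute on each soap:body element. The following is a minimal sketch, not part of any toolkit, that reads that attribute with Python's standard xml.etree.ElementTree module. The function name body_uses and the trimmed SAMPLE document are illustrative assumptions; the two namespace URIs are the standard WSDL 1.1 ones.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 and WSDL SOAP-binding namespaces.
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical, trimmed-down WSDL containing only a binding.
SAMPLE = """
<wsdl:definitions
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <wsdl:binding name="GeometryBinding" type="tns:Geometry">
    <soap:binding style="rpc"
        transport="http://schemas.xmlsoap.org/soap/http" />
    <wsdl:operation name="Distance">
      <wsdl:input><soap:body use="encoded" /></wsdl:input>
      <wsdl:output><soap:body use="encoded" /></wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
</wsdl:definitions>
"""


def body_uses(wsdl_text):
    """Map (binding, operation, direction) to the soap:body 'use' value."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for binding in root.findall("wsdl:binding", NS):
        for op in binding.findall("wsdl:operation", NS):
            for direction in ("input", "output"):
                body = op.find(f"wsdl:{direction}/soap:body", NS)
                if body is not None:
                    key = (binding.get("name"), op.get("name"), direction)
                    result[key] = body.get("use")
    return result


for key, use in body_uses(SAMPLE).items():
    print(key, use)
```

A tool that sees "literal" can treat the referenced schema types as the exact shape of the message body; one that sees "encoded" has to apply an additional set of rules before it knows what the body will look like.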

This brings us to the heart of the problem. As I explained earlier, the SOAP encoding scheme serializes SOAP data models to XML. How can an encoding scheme for the SOAP data model be applied to abstract XML Schema definitions, when one represents information as a graph of untyped structures and the other represents information as a tree of typed elements? Unfortunately, neither the SOAP specification (which defines SOAP encoding) nor the WSDL specification (which applies it to XML Schema definitions) answers this question. In fact, there is no specification anywhere that describes what this means and how it works. And that is a problem.

A Concrete Example

Up to this point, my argument has been very theoretical, so let me provide an example to help make it more practical. Consider the following pseudo code for an operation called Distance that measures the distance between two points.

class Point
{
  public Point() {}
  public Point(int x, int y) { this.x = x; this.y = y; }
  public int x;
  public int y;
}

float Distance(Point p1, Point p2)
{
  … // apply the Pythagorean Theorem
}

Here is a WSDL document that describes a portType called Geometry that contains the Distance operation. It also defines a binding for the Geometry portType that uses SOAP encoding.

<wsdl:definitions
 xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
 xmlns:tns="https://www.gotdotnet.com/team/tewald/sample"
 targetNamespace=
           "https://www.gotdotnet.com/team/tewald/sample">

 <wsdl:types>
   <xsd:schema targetNamespace=
           "https://www.gotdotnet.com/team/tewald/sample">

     <!-- Point type used by Distance operation -->

     <xsd:complexType name="Point">
       <xsd:sequence>
         <xsd:element name="x" type="xsd:int" />
         <xsd:element name="y" type="xsd:int" />
       </xsd:sequence>
     </xsd:complexType>

   </xsd:schema>
 </wsdl:types>

 <!-- RPC style message definitions -->

 <wsdl:message name="DistanceInput">
   <wsdl:part name="p1" type="tns:Point" />
   <wsdl:part name="p2" type="tns:Point" />
 </wsdl:message>

 <wsdl:message name="DistanceOutput">
   <wsdl:part name="result" type="xsd:float" />
 </wsdl:message>

 <!-- Geometry portType -->

 <wsdl:portType name="Geometry">
   <wsdl:operation name="Distance">
     <wsdl:input message="tns:DistanceInput" />
     <wsdl:output message="tns:DistanceOutput" />
   </wsdl:operation>
 </wsdl:portType>

 <!-- Binding for Geometry portType that
      uses SOAP encoding -->

 <wsdl:binding name="GeometryBinding" type="tns:Geometry">
   <soap:binding style="rpc"
    transport="http://schemas.xmlsoap.org/soap/http" />
   <wsdl:operation name="Distance">
     <soap:operation soapAction="" style="rpc" />
     <wsdl:input message="tns:DistanceInput">
       <soap:body
        namespace="https://www.gotdotnet.com/team/tewald/sample"
        encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        use="encoded" />
     </wsdl:input>
     <wsdl:output message="tns:DistanceOutput">
       <soap:body
        namespace="https://www.gotdotnet.com/team/tewald/sample"
        encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        use="encoded" />
     </wsdl:output>
   </wsdl:operation>
 </wsdl:binding>

</wsdl:definitions>

Imagine you are going to implement a service that exposes this portType and binding. You want your implementation to check that the messages it receives from clients match the WSDL specified format. If they don't, you can discard them and return a fault, without having to do any other work. So, what constitutes a correct message?

Consider a client that passes two different Point instances as the arguments to the Distance operation, as in the following:

Point one = new Point(10, 20);
Point two = new Point(100, 200);
float f = proxy.Distance(one, two);

Here is the client's serialized request message.

<soap:Envelope xmlns:soap=
         "http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body soap:encodingStyle=
         "http://schemas.xmlsoap.org/soap/encoding/">
    <ns:Distance xmlns:ns=
         "https://www.gotdotnet.com/team/tewald/sample">
      <p1>
        <x>10</x>
        <y>20</y>
      </p1>
      <p2>
        <x>100</x>
        <y>200</y>
      </p2>
    </ns:Distance>
  </soap:Body>
</soap:Envelope>

At first glance, it seems pretty clear that each of the instances, p1 and p2 (named for the formal parameters to ns:Distance), matches the schema definition for the Point type in the WSDL document. Each has a sequence of two elements, x and y, the values of which are integers. But this conclusion is not well-founded. While the SOAP data model uses the XML Schema simple types (xsd:int for example) to describe individual values such as x and y, it does not use XML Schema complex types to describe structured data (which is why I said the SOAP data model is based on untyped structures). If the SOAP data model doesn't use XML Schema complex types and SOAP encoding is based on the SOAP data model, is it wise to conclude that p1 and p2 are SOAP encoded instances of the complex type Point?

You may think I'm splitting hairs – after all, p1 and p2 do look an awful lot like Point. So, now, consider a client that passes the same Point instance as both arguments when it invokes the Distance operation, as in:

Point one = new Point(10, 20);
float f = proxy.Distance(one, one);

Here is the client's serialized request message. In this case, the client uses the SOAP encoding scheme for "multi-reference accessors"; that is, the single Point instance (one) is serialized once and referenced by both parameters to Distance, p1 and p2.

<soap:Envelope xmlns:soap=
         "http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body soap:encodingStyle=
         "http://schemas.xmlsoap.org/soap/encoding/">
    <ns:Distance xmlns:ns=
         "https://www.gotdotnet.com/team/tewald/sample">
      <p1 href="#id1" />
      <p2 href="#id1" />
    </ns:Distance>
    <ns:Point id="id1"
     xmlns:ns="https://www.gotdotnet.com/team/tewald/sample">
      <x>10</x>
      <y>20</y>
    </ns:Point>
  </soap:Body>
</soap:Envelope>

In this case, it is clear that p1 and p2 do not match the schema definition for the Point type. In fact, they aren't even close. They each have an undefined href attribute and neither has the mandatory child elements x and y. The definition for the Distance operation's input message suggests that p1 and p2 are instances of Point, but clearly, in the presence of the SOAP encoding, there are some situations where that is not the case.
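To make the contrast concrete, here is a minimal sketch in Python of the kind of check a service might attempt. It is a hand-written approximation of the original Point definition (no attributes, exactly the children x and y holding integers), not a real XML Schema validator, and the names looks_like_point and check_params are hypothetical. The two messages are modeled on the requests shown above.

```python
import xml.etree.ElementTree as ET

ENVELOPE = """
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    {body}
  </soap:Body>
</soap:Envelope>
"""

NS = 'xmlns:ns="https://www.gotdotnet.com/team/tewald/sample"'

# Two distinct Point instances, serialized inline.
INLINE = ENVELOPE.format(body=f"""
    <ns:Distance {NS}>
      <p1><x>10</x><y>20</y></p1>
      <p2><x>100</x><y>200</y></p2>
    </ns:Distance>""")

# One Point instance passed twice, serialized as a multi-reference value.
MULTIREF = ENVELOPE.format(body=f"""
    <ns:Distance {NS}>
      <p1 href="#id1" />
      <p2 href="#id1" />
    </ns:Distance>
    <ns:Point id="id1" {NS}><x>10</x><y>20</y></ns:Point>""")


def looks_like_point(elem):
    """Approximate the original Point type: no attributes, children x then y, both integers."""
    if elem.attrib:
        return False
    children = list(elem)
    if [child.tag for child in children] != ["x", "y"]:
        return False
    try:
        for child in children:
            int(child.text)
    except (TypeError, ValueError):
        return False
    return True


def check_params(message):
    """Report, for each parameter, whether it has the shape of the original Point type."""
    root = ET.fromstring(message)
    return {name: looks_like_point(root.find(f".//{name}")) for name in ("p1", "p2")}


print(check_params(INLINE))
print(check_params(MULTIREF))
```

Under these assumptions the first message passes for both parameters and the second fails for both, even though each is a legitimate SOAP-encoded request for the same operation.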

So where does this leave us? Sometimes, the elements p1 and p2 look like serialized instances of Point, and sometimes they don't. In theory, the SOAP encoding scheme should tell us what p1 and p2 should look like, except that there is no defined way to apply the SOAP encoding scheme to types specified using XML Schema, such as Point. In the absence of such a definition, implementing a server that exposes the Geometry portType and binding and ensures all the messages it receives are correct is extremely difficult.

Solving the Encoding Problem

At this point, you may be thinking that this isn't really your problem because you don't implement your Web services from scratch. Instead, you use a Web service toolkit and count on it to handle these details for you. But that doesn't resolve the issue; it just shifts the problem from you to toolkit implementers, who have to try to figure out what to do. So far, they've done a reasonable job providing support for SOAP encoding (despite the problems I described in the previous section) but the cost has been high. Most toolkits use separate code paths for literal and encoded bindings; in essence, implementing two marshaling layers, one for each case. There is general consensus among developers who have spent time working on this problem that we need a better solution. Somehow, we have to reconcile SOAP encoding and XML Schema.

Luckily, there is a fairly straightforward solution. The SOAP encoding rules define a transformation of the SOAP data model to XML messages. It doesn't make sense to apply the SOAP encoding to definitions in XML Schema because there is no defined relationship between the two. But what if you wrote a schema that described the XML messages that are produced by SOAP encoding?

Here is a revised WSDL definition for the Geometry portType that uses this approach.

<wsdl:definitions
 xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
 xmlns:tns="https://www.gotdotnet.com/team/tewald/sample"
 targetNamespace=
           "https://www.gotdotnet.com/team/tewald/sample">

 <wsdl:types>
   <xsd:schema targetNamespace=
           "https://www.gotdotnet.com/team/tewald/sample">

     <!-- Point type used by Distance operation;
          note the use of optional attributes and
          optional element content -->

     <xsd:complexType name="Point">
       <xsd:sequence minOccurs="0">
         <xsd:element name="x" type="xsd:int" />
         <xsd:element name="y" type="xsd:int" />
       </xsd:sequence>
       <xsd:attribute name="id"
            type="xsd:ID" use="optional" />
       <xsd:attribute name="href"
            type="xsd:anyURI" use="optional" />
     </xsd:complexType>

   </xsd:schema>
 </wsdl:types>

 <!-- RPC style message definitions -->

 <wsdl:message name="DistanceInput">
   <wsdl:part name="p1" type="tns:Point" />
   <wsdl:part name="p2" type="tns:Point" />
 </wsdl:message>

 <wsdl:message name="DistanceOutput">
   <wsdl:part name="result" type="xsd:float" />
 </wsdl:message>

 <!-- Geometry portType -->

 <wsdl:portType name="Geometry">
   <wsdl:operation name="Distance">
     <wsdl:input message="tns:DistanceInput" />
     <wsdl:output message="tns:DistanceOutput" />
   </wsdl:operation>
 </wsdl:portType>

 <!-- Binding for Geometry portType that
      uses SOAP encoding -->

 <wsdl:binding name="GeometryBinding" type="tns:Geometry">
   <soap:binding style="rpc"
    transport="http://schemas.xmlsoap.org/soap/http" />
   <wsdl:operation name="Distance">
     <soap:operation soapAction="" style="rpc" />
     <wsdl:input message="tns:DistanceInput">
       <soap:body
        namespace="https://www.gotdotnet.com/team/tewald/sample"
        use="literal" />
     </wsdl:input>
     <wsdl:output message="tns:DistanceOutput">
       <soap:body
        namespace="https://www.gotdotnet.com/team/tewald/sample"
        use="literal" />
     </wsdl:output>
   </wsdl:operation>
 </wsdl:binding>

</wsdl:definitions>

The updated definition of the Point type includes declarations for two new attributes, id and href, which are optional, as is the type's element content. These changes to Point allow instances equivalent to those produced by SOAP encoding when multi-reference data is serialized. An instance of Point can include an id attribute and x and y element children, or an href attribute with no children. The former represents an instance of Point; the latter represents a reference to a Point. Because the XML Schema definition already does the equivalent of SOAP encoding, there is no need to apply the encoding rules to it. The binding for the Geometry portType has therefore been updated to use literal schema types (the encodingStyle attribute has been removed), which removes any ambiguity about whether the parameters to the Distance operation, p1 and p2, are instances of Point.
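The two intended forms can be told apart mechanically. The sketch below is a hand-written approximation of the revised Point type in Python, not a real XML Schema validator, and the function name classify_point and its return labels are hypothetical. Note one assumption it makes explicit: because the attributes and the element content are each independently optional, the schema as written also admits combinations the text does not assign a meaning to (for example, an element with neither children nor href), and the sketch labels those "unspecified".

```python
import xml.etree.ElementTree as ET


def classify_point(elem):
    """Classify an element against a hand-written model of the revised Point type."""
    # Only the id and href attributes are declared on the type.
    if set(elem.attrib) - {"id", "href"}:
        return "invalid"
    # The x, y sequence is optional as a whole: both children or neither.
    tags = [child.tag for child in elem]
    if tags not in ([], ["x", "y"]):
        return "invalid"
    if tags:
        for child in elem:
            try:
                int(child.text)
            except (TypeError, ValueError):
                return "invalid"
    has_href = "href" in elem.attrib
    if tags and not has_href:
        return "instance"      # x and y children, optionally an id
    if not tags and has_href:
        return "reference"     # href only, pointing at an instance
    return "unspecified"       # schema-valid shape with no stated meaning


samples = [
    '<p1><x>10</x><y>20</y></p1>',
    '<Point id="id1"><x>10</x><y>20</y></Point>',
    '<p1 href="#id1" />',
    '<p1 />',
    '<p1><x>10</x></p1>',
]
for text in samples:
    print(text, "->", classify_point(ET.fromstring(text)))
```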

Does this approach still use SOAP encoding? That depends on your perspective. It does essentially the same thing SOAP encoding does, but without trying to apply the SOAP encoding rules directly to a schema, an operation that no specification defines. The key benefit is that the XML Schema definition included in the WSDL accurately describes the messages the portType and binding require. That makes implementing a service that checks whether messages are correct before processing them significantly simpler.

There are a couple of steps we need to take to gain widespread support for this approach. First, we need to define standard, global attributes for representing references to nodes in a serialized graph. In my example, I defined the local id and href attributes as part of the Point type. The problem with that approach is that every type that might be used in a serialized graph would have to define equivalent attributes with exactly the same semantics. Worse, tools would have to assume that any type that had an id and href attribute intended them to be used for describing a graph. If we create some global attributes and define types that reference them, we won't have to define attributes per-type and toolkits will only have to recognize one pair of attributes. (The WS-I Basic Profile Working Group has done work on this already as part of its effort to help resolve Web service interoperability problems, though that work has not been published at the time of this writing.)

Once standard global attributes exist, toolkit implementers can adopt them. There are basically two changes they would have to make. First, their WSDL tools would have to be updated to produce and consume XML Schemas that include reference attributes for any type that might be used to represent a serialized graph node. Then they'd have to update their runtime plumbing to serialize and deserialize graphs to XML messages that use these attributes.
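As a rough illustration of the runtime side, the following Python sketch serializes the arguments of a Distance call using id and href attributes and reads them back. It is only a sketch under stated assumptions: the element names Body and Distance are unqualified stand-ins for the real SOAP elements, the local id and href attributes stand in for the standard global attributes that do not yet exist, and the function names serialize and deserialize are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def _write_fields(elem, point):
    ET.SubElement(elem, "x").text = str(point.x)
    ET.SubElement(elem, "y").text = str(point.y)


def _read_fields(elem):
    return Point(int(elem.find("x").text), int(elem.find("y").text))


def serialize(args):
    """Serialize named Point arguments; objects passed more than once become references."""
    body = ET.Element("Body")
    call = ET.SubElement(body, "Distance")
    counts = {}
    for obj in args.values():
        counts[id(obj)] = counts.get(id(obj), 0) + 1
    ids = {}
    for name, obj in args.items():
        param = ET.SubElement(call, name)
        if counts[id(obj)] == 1:
            _write_fields(param, obj)          # single use: write inline
            continue
        if id(obj) not in ids:                  # first sighting: emit the shared node
            ids[id(obj)] = f"id{len(ids) + 1}"
            shared = ET.SubElement(body, "Point", {"id": ids[id(obj)]})
            _write_fields(shared, obj)
        param.set("href", "#" + ids[id(obj)])
    return body


def deserialize(body):
    """Rebuild the arguments, preserving object identity for shared references."""
    by_id = {elem.get("id"): _read_fields(elem) for elem in body.findall("Point")}
    result = {}
    for param in body.find("Distance"):
        href = param.get("href")
        result[param.tag] = by_id[href[1:]] if href is not None else _read_fields(param)
    return result


one = Point(10, 20)
message = serialize({"p1": one, "p2": one})
print(ET.tostring(message, encoding="unicode"))
args = deserialize(message)
print(args["p1"] is args["p2"])
```

The point of the design is that the same pair of attributes drives both directions, so a toolkit needs only one code path for messages that a literal schema already describes.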

The Future

Many people, myself included, believe that a shift away from SOAP encoding is inevitable. The W3C XML Protocol Working Group's current draft of the SOAP 1.2 specification makes support for SOAP encoding optional (that is, a toolkit can claim SOAP 1.2 compliance without supporting SOAP encoding), the WS-I Basic Profile Working Group's current draft of its interoperability guidelines disallows the use of SOAP encoding with SOAP 1.1, and the W3C Web Service Description Working Group chose to drop support for encoding from their latest working draft of the WSDL 1.2 specification.

It will take some time for toolkits to reflect these changes. First we have to settle on a schema-friendly way to serialize graphs; then toolkits have to be updated. But it is worth the wait. In the end, the Web service stack will be significantly easier to implement and use.