DIME: Sending Binary Data with Your SOAP Messages

 

Matt Powell
Microsoft Corporation

January 22, 2002

Introduction

One of the key strengths of SOAP is the ability to encapsulate XML data within a SOAP message. This gives SOAP the flexibility to contain data from any XML schema, which is really quite empowering. Despite the strong typing of XML, however, there are times when data does not come to us as XML. The Direct Internet Message Encapsulation (DIME) specification defines a mechanism for packaging binary data with SOAP messages, so that when sending data not in XML format, applications need not be constrained by the SOAP specification. In this article, we will look at why DIME was created, what DIME looks like, and how DIME fits into XML Web Services. In short this is going to be (ahem) my two cents on DIME.

Why DIME?

There are a couple of questions we need to look at in order to understand why DIME was created in the first place and why it is quickly becoming a standard within the XML Web Services world. Certainly one of the key strengths of SOAP has always been that it was an XML message format and thus inherited all the advantages that XML provides. So why do we need to provide support for sending binary data with an XML message? Is not the world going toward XML schemas?

Why Not XML?

Although the flexibility and strongly typed schemas of XML are intriguing enough that a significant portion of the data industry is moving to an XML-centric world, we are far from living in an XML-only world. There are plenty of legacy systems not currently speaking XML, and they may never do so. This leaves a couple of options if we want to interoperate with legacy systems or the data of legacy systems.

First, we can create layered architectures that convert data from legacy formats to XML and back—but sometimes this may not make sense. For instance, if you had an electronic data interchange (EDI) system generating EDI documents for transfer between two businesses, there is a good chance that systems on both sides are already in place that only speak EDI. If you want to use the capabilities of XML Web Services to take advantage of your HTTP infrastructure for sending messages, you could convert your EDI documents to XML for transferring with SOAP. But now you need to not only convert the documents to XML for data transfer; you also need to reconvert the XML back into the legacy EDI format for the other side to process. Clearly there would be efficiencies gained by simply including the data in its legacy format, so that it wouldn't have to be serialized and deserialized for the simple act of transferring the message.

There are also situations where the act of serializing data into XML will cause other inefficiencies. In particular, there are cases where serializing data into XML is unwise because an efficient binary compression scheme is already required to ensure that the data is not too large. Take, for instance, image files, which can come in a number of different formats but tend to be transferred across the Web mostly in JPEG or GIF format. Both of these formats for holding image data are highly structured and could be converted into an XML schema. Nevertheless, images are large enough as it is, and the processing required to serialize the data to and from XML would add a horrendous slowdown to a mechanism that is well accepted and efficient.

Another situation where you would not want to mess with binary data associated with your SOAP request is if you have binary data that is digitally signed. Any modifications of the data will invalidate the signature, so it is important that it not be converted into XML for transfer with our SOAP request.

Whether we are looking at binary data for legacy systems, including data in already compressed data formats, or are dealing with binary data that needs to stay intact due to signature requirements, sometimes it just makes sense that binary data should accompany our SOAP request. Therefore a mechanism is required to allow for both a SOAP message and accompanying binary data to be sent in the same transmission unit. That mechanism must have a means for marking record boundaries, so that it is possible to tell where the different portions of the coalesced data begin and end. So let's look at the DIME syntax, and how it allows you to include binary data with your SOAP request.

The DIME Details

We will see a SOAP message encapsulated in a DIME package in a bit, but first we have to learn more about how DIME works. DIME is ultimately just a specification for including multiple binary records within a single package. The records could contain any kind of data, including image files, SOAP messages, or even MIME messages. There is no restriction on the size or format of the data. You do not even need to know the length of the total data you are sending when using DIME. We can get a good feel for the features of DIME by looking at the format for a DIME message, and seeing exactly what kind of information is contained in each of its fields.

Figure 1. DIME message organization

Figure 1 shows the record organization within a DIME message. A DIME message consists of one or more records with no restriction on the number of records in the entire message. Each record has headers associated with it (designated by the light green sections at the top), and data (designated by the dark green). Among other things, the record header includes various flags. These include a flag to indicate that a record is the first in the DIME message, and another flag to indicate that a record is the last in the DIME message.

Notice that the size of the data in each record can vary in length. The sequence of the data records is significant, and must be maintained over whatever channel is being used to transmit the DIME message. By using the begin and end message flags, DIME eliminates the need for an application to know the precise length of the entire DIME message before it starts to send it. When an application has completed transferring a DIME message, it simply sets the end message flag on the last data record.

Next we will look at the specifics of the data record format.

Figure 2. DIME record format

The data record format is shown in Figure 2 in two parts, again with the headers in light green and the data in dark green. The portion of the headers above the dotted line is a fixed length of 64 bits (the first two lines each represent 16 bits and the third line represents 32 bits). The first three bits shown in the first line are a bitmask that represents three different flags that describe the record. The first two bits are used to indicate the two flags that we saw in Figure 1. MB is the Message Begin Flag and ME is the Message End Flag. The third bit is the Chunked Flag (CF), which indicates that this record is part of a chunked data representation. We will talk about chunking data shortly.

The rest of the first 16 bits of the header is used to indicate the length of the ID field in the header. The ID field is a variable length field that provides a mechanism for identifying a particular record within a DIME message. For instance, a SOAP message in one data record of a DIME package may need to refer to an image file that is in a different data record of the DIME package. The SOAP message can refer to the image file by indicating the ID in the image file's data record. We will look at an example of this shortly.

The second 16 bits of the data record header describe the variable length Type field that follows the ID field. The Type field is used to associate the data in the data record with some kind of type specification. The three-bit Type Name Format field, which occupies the first three of these 16 bits, indicates what kind of mechanism is being used to describe the data type. For instance, we may want to specify a type as we do with the HTTP Content-Type header, with a string like "text/html". Another option would be to indicate the type by specifying a URI, as we do when we define a specific XML schema. The Type Name Format field allows us to use either of these mechanisms. A value of 0x01 indicates the "text/html" format in the Type field, while a value of 0x02 indicates the URI format. These are the only two formats defined at this time, in addition to a value of 0x00 used in the chunked data representation that we discuss below. The remaining 13 bits represented on the second line of Figure 2 indicate the length of the Type field.

It is worth noting that the ID, Type, and Data fields in a data record are each padded to the next 32-bit boundary. Therefore if the Type Length field indicates a size of 9 bytes, then the Type field will be padded with 3 more bytes so that the data portion of the data record immediately following the Type field will start on a 32-bit boundary. In the examples shown in this article, the padded bytes are not represented, for the sake of simplicity.
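The padding rule can be sketched in a few lines of Python. The helper names padded_length and pad are hypothetical, and the use of zero bytes as filler is an assumption; the text above only fixes the 32-bit alignment.

```python
def padded_length(n: int) -> int:
    """Round a field length up to the next multiple of 4 bytes (32 bits)."""
    return (n + 3) & ~3


def pad(field: bytes) -> bytes:
    """Append filler bytes (assumed here to be zeros) up to a 32-bit boundary."""
    return field + b"\x00" * (padded_length(len(field)) - len(field))


nine_byte_type = b"text/html"          # 9 bytes, as in the example above
padded_type = pad(nine_byte_type)      # 3 filler bytes are appended
```

A field whose length is already a multiple of 4, including an empty field, is left unchanged.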

The third line in Figure 2 is simply the length in bytes of the data in the data record. The data length is a 32-bit field, so it specifies a maximum data size of 4 gigabytes (GB). This is a potentially limiting restriction on the size of the data that may need to be packed into a DIME data record. Fortunately, DIME has an excellent solution for avoiding the data size limitation—chunking.
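To make the 64-bit fixed header concrete, here is a minimal Python sketch that packs and unpacks it according to the layout described in this article. The function names are hypothetical. The byte order (big-endian) and the choice of MB as the most significant bit of the first 16-bit word are assumptions made for illustration, not statements taken from the specification.

```python
import struct

# Assumed wire layout: two 16-bit words and one 32-bit word, big-endian.
HEADER = struct.Struct(">HHI")


def pack_header(mb, me, cf, id_len, tnf, type_len, data_len):
    """Build the 8-byte fixed header from its individual fields."""
    if not 0 <= id_len < (1 << 13) or not 0 <= type_len < (1 << 13):
        raise ValueError("ID and Type lengths must fit in 13 bits")
    if not 0 <= tnf < 8:
        raise ValueError("Type Name Format must fit in 3 bits")
    word1 = (bool(mb) << 15) | (bool(me) << 14) | (bool(cf) << 13) | id_len
    word2 = (tnf << 13) | type_len
    return HEADER.pack(word1, word2, data_len)


def unpack_header(raw):
    """Split the first 8 bytes of a record into a dictionary of fields."""
    word1, word2, data_len = HEADER.unpack(raw[:8])
    return {
        "mb": bool(word1 & 0x8000),     # Message Begin flag
        "me": bool(word1 & 0x4000),     # Message End flag
        "cf": bool(word1 & 0x2000),     # Chunked Flag
        "id_len": word1 & 0x1FFF,       # 13-bit ID length
        "tnf": word2 >> 13,             # 3-bit Type Name Format
        "type_len": word2 & 0x1FFF,     # 13-bit Type length
        "data_len": data_len,           # 32-bit data length
    }


# Message Begin set, no ID, URI-format type of 41 bytes, 182 bytes of data.
header = pack_header(True, False, False, 0, 2, 41, 182)
fields = unpack_header(header)
```

Because the data length occupies a full 32-bit word, the largest value it can hold is 2**32 - 1, which is where the 4 GB limit comes from.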

Chunked Data

The ability of DIME to allow for chunked data records serves several purposes. The first, as we mentioned, is that chunking allows you to send data larger than the 4 GB limit imposed by the 32-bit data-length field. The second is that when an application needs to send large amounts of data, it becomes harder to allocate a single buffer to hold the data as the size of the data increases. For instance, if you are trying to transfer a 3 GB file, chances are pretty good that you will be unable to allocate a single buffer of memory to hold the entire file. By chunking the data, you can read 100 kilobytes of data from the file at a time, and send it along in your data record.

Finally, there are many cases where you just do not know how much data is being generated for the message. For instance, you may be sending the results of a database query for which you have no idea how many records will be returned. Chunking allows you to send the data in your DIME package as you get it. When you have sent all your data, you simply turn off the Chunked Flag in the final data record to indicate this is the last part of the chunked data. The first data record in a chunked transfer can specify an ID and Type, but all remaining data records in the chunked transfer have zero-length ID and Type fields (since the ID and Type for all the chunks are the same as for the initial chunk).
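The chunking rules above can be sketched as a Python generator that reads from a stream of unknown length. The name chunk_records and the tuple it yields are hypothetical. The generator reads one chunk ahead so that it knows when to turn the Chunked Flag off, and only the first chunk carries the ID and Type.

```python
import io


def chunk_records(stream, record_id, type_name, chunk_size=100 * 1024):
    """Yield (chunked_flag, id, type, data) for each data record of one payload."""
    current = stream.read(chunk_size)
    first = True
    while True:
        following = stream.read(chunk_size)   # read ahead by one chunk
        more = bool(following)                # Chunked Flag stays on while data remains
        yield (
            more,
            record_id if first else "",       # later chunks have a zero-length ID
            type_name if first else "",       # and a zero-length Type
            current,
        )
        if not more:
            return
        current, first = following, False


# A 250-byte stand-in payload split into 100-byte chunks.
source = io.BytesIO(b"x" * 250)
records = list(chunk_records(source, "Image1", "image/jpeg", chunk_size=100))
```

The sender never needs the total length: each record only states the size of its own chunk.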

SOAP and DIME

It's one thing to talk about bits and chunks and look at colorful squares, but it is another to see what an actual DIME message might look like. We are going to look at an example where a SOAP request is being sent along with a JPEG image file. In this case, a SOAP RPC call is made to a Web Service in order to convert the attached JPEG image into GIF format. For the sake of brevity, namespace declarations were removed.

________________________________________________
1 0 0 0000000000000
010 0000000101001
00000000000000000000000010110110
https://schemas.xmlsoap.org/soap/envelope/
<envelope>
    <body>
        <ConvertImage>
            <ConversionType>JPEGtoGIF</ConversionType>
            <image href="Image1" />
        </ConvertImage>
    </body>
</envelope>
________________________________________________
0 0 1 0000000000110
001 0000000001010
00000000000000010000000000000000
Image1
image/jpeg
     <64K of binary data>
________________________________________________
0 1 0 0000000000000
000 0000000000000
00000000000000000011000111110000
     <12784 bytes of binary data>
________________________________________________

The preceding excerpt represents a DIME message with three data records separated for clarity by horizontal lines. The fixed portion of each header is represented in binary format, so that we can see the specific values of the flag fields. The ID, Type, and Data are simply displayed as text where appropriate.

If you look at the fixed length headers in the first data record, you will notice that the begin message flag is set in the first bit and there is no ID field (it has zero length). The type field is in URI format and indicates the SOAP message schema identified by the https://schemas.xmlsoap.org/soap/envelope/ URI. The actual body is our SOAP message that requires the additional binary data. The key thing to notice here is that the second parameter of our SOAP RPC call has an href attribute with the relative URI "Image1" as the value. This is the ID specified in the Data Record ID field of the image portion of our DIME message.

The second data record in our example is where the "Image1" ID is specified. The ID length is non-zero, so an application will be able to determine that this data record has an ID, and it can make the comparison with the value indicated by the href attribute in the SOAP message. The type is specified with the media type flag (the leading 001 in the second row of this data record). This means that we are using a content type mechanism for identifying the sort of data we have. In this case the type field contains "image/jpeg" to indicate what kind of data is in our data field.

Notice that in this data record, the Chunked Flag is turned on (third bit in the first row), which means we are starting a chunked data transfer. The application that is sending this message broke the JPEG data into two chunks; the first is 64 KB (65536 bytes), and the second is 12784 bytes, for a total size of 78320 bytes. The size of each chunk is indicated in the third fixed-length header row of each data record.

The third and last data record has the End Message flag set (the 1 in the second bit of the first line) and the Chunked Flag is not set (the 0 immediately following the End Message flag). The Chunked Flag being turned off indicates that this is the last chunk of data in this chunked sequence. Notice that there is no ID or Type field, since both were specified in the first chunk in the sequence. The End Message flag lets us know that this is the end of our DIME message.
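As a rough illustration, the three-record message can be built and read back in Python. This is a sketch under assumptions rather than a reference implementation: build_record and read_records are hypothetical names, the byte order is assumed to be big-endian, the padding is assumed to be zero bytes, and the JPEG data is replaced by a zero-filled stand-in.

```python
import struct

SOAP_TYPE = b"https://schemas.xmlsoap.org/soap/envelope/"


def build_record(data, rec_id=b"", rec_type=b"", tnf=0, mb=False, me=False, cf=False):
    """Serialize one record: fixed header, then ID, Type, and Data, each padded."""
    def pad(field):
        return field + b"\x00" * (-len(field) % 4)
    word1 = (mb << 15) | (me << 14) | (cf << 13) | len(rec_id)
    word2 = (tnf << 13) | len(rec_type)
    header = struct.pack(">HHI", word1, word2, len(data))
    return header + pad(rec_id) + pad(rec_type) + pad(data)


def read_records(message):
    """Return (id, type, payload) tuples, merging chunked records into one payload."""
    results, offset, pending = [], 0, None
    while True:
        word1, word2, data_len = struct.unpack_from(">HHI", message, offset)
        offset += 8
        fields = []
        for length in (word1 & 0x1FFF, word2 & 0x1FFF, data_len):
            fields.append(message[offset:offset + length])
            offset += (length + 3) & ~3       # skip the field plus its padding
        rec_id, rec_type, data = fields
        if pending is None:                   # first record of a payload
            pending = [rec_id.decode("ascii"), rec_type.decode("ascii"), bytearray()]
        pending[2] += data
        if not word1 & 0x2000:                # Chunked Flag off: payload complete
            results.append((pending[0], pending[1], bytes(pending[2])))
            pending = None
        if word1 & 0x4000:                    # Message End flag
            return results


soap = (b'<envelope><body><ConvertImage><ConversionType>JPEGtoGIF</ConversionType>'
        b'<image href="Image1" /></ConvertImage></body></envelope>')
jpeg = bytes(65536 + 12784)                   # zero-filled stand-in for the image

message = (
    build_record(soap, rec_type=SOAP_TYPE, tnf=2, mb=True)
    + build_record(jpeg[:65536], rec_id=b"Image1", rec_type=b"image/jpeg", tnf=1, cf=True)
    + build_record(jpeg[65536:], me=True)
)
records = read_records(message)
```

The reader ends up with two logical payloads, the SOAP envelope and the image, even though three data records were sent. The image payload carries the "Image1" ID that the href attribute refers to.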

DIME Versus MIME

If you have been following the work Microsoft and others have been doing with SOAP, you might recall a specification called SOAP Messages with Attachments. Even if you have never seen the SOAP Messages with Attachments specification, you may be familiar with Multipurpose Internet Mail Extensions (MIME) and in particular MIME multipart. Among other things, one of the biggest uses of MIME multipart is to send e-mail messages with attachments.

Beyond the similar names, you may have been thinking a lot about MIME while reading this article, since DIME functionally does the same thing as MIME multipart—it allows you to package data records in a single message. The SOAP Messages with Attachments specification basically indicates that you can package SOAP requests with related data in a MIME message, and it defines a mechanism for referencing the related data from within the encapsulated SOAP request. This sounds a lot like what DIME does.

So why was DIME created when there was already a solution for the problem that DIME was created to address? What's wrong with the MIME approach to this sort of problem? There are several answers to these questions.

The MIME approach to separating data records is different from specifying the data record length in the headers for each data record. MIME separates its "data records" with a unique string. The string is defined at the beginning of the MIME message, and then an application scans the data in the message until it finds another instance of the string. The application then knows that it has found a data record boundary.

The first problem with the MIME approach is that you have to scan through all the data to find the separator strings. If you want to get at the data in the 5th data record, you need to scan through all the data in the message until you find the 5th occurrence of the separator string. With DIME, the length of each data record is easily calculated, so you don't have to scan through all the data to step to the fifth record. You simply step through the data record headers, achieving significant performance gains.
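Stepping through record headers might look like the following Python sketch. The function name seek_to_record is hypothetical, and the header layout, byte order, and padding follow the assumptions used for the record format described earlier. Only the 8-byte fixed headers are read; the ID, Type, and Data bytes are skipped with a seek.

```python
import io
import struct


def seek_to_record(stream, index):
    """Move a seekable stream to the start of record `index` (0-based)."""
    for _ in range(index):
        header = stream.read(8)
        if len(header) < 8:
            raise EOFError("message ended before the requested record")
        word1, word2, data_len = struct.unpack(">HHI", header)
        if word1 & 0x4000:
            raise IndexError("Message End flag set before the requested record")
        lengths = (word1 & 0x1FFF, word2 & 0x1FFF, data_len)
        # Skip ID, Type, and Data, each rounded up to a 32-bit boundary.
        stream.seek(sum((n + 3) & ~3 for n in lengths), io.SEEK_CUR)
    return stream.tell()


# Five identical stand-in records: no flags, no ID or Type, 5 bytes of data.
record = struct.pack(">HHI", 0, 0, 5) + b"hello\x00\x00\x00"
offset = seek_to_record(io.BytesIO(record * 5), 4)   # the fifth record
```

The cost depends on the number of records skipped, not on the number of data bytes they contain.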

Another problem with the separator string approach to message partitioning: if by chance the string exists within the data of a data record, then you will potentially break the format of the MIME message. Now if you are building a MIME message from scratch, you can take precautions to ensure that you choose a separator string that does not exist in any of the message data. The route and forward vision for SOAP messages, however, is such that intermediaries may process all or part of a SOAP message and forward the results to another party. This means that an intermediary may add or change data in the message, and that new data could contain the separator string. Again the message format would be broken.
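The failure mode is easy to reproduce with a toy example in Python. This is deliberately not a real MIME parser: the separator value and the helper names are hypothetical, and the framing is reduced to the separator string alone.

```python
BOUNDARY = b"--frontier\r\n"   # hypothetical separator string


def join_parts(parts):
    """Frame each part by writing the separator string in front of it."""
    return b"".join(BOUNDARY + part for part in parts)


def split_parts(body):
    """Recover the parts by scanning for the separator string."""
    return body.split(BOUNDARY)[1:]


safe = [b"<envelope/>", b"plain image bytes"]
unsafe = [b"<envelope/>", b"bytes with --frontier\r\n inside"]

safe_round_trip = split_parts(join_parts(safe))
unsafe_round_trip = split_parts(join_parts(unsafe))
```

The first message comes back as the two parts that were sent. The second comes back as three parts, because the separator string occurs inside the data of the second part.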

Memory allocation is also more straightforward with DIME than MIME. When you know the lengths of the data being received you can simply allocate buffers for the given size and stream the data into them. For MIME, which does not have to include a data length, the incoming buffers will be allocated without knowledge of how much data is being received. Applications will either have to guess at the size of incoming data and deal with the inefficiencies of expanding heap allocations, or establish an additional requirement on the sorts of MIME messages the application will receive. There is a standard for indicating data record lengths in MIME over HTTP. However, no standard has been created for other transports, such as TCP. Supplemental standards would have to be created to allow for such capabilities.
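When the length is known up front, the receiving side can allocate once and fill the buffer in place. The following Python sketch shows the idea; read_exact is a hypothetical helper, and the stream is assumed to be a binary file-like object that supports readinto.

```python
import io


def read_exact(stream, length):
    """Allocate a buffer of the announced size and stream the data into it."""
    buffer = bytearray(length)            # single allocation, sized from the header
    view = memoryview(buffer)
    received = 0
    while received < length:
        count = stream.readinto(view[received:])
        if not count:
            raise EOFError("stream ended before the announced length was read")
        received += count
    return buffer


payload = read_exact(io.BytesIO(b"abcdefgh"), 4)
```

Without a declared length, a receiver would instead have to grow its buffer as data arrives, which is the inefficiency described above.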

MIME is exceptionally flexible, which is not always a good thing. You can functionally add any MIME header fields that you want to a MIME message. Header lengths are unknown, and you have to parse through all the header fields in order to determine if any or all of them impact your application's processing of the encapsulated data. From an XML Web Services standpoint this makes a MIME-based solution much more complicated because the focus is on XML rather than MIME. Using both XML and MIME forces the question of what functionality goes where and increases the risk of losing interoperability.

DIME, however, has a small, fixed set of headers that provide the functionality required by a SOAP message. The simplicity makes processing fast and efficient, and ultimately leaves little room for misinterpretation. Simplicity has its benefits.

The benefits of DIME over MIME fall into two main categories:

  • Performance: DIME does not require any sort of encoding of binary data, and the DIME approach to specifying Data Record lengths, as opposed to specifying separator strings, makes parsing faster, and memory allocation more efficient.
  • Simplicity: DIME is designed for simplicity whereas MIME is designed for flexibility. This means that it will be easier for tools for XML Web Services to be developed that support DIME, while decreasing the risk of interoperability issues.

Although DIME has a simpler syntax than MIME, it is more flexible in one regard: MIME limits its content-type specifications to the "text/html" type of syntax. This kind of content-type mechanism requires registering new content types with the Internet Assigned Numbers Authority (IANA). DIME supports both the MIME content-type mechanism, as well as content types specified with the URI mechanism. This allows anyone with a registered domain to create their own content-type definitions, if they need to.

In the end, the MIME approach to sending SOAP messages with attachments is a technically plausible solution. However, supporting multiple specifications for achieving the same results is costly and potentially confusing. Because of DIME's technical superiority, Microsoft will be motivated to focus on DIME for binary attachments in its future SOAP tools and platforms.

Conclusion

DIME is a specification that addresses real-world needs for including binary data with SOAP messages. It is an extremely efficient mechanism for including multiple data objects within a single message. It allows for flexibly specifying data type, provides a means for cross-referencing data objects within a DIME message, and provides all the advantages that data chunking embodies. DIME is a major part of Microsoft's future SOAP and XML Web Service strategy.

In the next At Your Service, Scott will be back with the first of several articles we will be writing on the promising aspects of UDDI, and the use of industry standardized WSDLs to create truly powerful Web applications.

__________________________

H. Nielsen, H. Sanders, E. Christensen, "Direct Internet Message Encapsulation (DIME)" Work In Progress, http://www.ietf.org/internet-drafts/draft-nielsen-dime-01.txt, Microsoft, November 2001. The URL for this reference is subject to change without notice due to modifications made to the draft. The end of the URL will be incremented for each new version so that the "01.txt" portion of the URL will become "02.txt", "03.txt", and so forth. At some point in the future the specification for Direct Internet Message Encapsulation will be removed from the IETF Internet Drafts and may become part of the IETF RFCs.

 

At Your Service

Matt Powell is a member of the MSDN Architectural Samples team, where he helped develop the groundbreaking SOAP Toolkit 1.0. Matt's other achievements include co-authoring Running Microsoft Internet Information Server from Microsoft Press, writing numerous magazine articles, and having a beautiful family to come home to every day.