April 2012

Volume 27 Number 04

BizTalk - Batching EDI Data in BizTalk Server 2010

By Mark Beckner | April 2012

With the expandedElectronic Data Interchange (EDI) capa­bilities available in Microsoft BizTalk Server 2010, more companies are looking at how to leverage it within their environments. The batching of EDI data is some of the most valuable functionality that the platform can provide, yet it can be confusing and complex to implement. Using the steps outlined in this article, you’ll learn how to quickly and easily extract data from a source database and implement mapping and batching using several scenarios. Included in the discussion is information about configuring the SQL Adapter to retrieve data using FOR XML.

Solution Overview

Working with batched data in BizTalk Server 2010 can be quite complex, but much of this complexity can be removed if you think through your architecture and decide the best place to handle the various aspects of batching. In order to understand the primary components of batching, you’re going to work through the creation and configuration of each. You’ll start with creating a stored procedure that extracts the source data in a format readily consumable by BizTalk—and in a format that allows for various options of how you might want to batch your data. Next, you’ll look at creating a schema and options for mapping the data to the target EDI format. Finally, you’ll set up batching on a BizTalk Party Agreement so the data can be created and written out to a file. In order to work through this solution, you’ll need BizTalk Server 2010, Visual Studio 2010 and SQL Server 2008 R2.

Extracting Data from SQL Server Using FOR XML and XMLNAMESPACES

There’s a powerful option when querying data from SQL Server for use in BizTalk. This option is FOR XML, which allows data to be retrieved in a specific XML format. XML is the foundation upon which all data in BizTalk is based. When extracted as XML, the data from a stored procedure can be immediately ready for consumption by orchestrations and maps without the need to generate complex artifacts that are generally required when communicating through the various SQL adapters. The FOR XML approach has some great benefits:

  • It allows a SQL Server developer to write critical pieces of the integration architecture without having to learn BizTalk.
  • Additional fields can be added to the results and incorporated into BizTalk components with relative ease.
  • It’s an efficient way to process data and it reduces the number of components that must be developed.
  • It can be formatted to match target schemas in order to simplify mapping.

There are a number of options related to using FOR XML in SQL Server. The most appropriate way to use it when dealing with data created for BizTalk is to declare a specific namespace using XMLNAMESPACES and to format the XML specifically as you want it using the PATH mode (as opposed to letting SQL Server automatically name it for you, using the AUTO mode). XMLNAMESPACES is a simple clause that can be added prior to the SELECT statement that creates the XML, and the PATH mode ensures that you can create any type of hierarchy and combination of attributes and elements that you need in order to render the data in the appropriate format.

Figure 1 Stored Procedure Using FOR XML and XMLNAMESPACES

CREATE PROCEDURE [dbo].[GetData] AS
BEGIN
 -- define the namespace and prefix
 WITH XMLNAMESPACES('https://sql.claims.extract' as "ns0")
 -- top level is set to NULL
 SELECT NULL
       ,(SELECT ClaimData.ClaimType As [ns0:ClaimType]
             ,ClaimData.ClaimNo As [ns0:ClaimNo]
             ,ClaimData.DateFrom As [ns0:DateFrom]
             ,(SELECT first As [ns0:FName]
                     ,last As [ns0:LName]
                     ,birth As [ns0:BDate]
              FROM Members
              WHERE Members.ID = ClaimData.MemberID
              FOR XML PATH('ns0:Member'), TYPE)
       FROM ClaimData 
     FOR XML PATH('ns0:Claim'), TYPE)
   FOR XML PATH('ns0:ClaimExtract'), TYPE
END

In addition to formatting the data, you should consider adding business rules in the stored procedure. The more logic you can add at this level, the easier the solution will be to maintain and to develop—no need to recompile code or go through complex testing procedures. Simply update the stored procedure and test that the correct business rules are being applied to the result set. In many cases business rules can be built into the stored procedure, rather than trying to deal with them through the Business Rules Engine (BRE) in BizTalk. Creating a strong BizTalk architecture begins by ensuring that the most appropriate technology is used at each stage.

Configuring a SQL Adapter to Pull Data

Now that you have a stored procedure that returns XML, the next step is to configure BizTalk Server to retrieve the data. This can be done by using the standard SQL Adapter that ships with the product. Developers might find working with the SQL Adapter to be extremely cumbersome. In my experience it usually requires the generation of a number of schemas that are used for mapping the results to various target artifacts, and is generally neither intuitive to work with nor easy to maintain and extend. However, in some cases—like the one I’m going to look at now—it’s amazingly simple to use and provides a great service.

In the case of extracting XML data straight from a stored procedure, there are several excellent benefits that using the SQL Adapter provides: It’s easy to configure, requires no additional BizTalk schemas and can be configured to run on a schedule. Though you’ll find it useful for the extraction of FOR XML data, in general you should use a Microsoft .NET Framework class to interact with SQL Server.

To use the SQL Adapter to call the stored procedure, begin with creating a new BizTalk Receive Location. The Receive Location should be of type “SQL,” and the pipeline can be set as the standard PassThruReceive pipeline. There are several fields that deserve discussion:

  • Document Root Element Name This is a container node for the result set. You may already have a root element defined in the XML result set from the stored procedure, but the adapter will wrap it in an additional node. This is useful if you need to split the result set into individual documents using an envelope schema—but not every solution needs to do this.
  • Document Target Namespace This namespace is used by the schema. You can make it something similar to what you have as the namespace declared in the stored procedure.
  • SQL Command This is the EXEC command that you use to call the stored procedure. In the case of the stored procedure outlined previously, this value would be EXEC GetData. You can easily add parameters. For example, if you wanted to call GetData for a specific Trading Partner and for a specific number of records, you could enter EXEC GetData ‘TradingPartnerName’, ‘100’.

Once the Receive Location and associated Receive Port have been created, you can create a simple Send Port that subscribes to the Receive Port and writes the XML out to a file drop. To do this, just set the BTS.ReceivePortName in the Send Port’s filter to the name of the Receive Port you created for the SQL Adapter. Once everything is enabled and running, you should see an output similar to that shown in Figure 2.

Figure 2 Sample Result of SQL Adapter’s Call to Stored Procedure

<DataResultSet xmlns="https://SQLExtract.DataResultSet">
  <ns0:ClaimExtract xmlns:ns0="https://sql.claims.extract">
    <ns0:Claim xmlns:ns0="https://sql.claims.extract">
      <ns0:ClaimType>Institutional</ns0:ClaimType>
      <ns0:ClaimNo>ABC100</ns0:ClaimNo>
      <ns0:DateFrom>2012-01-01T00:00:00</ns0:DateFrom>
      <ns0:Member xmlns:ns0="https://sql.claims.extract">
        <ns0:FName>John</ns0:FName>
        <ns0:LName>Doe</ns0:LName>
        <ns0:BDate>1975-01-28T00:00:00</ns0:BDate>
      </ns0:Member>
    </ns0:Claim>
    <ns0:Claim xmlns:ns0="https://sql.claims.extract">
      <ns0:ClaimType>Institutional</ns0:ClaimType>
      <ns0:ClaimNo>XYZ200</ns0:ClaimNo>
      <ns0:DateFrom>2012-01-05T00:00:00</ns0:DateFrom>
      <ns0:Member xmlns:ns0="https://sql.claims.extract">
        <ns0:FName>Jane</ns0:FName>
        <ns0:LName>Doe</ns0:LName>
        <ns0:BDate>1976-10-08T00:00:00</ns0:BDate>
      </ns0:Member>
    </ns0:Claim>
  </ns0:ClaimExtract>
</DataResultSet>

Creating a Schema from the Query Results

The creation of the actual BizTalk XSD based on the XML retrieved from the stored procedure in the SQL Adapter can be accomplished through the use of a wizard. To create the schema, take the following steps:

  1. Right-click your BizTalk Visual Studio project that you want to add the schema to and select Add | Generated Items.
  2. In the window that opens, click on Generate Schemas.
  3. In the Generate Schemas dialogue box, set the Document Type property to “Well-Formed XML.” Note that by default, this isn’t initially available. An error will pop up outlining what file needs to be run in order to enable this functionality.
  4. Set the Input file to an instance of the XML shown in Figure 3 and click OK.

The Generated XSD
Figure 3 The Generated XSD

These steps will add two XSDs to your project. You can decide how you want to use them, and you can also combine them into a single XSD if you prefer. An example of the top-level schema is shown in Figure 3.

Mapping and Batching Options

You now have your source data ready to be mapped to your EDI format. Thought needs to be put into how best to format and batch the EDI documents that are going to be sent. Some trading partners might require multiple records within a single ST/SE group, while others might require single records within this grouping. There could be limitations imposed on the total number of records within a single document or a single ST/SE, and there likely will be requirements around when a batch gets created and delivered. The 837 (Health Care Claim Specification) document format provides an excellent example. Trading partners might, for example, require a maximum of 2,500 claims be present within each ST/SE group and a maximum of 20 individual ST/SEs be present within a single document. Assessing the requirements of the various parties with which you’re exchanging information will lead you to determining how best to structure your data’s path through BizTalk.

As for the actual BizTalk map and associated batching, two basic options are available. One is to map your data to the target EDI schema in a single ST/SE group; the second is to map multiple records in your source to a single ST/SE group. Depending on which route you need to take, you might have to set up an envelope schema to split the source data, and you’ll have some differences in your mapping and in your batch configurations.

One Record per Individual ST/SE Begin by looking at the scenario where the source data will be split and batched as single records within individual ST/SE groups. The sample source data shown in Figure 3 has two claims in it. The goal is to take the individual claims (<ns0:Claim>) and split them out so that they can be mapped individually to the target 837 schema. Once mapped, the data can be batched as it arrives in BizTalk. In order to split the data you’ll need to create an envelope schema:

  • Set the Envelope property on the schema to “Yes.”
  • Set the Body XPath on the root segment of your document to the node that contains the claim nodes—in this case, the ClaimExtract node.
  • Deploy the envelope schema. The moment that a file is received on a Receive Location that has the default XMLReceive pipeline implemented, it will be split. There’s no need to reference the envelope schema anywhere.
  • If you set up a Send Port to subscribe to the Receive Port that’s receiving the doc to split, the Send Port will write out individual Claim XML files. You can then use these as the source of your mapping to your target EDI schema. The Send Port should have the map of the source to the EDI transaction on it.

At this point there are many individual EDI documents stacked up in a file directory. You can now set up a second Receive Location/­Send Port combination that does the batching, as follows:

  • In your Receive Location, set the Receive Pipeline to EdiReceive. It will pick up the individual EDI documents that were just written out by the first Receive/Send combo.
  • In your Send Port, set the filters as shown in Figure 4. Note that the EDI.BatchName and the EDI.DestinationPartyName are set to the information in the Party that will be configured in the next steps. This Send Port, with EDISend specified as a pipeline, can be set to write to a File Drop—the output will be the batched data.

Filters on the Batching Send Port
Figure 4 Filters on the Batching Send Port

  • Create a new BizTalk Party and Agreement that represents the data exchange with the Trading Partner that will be receiving the batched documents. The main fields to configure are:
    • On the Identifiers tab, set the Sender and Receiver IDs to appropriate values. The DestinationPartyName should match what you set in the Send Port filter.
    • On the Local Host Settings and Envelopes tabs, set up the properties for the specific EDI document(s) you’re handling.
    • On the Batch Configuration tab, set the following properties:
      • Set the Batch Name equal to what you set it to in the Send Port filter.
      • Set the Batch Filter with two properties:
        • BTS.ReceivePortName == [The name of your receive port where the individual EDI documents are coming in]
        • EDI.ST01 != 997
      • Set the Release option. This will most likely be based either on schedule (deliver as many documents as have queued up over the last 24 hours) or on total number (send this batch when there are a total of 2,500 items queued). If releasing on the “Maximum number of transaction sets in” property, set the dropdown to Group.
      • Once the properties have been set, click Start to activate the batching. It can take several minutes before the batch is completely started. Continue to click the Refresh button until the “Batching is activated” message appears. There should be an instance of the batch orchestration running in BizTalk (this can be seen via the Running Instances of the BizTalk Group Hub report). This orchestration is part of the BizTalk EDI Application that’s installed with the EDI components in BizTalk—this isn’t something you have to write.
    • Click OK to save all of the settings.

Once everything is configured, enlisted, started and enabled, restart the BizTalk host instance. This will ensure everything is fresh in memory. Next, drop the individual ST/SE documents on the Receive Location and the batch will be produced once the Release mechanism triggers (for example, if you specified 2,500 as the number of transaction sets, then you must drop 2,500 individual documents).

Many Records per Individual ST/SE The next scenario to look at is mapping multiple records into individual ST/SE groups. Again, the sample source data shown in Figure 3 has two claims in it. The new goal is to take the individual claims (the <ns0:Claim> is the root node for a single claim) and map them both to a single target 837 schema. Once mapped, the data can either be delivered as is, in a single document with a single ST/SE, or it can be batched like the previous example into a document with multiple ST/SEs, each with multiple claim records.

Start with altering the stored procedure that you wrote (see Figure 1). Right now, it simply brings back all of the records in the database. You need to add a parameter to limit the number of records that come back. For example, if you want to be able to put 5,000 records into a single ST/SE, then you’ll need to grab only 5,000 records at a time from your source database. You can add additional parameters, such as “Trading Partner ID,” to further restrict what’s coming back and make the stored procedure reusable across multiple parties. Once added, you can modify the SQL Adapter settings to pass in the additional parameters. You’ll eventually want to set the SQL Adapter to run on a recurring cycle (every hour, for example), extracting 5,000 records each time.

Next, alter your mapping. Whatever mapping you put into place for the source to the target EDI schema now needs to be modified with a loop. In the case of the 837, for example, you’ll want to set a loop functoid with the source being the <ns0:Claim> (from the schema in Figure 3) and the target being the TS837_2000A_Loop, which constitutes the top level of a target claim.

Now that your mapping is complete, you can decide whether to set up the batching within the BizTalk Party or not. If you simply want to deliver the 5,000 claims in a single ST/SE in a single document, your work is done—just set up a Send Port and ship the data out using the EDISend pipeline. If you do want to set the batching, the configuration will be the same as what you worked through earlier in this article.

Batch Release Options

You have a number of options available to release your batches. As indicated earlier, the two most common are to release batches on a specific schedule and to release when a specific number of transactions has been reached. The scheduler allows for a variety of hourly, daily and weekly configurations. For example, if you need to deliver whatever records have queued up each day at midnight, regardless of the count, you would configure the batch to release on a daily schedule at midnight. The option to release a batch based on a specific number of transactions is easily configured—just specify what the number is. Additional batching options include releasing on a specified number of characters in an interchange and on an external release trigger.

The external release trigger can be triggered using a control message dropped on the message box.

Great Simplification

As you build out your batching process in BizTalk, you should ensure you’re building something that’s as simple as possible. A lot of BizTalk solutions are burdened by excessive orchestrations, schemas, referenced DLLs and other artifacts. Developers often will decide that they need to create a custom batching orchestration in order to handle what they feel is a unique situation. Always focus on the long-term feasibility of your solution. Can a developer six months from now look at what you were doing and understand how to work with it? In general, you can build a process to extract data from BizTalk and deliver it to various Trading Partners with either one or no orchestration, one schema representing the source data, one schema representing the target data and a map per party. If you find yourself creating more components than this, take a step back and reanalyze your solution. You’ll likely discover that you can do some great simplification.


Mark Beckner is the founder of Inotek Consulting Group LLC (inotekgroup.com). He works across the Microsoft stack, including BizTalk, SharePoint, Dynamics CRM and general .NET Framework development.

Thanks to the following technical expert for reviewing this article: Karthik Bharathy