February 2013

Volume 28 Number 02

Patterns in Practice - Data Design for Adding Functionality to a Class

By Peter Vogel | February 2013

In my last column, I described a common business problem: A SalesOrder with multiple OrderLines, with each OrderLine specifying a Product being bought by a Customer and the SalesOptions the customer has chosen to apply to that OrderLine/Product combination (e.g. giftwrapping, expediting). Those SalesOptions affect how that OrderLine/Product combination will be processed, including calculating the price for that purchase. In that column, I looked at a couple of patterns that might support a solution that was maintainable, extendable, testable and understandable. In the end, I decided that calculating the price should be handled by the decorator pattern because it would allow the SalesOptions to interact as required by the organization’s business rules (e.g. expediting increases the price of the Product by a set percentage after all other discounts are applied and ignoring any costs associated with giftwrapping). To deal with the rest of the processing required by the SalesOptions, I decided my best choice was to implement some version of the Roles pattern.

This column will look at the data design required to support implementing the solution.

Basic Tables

I’m old fashioned enough to begin to develop any solution by designing the database tables that I’ll need in my relational database (of course, if this organization had enough sales orders they might need a big data solution—but that isn’t the case here). Obviously, I’ll need a table listing valid SalesOptions. The primary key for this SalesOptions table is the SalesOptionId and the table has at least one other column: the SalesOption description, which is displayed in the user interface when the user is selecting or reviewing the SalesOptions for an OrderLine. Any data about the SalesOption that doesn’t vary from one Product to another will also go in this table.

Because not all SalesOptions can be applied to all Products, I also need a table of valid SalesOptions and Product combinations (the company sells both “things” and services, for instance, and you can’t gift wrap a service). This ValidSalesOptionsForProducts table would have a primary key made up of two columns: the ProductId and SalesOptionId. The table might have some additional columns to hold data related to the relationship between a particular Product and SalesOption.

If there’s no data associated with a Product/SalesOption combination, however, there’s another data design possible. If most SalesOptions apply to most Products, it would be more efficient to create an InvalidSalesOptionsForProduct table that lists just the Products and SalesOptions that can’t be combined. If the organization has thousands of products, a table of exceptions would be more efficient than a table of allowed combinations.

And before I get tons of comments about using natural keys/real data as the primary keys of my table: You’re perfectly welcome to give the ValidSalesOptionsForProduct table a meaningless primary key (a GUID or Identity key of some kind) and apply a unique index to the combination of SalesOptionId and ProductId. SQL Server’s hashing mechanisms for building an index will almost certainly give your better performance whenever you need to use that meaningless key as the foreign key of some other table. But, for this discussion, I don’t need that meaningless primary key so I will ignore that option. That also applies to the next table that I’ll discuss.

For any particular OrderLine, I’ll also need a table to record the SalesOptions that have been applied to it. That SalesOptionForOrderLine table will have a primary key, or unique index, consisting of the OrderId, the OrderLineId, and the SalesOptionId.

Supporting SalesOptions

Finally, each SalesOption will have data associated with its application to a particular Product/OrderLine. For instance, if the user selects the expediting SalesOption, the user needs to select the level of expediting (NextDay or Urgent); for the giftwrapping SalesOption the user will need to specify the type of giftwrapping.

There are at least two different data designs that would support recording this information. One design is to simply add the required columns to the SalesOptionForOrderLine table: a GiftwrapId column for the giftwrapping option and the ExpediteLevel for the expediting option. For any particular SalesOption most of those columns would be Null; in other words, for the giftwrapping SalesOption the GiftwrapId column will have a valid value but the ExpediteLevel column will hold a Null value.

In the other design, each SalesOption would have its own SalesOptionData table (including ExpediteSalesOptionData and GiftwrapSalesOptionData). Each of these tables would have as its primary key the SalesOrderId and OrderLineId and the columns required by the SalesOption (e.g. the ExpediteSalesOptionData table would have an ExpediteLevel column, the GiftwrapSalesOptionData table would have GiftwrapId column).

There’s no universally right answer here—the right answer will depend on the way the business works. For instance, having a separate SalesOptionsData table for each SalesOption requires me to use one of two data access plans when processing the OrderLines in a SalesOrder.

  • For each OrderLine, make a separate trip to the database to retrieve the row from the relevant SalesOptionData table (more trips = worse performance).
  • Fetch all of the rows from all of the relevant SalesOptionsData table by joining the OrderLine to all of the SalesOptionsData table with outer joins (more outer joins = worse performance). This solution also requires me to rewrite this query every time a new SalesOption is added to the application.

If the number of OrderLines being processed at any one time is small (such as one SalesOrder’s worth of OrderLines which is, typically, less than six OrderLines), I could live with the performance hit that comes with either data access plan. I could also live with either data access plan if any particular part of the organization only needs to process a small number of specific SalesOptions (if the shipping department only needs to retrieve the information for the Expedite SalesOption). However, if the number of OrderLines being processed at any one time is large ( if I process many SalesOrders at a time) or if there’s a part of the organization that needs to handle all of the SalesOptions applied to an OrderLine then the performance impact could be crippling.

Looking at the business, I can see that any part of the organization will typically only be interested in a few specific SalesOptions applied to a Product/OrderLine combination. The one exception is the order taking application—however, it only works with one SalesOrder’s worth of OrderLines at a time. However, there are several places in the organization where many Orders are processed at once. The shipping department, for instance, prepares a day’s worth of shipping at a time to support combining shipments and reducing costs. That second fact drives me to adding nullable columns to the SalesOptionForOrderLine table.

I admit to having another reason for adding nullable columns to the SalesOptionForOrderLine table. I also know that the amount of data associated with a SalesOption is typically small. If I used individual tables I’d end up with tables that have only one or two columns (other than their primary key columns). I have a visceral objection to that though I’m not sure that I could justify it.

Putting all of this together, it means that, on the data side, adding a new SalesOption consists of adding:

  • A row to the SalesOptions table
  • Multiple rows to the ValidSalesOptionsForProduct table
  • Additional columns to the SalesOptionForOrderLine table

Next Steps

And, of course, adding a new SalesOption requires creating the appropriate role object to hold the code for processing the SalesOption. So that’s next month’s column—the object model.

One of the things that you may have noticed is that I’ve frequently referred to the way that the organization works both in the design phase and in the implementation phase (for me, the implementation phase includes deciding on the details of the design). That’s another one of the assumptions of this column: conditions alter cases. Your deep understanding of how your organization works is critical not only in selecting the right pattern for your application but also in deciding what makes sense when it comes to implementing that pattern. That’s one of the reasons that I appreciate patterns so much: they’re a support rather than a straightjacket and allow me to tailor the solution to my clients’ needs.

One of the comments made by a reader on the original column suggested that the OrderLine should be responsible for managing the SalesOptions rather than the Products. That’s a good question. Certainly, the data design that ties OrderLines to SalesOptions suggests that it’s a reusable choice. But it really leads to more interesting question for this column: What would be the basis for deciding where to put control of the SalesOption role objects? The question is made a little harder to decide because, in this organization, an OrderLine always has a Product assigned to it; disentangling the two business entities is hard to do.

The reason that I felt that the Product should manage the SalesOptions was because of the necessity of validating SalesOptions against the ValidSalesOptionsForProduct table—I assumed would be handled by code in the Product class and that the rest of the SalesOptions code would go in the Product also. However, I’m not sure that’s a compelling argument; The code in an OrderLine class could validate the Product associated with the OrderLine as easily as the Product class could because an OrderLine always knows what Product is associated with it.

One way to make the decision would be to look at the way that the business is run. If, after assigning a SalesOption to a Product/Orderline combination, is it possible to move that Product to another OrderLine in the SalesOrder? Or to change the Product assigned to the OrderLine? If either of those changes is possible, what happens to the SalesOptions? Do the SalesOptions follow the Product to another OrderLine or stay with the OrderLine? If you replace the Product on an OrderLine with a new Product, would it always retain the SalesOptions assigned to the original Product? The SalesOptions stay with the OrderLine, it suggests that the OrderLine is responsible for the SalesOptions.

Another way to answer the question is to look at how the classes will be used elsewhere in the business’ processes. If I know that, somewhere in the organization, Products and SalesOptions needed to be processed even when the Products aren’t associated with an OrderLine then I will have a compelling reason for keeping the responsibility of processing the SalesOption with the Product.

I do have one scenario where Products and SalesOptions are processed independently of an OrderLine: Some SalesOptions have different processing, depending on which Product the SalesOption is assigned to and regardless of the state of the OrderLine involved. For instance, expediting a “thing” is different from expediting a “service”; expediting a thing means shipping it earlier, while expediting a service means the team delivering the service goes to the customer’s site earlier. As a result, when an application asks for the Expediting role object for a Product, different Products will provide a different Role object.

There are other solutions to this problem, of course. For instance, should delivering a thing and a service earlier both be called “expediting”? Having different Strategy objects (one for things and one for services) might also resolve the problem. However, rather than make the OrderLine responsible for determining which Expediting role object to use with a Product or create a complex Expediting role object that can handle all Products, I’ll turn the responsibility for returning the right role object over to the Product.

Peter Vogel is the principal system architect in PH&V Information Services, specializing in SharePoint and service-oriented architecture (SOA) development, with expertise in user interface design. In addition, Peter is the author of four books on programming and wrote Learning Tree International’s courses on SOA design ASP.NET development taught in North America, Europe, Africa and Asia.