SOA and Data

It has been about a year since I started my current job, which is to evangelize the architectural aspects of data in SOA. I thought this would be an appropriate time to share some of what I have learned.

The first thing I learned is that people doing architecture for SOA systems for the most part don’t really care about data. They spend most of their time thinking about WS-* standards, whether they need an ESB (or more commonly “what the heck IS an ESB?”), what their message schemas should look like, and how to factor their services. When I mention data, the usual reaction is “Data? I have a DBA for that. I have important architect stuff to do so I can’t be concerned with data.”

At first I thought Service Broker would be a natural place to find common ground between data and SOA. After all, you don’t have to go too far down the SOA path before you realize that unless you build reliable, asynchronous, loosely coupled services, your architecture is going to have serious reliability problems. Service Broker brought reliable, transactional messaging to a whole new level of dependability and efficiency. What I found instead was a bunch of architects trying to figure out how to use WS-Transactions to build tightly coupled services to replace their tightly coupled objects.

I next ran into a lot of people architecting SOA systems to provide a common services interface to a lot of diverse back-end systems. I’ve talked to people who had over 100 systems that handle customer data, for example. If you build a perfect customer service to wrap all these systems with a common schema for the customer record, you have a single view of the customer, right? The first time your user tries to change the phone number for Acme Rockets Inc. and gets back 80 records which may or may not be for the same customer, the single view of the customer loses some of its appeal. That’s how I got interested in Master Data Management. I really believe that accurate, unambiguous, clean data is a prerequisite to an SOA project.
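To make the Acme Rockets problem a little more concrete, here is a minimal sketch of the kind of matching a master data effort has to do before a "single view" means anything. Everything in it is an assumption for illustration: the record layout, the sample data, the suffix list, and the helper names `match_key` and `group_candidates` are all hypothetical, and real matching needs far more than name normalization.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form words to ignore when comparing names.
SUFFIXES = {"inc", "incorporated", "corp", "corporation",
            "co", "company", "llc", "ltd"}


def match_key(name):
    """Reduce a company name to a crude comparison key."""
    # Lowercase, replace punctuation with spaces, then drop legal-form words.
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)


def group_candidates(records):
    """Group (system, name, phone) records that share a match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record[1])].append(record)
    return dict(groups)


# Made-up records, as they might come back from different back-end systems.
records = [
    ("billing", "Acme Rockets Inc.", "555-0100"),
    ("crm", "ACME ROCKETS, INCORPORATED", "555-0100"),
    ("shipping", "Acme Rockets", "555-0199"),
    ("crm", "Acme Rocket Supply Co.", "555-0142"),
]

groups = group_candidates(records)
for key, members in groups.items():
    print(key, len(members))
```

Even in this toy case, three records collapse to one candidate customer but disagree on the phone number, and the fourth may or may not be the same company. Deciding which is right is a data problem, not a service interface problem.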

So why the lack of interest? My guess is it’s because data isn’t cool. New development paradigms, new tools, new infrastructures are all cool but data doesn’t change. On the other hand, that’s why data is important. It IS the part of your applications that doesn’t change no matter how many times you start over. A typical enterprise application may have changed hardware platforms, OS’s, communications protocols, development environments, and even database engines but chances are pretty good that the customer master hasn’t been replaced. What would your CIO say if you told him “our new receivables system isn’t compatible with the old one so we’re going to throw the old receivables data away. People know what they owe. They’ll send us a check.” When did you ever see a headline that someone had stolen a laptop that contained the code for a new CRM system? The headlines happen when someone steals data. A bank can burn to the ground and recover nicely but if they lose all their account data they’re out of business. I once worked with a bank that found after a hard-drive crash that they couldn’t read any of their backup tapes. They were thinking about asking their customers to bring in their last statement so they could have some idea how much money people had when we figured out how to read one of their backups.

The point I’m trying to make is that while data isn’t as cool as web services and agile methods, if you don’t think about it early and often, your project is likely to fail.

I’ll climb down off my soapbox now. I would be very interested in any comments either for or against this view. Any war stories about huge successes or huge failures as a result of data issues are very welcome.