Designing HealthVault’s Data Model

It’s been about a month since we released HealthVault, and we have heard a lot of great feedback from the industry. Pretty much everyone in Health has an opinion on HealthVault :-)

I also saw some interesting debate regarding how HealthVault addresses the big elephant in the room: compliance with existing standards.

I have been involved in the healthcare standards community for a long time and have contributed firsthand to several of the standards adopted today, including the ASTM CCR, HL7 and the IHE Interoperability Profiles. We have demonstrated strong support for healthcare standards in the past, and we are committed to interoperability based on standard protocols and data formats.

I talked to Sean Nolan, our Chief Architect for HealthVault, about the philosophy behind the data model design and I asked him to give a brief explanation. Here is what he had to say:

Wherever possible, we are using existing standards, both for interchange in and out of HealthVault and within it. Many of our data types draw directly from standards such as the ASTM CCR, the IndivoHealth project and, soon, the HL7 CDA/CCD.

Some of the data types we needed in order to support our partners’ applications were not readily available in the standards community. In those cases, for example the “aerobic exercise” data type, we have worked in concert with clinical and application partners to make sure that we capture the right information. We are not the domain experts – our partners are, and we are leveraging their expertise to help us build up the dictionary of HealthVault types.

Our decisions around data type definitions are driven by four key principles:

1. Interoperable. When designing our information model, we do our best to make our data types interoperable with the industry standards in actual use. Each individual data type generally represents a superset of the corresponding industry-standard data type. This makes it easier for our partners to take the information stored in HealthVault and populate a standard ASTM CCR or HL7 CDA XML document, or to take an existing ASTM CCR or HL7 CDA XML document and populate the atomic HealthVault data.

2. Inclusive. When designing the HealthVault data model we had to strike a balance between fully structured data and unstructured information. This is in line with a number of the industry standards that allow, for example, dates to be represented as the string “when I was a kid”. Our types are designed to be as inclusive as possible - with the ability to capture structure when it is available, but still take in the data when structure is missing. Our use of “coded values” is one example - we allow simple text strings for things like lab results, but if LOINC codes or similar are available we handle them as well. This does make life somewhat harder for application developers - but ultimately we believe it is well worth the added complexity. This philosophy is also expressed in other parts of the system - for example, we support the ability for people to fax information into their records. An image of your immunization history isn’t as good as structured data - but it sure is better than not having it at all.

3. Just in Time. If the data types you see in HealthVault today seem a bit arbitrary, it is due to how we approached their design with our early partners. Our data model is growing as we work with partners fluent in various domains. For example, we got help on our fitness types from Polar, and learned a great deal from J&J Lifescan when it came to blood glucose measurements.

4. Independent. As much as possible, we have tried to keep application development simple by eliminating relationships across data items. For example, for medication we store the information on the prescribing physician in the data item rather than as a pointer to another data item describing the physician. Managing data integrity across partners would be a huge problem if we had a real normalized schema behind the HealthVault system. Our goal has been to allow expression of those connections but never rely on their existence for data integrity.
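To make the "Inclusive" and "Independent" principles a bit more concrete, here is a minimal Python sketch. It is not the HealthVault SDK: the class names (Code, CodableValue, Person, Medication) and their fields are hypothetical and chosen only for illustration. It shows a value that always carries free text and optionally carries codes, and a medication item that embeds its prescriber instead of pointing at another item.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Code:
    """One coded entry, e.g. from LOINC (hypothetical shape)."""
    value: str
    vocabulary: str


@dataclass
class CodableValue:
    """Free text is always present; codes are attached only when known."""
    text: str
    codes: List[Code] = field(default_factory=list)

    def is_structured(self) -> bool:
        # True only when at least one code accompanies the text.
        return bool(self.codes)


@dataclass
class Person:
    name: str
    organization: Optional[str] = None


@dataclass
class Medication:
    """Self-contained item: the prescriber is embedded, not referenced by id."""
    name: CodableValue
    prescriber: Optional[Person] = None


# "Inclusive": a bare string and a coded value are both accepted.
free_text = CodableValue(text="cholesterol, from my last checkup")
coded = CodableValue(
    text="Cholesterol",
    codes=[Code(value="2093-3", vocabulary="LOINC")],  # illustrative code
)

# "Independent": each item carries its own copy of the prescriber details,
# so it stays readable even if no separate physician item exists.
med = Medication(
    name=CodableValue(text="Aspirin 81 mg"),
    prescriber=Person(name="Dr. Example", organization="Example Clinic"),
)
```

The cost of this shape is the one mentioned above: an application has to handle both the coded and the uncoded case, and prescriber details may be repeated across items.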

Our types also allow each vendor to add “extensions” of their own making to item data – so to the extent that we are missing certain fields, they can be added – and the industry can rally around those extensions if it makes sense. We’re also working on a process for partners to submit these extensions for inclusion in the HealthVault base types.
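As a rough illustration of the extension idea, the sketch below models an item that has base fields plus a list of vendor-supplied extensions. Everything here is an assumption made for the example: the Item and Extension classes, the "source" identifier and the field names are hypothetical and do not reflect the actual HealthVault schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Extension:
    source: str            # hypothetical identifier for the vendor
    data: Dict[str, Any]   # vendor-defined fields missing from the base type


@dataclass
class Item:
    type_name: str
    base: Dict[str, Any]
    extensions: List[Extension] = field(default_factory=list)

    def extension_for(self, source: str) -> Dict[str, Any]:
        """Return the data a given vendor attached, or {} if there is none."""
        for ext in self.extensions:
            if ext.source == source:
                return ext.data
        return {}


# A base item with one vendor extension attached.
workout = Item(
    type_name="aerobic-exercise",
    base={"activity": "running", "minutes": 30},
    extensions=[Extension(source="example-vendor", data={"shoe_model": "X1"})],
)
```

An application that does not know a given extension can simply ignore it and still read the base fields, which is what lets vendors add fields without coordinating with each other first.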

We appreciate the richness of the clinical formats developed by the standards community. This is the reason that when we take data from external sources, we keep the original available as a single package that can be shared and managed just as any other type can be. In addition to this, we support through our API the ability to extract the component pieces of those items and - when appropriate - store them into more discrete types.
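The "keep the original, then extract the pieces" approach could be sketched as follows. This is an assumption-laden illustration, not the HealthVault API: the XML is a deliberately simplified, made-up document (not the real CCR or CCD schema), and StoredItem and import_document are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict, List, Optional

# Simplified stand-in for an incoming clinical document (not real CCR/CCD).
SAMPLE = """<record>
  <medication><name>Aspirin</name><dose>81 mg</dose></medication>
  <medication><name>Metformin</name></medication>
</record>"""


@dataclass
class StoredItem:
    type_name: str
    data: Dict[str, Optional[str]]


def import_document(xml_text: str) -> List[StoredItem]:
    """Store the original document whole, then add one item per medication."""
    # The untouched original is kept as an item in its own right.
    items = [StoredItem("original-document", {"xml": xml_text})]

    root = ET.fromstring(xml_text)
    for med in root.findall("medication"):
        items.append(
            StoredItem(
                "medication",
                {
                    "name": med.findtext("name", default="").strip(),
                    # Missing structure is tolerated: dose may be None.
                    "dose": med.findtext("dose"),
                },
            )
        )
    return items


items = import_document(SAMPLE)
```

Because the original package is stored alongside the extracted items, nothing from the source document is lost even when only part of it maps onto discrete types.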

I strongly suggest you download our SDK from the HealthVault MSDN Developer Center and have a look at the data types. Also, if you spend enough time looking at the SDK, you’ll see “traces” :-) of how we support ASTM CCR and HL7 CDA CCD.

I hope this helps explain our thinking and some of the principles behind the design of HealthVault. Please keep providing us with your valuable feedback!