Why (good) Xml is much better than plain text

There are many reasons, sure, and there are probably also cases where plain text files are the better choice, but I would like to point out just one reason, because I am fighting with it right now:

Xml is human readable

Or at least, it should be.

I’m dealing with the HL7 standard for healthcare. HL7 files are text files with some strange delimiters such as ^ and |. Luckily, we can use the BizTalk HL7 Accelerator, which lets us abstract away from the HL7 details.

A sample of an HL7 file:

MSH|^~\&|REG|MCM|BTS||199601121005||ADT^A04|000001|P|2.2
EVN|A04|199601121005||01||199601121000
PID|||191919^^^MYHOS^MR~123-45-6789^^^USSSA^SS|253763|SMITH^JOHN^Q||19560129|M|||123MAIN^^BUFFALO^NY^98052^""||(123)555-0100||S|M|10199925^^^MYHOS^AN|123-45-6789
PD1|S|F|NormalString^A^+1^-1^ISO^simpletext&Test&HCD^GI^simpletext&NormalString&ISO^I|NormalString^Test&Test^Test^Test^Test^Test^AE^simpletext^simpletext&Test&ISO^P^NormalString^M10^MC^simpletext&NormalString&HCD^A|N|simpletext|I|I|N|NormalString^+1^M11^simpletext&NormalString&L,M,N^RRI^simpletext&NormalString&HCD|NOVALUE^NormalString^Test^Test^NormalString^Test|N
PV1|1|I|2000^2012^01^hey&test&DNS^test^test^test^test^test||||004777^MILLER^CONNIE^A.|||SUR||||2|A0

Where is the Patient Name? It is “the substring between the fifth and the sixth | (pipe) in the third line (the line starting with PID)”. And remember, the parts of the name (family name, given name, middle initial) are separated by ^ (the strange little hat).
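To make the point concrete, here is a minimal sketch of that lookup done by hand in Python. It assumes the default delimiters (| for fields, ^ for components) and ignores escape sequences and repetitions, so it is an illustration and not a real HL7 parser. The function name patient_name is hypothetical.

```python
# Naive, split-based lookup of the patient name in an HL7 v2 message.
# Assumption: default delimiters "|" (field) and "^" (component).
PID_SEGMENT = (
    "PID|||191919^^^MYHOS^MR~123-45-6789^^^USSSA^SS|253763|SMITH^JOHN^Q||"
    '19560129|M|||123MAIN^^BUFFALO^NY^98052^""||(123)555-0100||S|M|'
    "10199925^^^MYHOS^AN|123-45-6789"
)


def patient_name(message: str) -> dict:
    """Return the name components found in the PID segment (hypothetical helper)."""
    for segment in message.splitlines():
        fields = segment.split("|")
        if fields[0] == "PID":
            # fields[5] is the text between the fifth and the sixth pipe.
            parts = fields[5].split("^")
            # Pad so that missing components come back as empty strings.
            family, given, middle = (parts + ["", "", ""])[:3]
            return {"family": family, "given": given, "middle": middle}
    raise ValueError("no PID segment found")


name = patient_name(PID_SEGMENT)
print(name["family"], name["given"], name["middle"])
```

Nothing in the code says "patient name" except the names I chose myself: the meaning lives entirely in the magic index 5 and the position of each ^.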

The HL7 Accelerator comes with Xsd schemas that map these flat files to Xml. A sample message of type ADT A04 (like the one above) looks something like this (just a small piece):

<ns0:ADT_A04_22_GLO_DEF xmlns:ns0="https://microsoft.com/HealthCare/HL7/2X">
<EVN_EventType>
<EVN.1_EventTypeCode>A04</EVN.1_EventTypeCode>
<EVN.2_DateTimeOfEvent>199601121005</EVN.2_DateTimeOfEvent>
<EVN.3_DateTimePlannedEvent>199601121000</EVN.3_DateTimePlannedEvent>
<EVN.4_EventReasonCode>01</EVN.4_EventReasonCode>
</EVN_EventType>
<PID_PatientIdentification>
<PID.1_SetIdPatientId>191919</PID.1_SetIdPatientId>
<PID.2_PatientIdExternalId>
<PID.5_PatientName>
<PN.0_FamilyName>Doe</PN.0_FamilyName>
<PN.1_GivenName>John</PN.1_GivenName>
</PID.5_PatientName>
[…]

We still deal with HL7 codes and semantic structure, but it’s much easier to work with the Patient Name. It's located in “the FamilyName element under PatientIdentification” :-)
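For comparison, here is a rough sketch of the same lookup against the Xml, using Python's standard xml.etree.ElementTree. The document in SAMPLE_XML is my own trimmed and hand-closed version of the fragment above (the closing tags are an assumption, since the original sample is cut short), so treat it as an illustration and not as real Accelerator output.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-completed version of the fragment shown above (assumption:
# the elided part is dropped and the open elements are simply closed).
SAMPLE_XML = """\
<ns0:ADT_A04_22_GLO_DEF xmlns:ns0="https://microsoft.com/HealthCare/HL7/2X">
  <PID_PatientIdentification>
    <PID.5_PatientName>
      <PN.0_FamilyName>Doe</PN.0_FamilyName>
      <PN.1_GivenName>John</PN.1_GivenName>
    </PID.5_PatientName>
  </PID_PatientIdentification>
</ns0:ADT_A04_22_GLO_DEF>
"""

root = ET.fromstring(SAMPLE_XML)

# The path reads almost like the sentence in the text:
# "the FamilyName element under PatientIdentification".
name = root.find("PID_PatientIdentification/PID.5_PatientName")
family_name = name.findtext("PN.0_FamilyName")
given_name = name.findtext("PN.1_GivenName")

print(family_name, given_name)
```

The element names still carry the HL7 codes (PID.5, PN.0), but no counting of pipes or hats is needed.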