Converting XML to JSON with Azure Logic Apps
With Logic Apps you can quickly build small workflows to perform common tasks. An interesting feature of Logic Apps is its ability to run serverless, so you only pay for the actual compute power needed to execute your workflow. This feature is perfect for what I set out to do: convert XML files to JSON in bulk.
My Test Setup
For testing purposes I used a OneDrive folder as source, and another OneDrive folder as destination. The LogicApps trigger for OneDrive executes the workflow anytime a new file is added to the folder. Technically it works on a interval that you configure, for example every minute, at which point it scans for all the new files since the last time the trigger executed. The view below shows the general idea.
Converting to JSON
As you can see in the image above, the content of the new file is created using the json function available in Logic Apps. Based on the documentation in Workflow Definition Language schema for Azure Logics Apps, this function accepts string or XML input. In my first attempt I used the following function: json(triggerBody()) , where triggerBody() is shorthand for trigger().outputs.body, which represents the file contents. To my surprise this yielded the following error:
InvalidTemplate. Unable to process template language expressions in action 'Create_file' inputs at line '1'
and column '2295': 'The template language function 'json' parameter is not valid. The provided value
'<?xml version="1.0"?> <body> <item attr1="1" attr2="2" /> <item attr1='1' attr3='3' /> </body>' cannot be
parsed: 'Unexpected character encountered while parsing value: <. Path '', line 0, position 0.'. Please see
https://aka.ms/logicexpressions#json for usage details.'.
Apparently the json function is stumbling over the '<' character (the red character in the error message), even though the documentation says it supports XML input. My first assumption was that this had something to do with an invisible character, so I tried different editing tools to create the XML, but that didn't help. Jeff Hollan on the Logic Apps team helped me understand the actual problem is the way Logic Apps interprets the file content of a file on OneDrive. Hence, Logic Apps doesn't understand you are feeding XML to the json function unless you explicitly tell it what it's dealing with. To do so, you need to use the xml function to cast the input to XML. The correct expression therefore is
json(xml(triggerBody()))
Using a different source/destination
My test setup was with OneDrive, but the expression will work with any input/trigger that doesn't stumble over XML content. You could just as easily use blob, queue, or table storage, or another method such as email, FTP, or Azure Service Bus.