Azure Data Lake Gen2: storing a JSON file results in content encoded as NUL (ASCII 0) characters

lakshmireddy kondapureddy 1 Reputation point
2021-03-15T10:14:09.927+00:00

We are creating a JSON file in Azure Data Lake Gen2 from a Java application. The write succeeds, but the file content is a single line of NUL (ASCII 0) characters. This happens intermittently.

Could this have anything to do with the storage account?

77727-capture.png

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

1 answer

  1. Sumarigo-MSFT 45,406 Reputation points Microsoft Employee
    2021-03-24T17:02:18.503+00:00

    @lakshmireddy kondapureddy
    Just checking in to see whether you have had a chance to review the previous response. Could you share the information requested above so that we can investigate this issue further?
    We had a similar issue when creating .json files with PowerShell and the Out-File command.
    The file appeared to be produced correctly, but when we tried to parse it from a Bash command in a YAML pipeline, it failed with an error saying the content was not valid JSON.
    We eventually found that Out-File had added non-printable characters to the start of the file. In short, the file needed to be written with a specific encoding.

    We had to pass an encoding parameter to Out-File so that the file was written in the encoding we needed (UTF8).
    Perhaps you need to state the encoding explicitly when creating or writing the file?
    This might not explain why the problem is intermittent, unless the default encoding differs between the instances running this code.
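
For the Java side, the suggestion above could be sketched roughly as follows. This is only an illustration under assumptions: the class `JsonEncodingSketch` and its methods are hypothetical names, not part of any Azure SDK, and the example writes to a local temporary file because the thread does not say which client library the application uses. It encodes the JSON with an explicit charset rather than the JVM default, and checks the payload for a leading UTF-8 byte order mark or all-NUL content before writing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of any Azure SDK.
final class JsonEncodingSketch {

    // Encode the JSON text with an explicit charset instead of the JVM default.
    static byte[] toUtf8(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // True if the payload starts with the UTF-8 byte order mark (EF BB BF).
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // True if the payload is non-empty and every byte is NUL (0x00).
    static boolean isAllNul(byte[] data) {
        if (data.length == 0) {
            return false;
        }
        for (byte b : data) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    // Write to a local file (assumption: stands in for the real upload step),
    // refusing payloads that look like the symptoms described in this thread.
    static void writeJson(Path target, String json) throws IOException {
        byte[] payload = toUtf8(json);
        if (isAllNul(payload) || startsWithUtf8Bom(payload)) {
            throw new IllegalStateException("Suspicious payload; refusing to write");
        }
        Files.write(target, payload);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".json");
        writeJson(tmp, "{\"name\":\"caf\u00e9\"}");
        byte[] readBack = Files.readAllBytes(tmp);
        System.out.println("bytes written: " + readBack.length);
        System.out.println("all NUL: " + isAllNul(readBack));
        Files.delete(tmp);
    }
}
```

Running the same `isAllNul` check on the bytes just before the upload call may help narrow down whether the NUL content already exists in the application or only appears after the write to storage.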
