ADF unable to convert Varbinary(MAX) sql server to VARBINARY Snowflake

Suraj 1 Reputation point
2021-08-25T18:58:21.38+00:00

Hello,
I am trying to load data from SQL Server table that has a column with varbinary(MAX) COL1 (sample data 0x2E6A70672C2E7064662C2E7478742C2E646F632C2E646F63782C2E7274662C2E706E670D0A0D0A0D0A46696C65466F726D61746573) to Snowflake using ADF pipelines. I have created a simple pipeline to using Copy activity to read from the source table and stage the data to the azure blob(using enable staging option) and then load to Sink(snowflake). The process errors out while loading the data with below error.

ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ERROR [22000] The following string is not a legal hex-encoded value: '0x255044462D312E360D25E2E3CFD30D0A33372030206F626A203C3C2F4C696E656172697A656420312F4C2032303539372F4' File 'a264d570-05de-4ef0-9028-9034ed0c360c/SnowflakeImportCopyCommand/data_a264d570-05de-4ef0-9028-9034ed0c360c_164ddc6c-8166-4d3e-aa2e-72c57fd6565b.txt', line 2, character 27 Row 1, column

I can cast data while reading to TEXT however since adf generated COPY statement in the backend, I am not able to convert TEXT back to BASE64 to load to Snowflake.

Any help is appreciated! Also if there is a similar question already answered then please point me to it because I was not able to find the link.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,703 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Aaron Morelli 1 Reputation point
    2021-08-30T22:18:14.453+00:00

    I hit this same problem, and after some research here are the 2 options I found:

    Option 1 If you can modify the source query against SQL Server, then you can change the format of the hex value. For example, on my sample binary data:
    SELECT
    BinaryCol1, --A hex value like "0x05CCBD7..."; This will not load into Snowflake
    DefaultStyle = convert(varchar(max), BinaryCol1, 1), -- A hex value like "0x05CCBD7..."; This will not load into Snowflake
    DesiredStyle = convert(varchar(max), BinaryCol1, 2) -- A hex value like "05CCBD7..."; This WILL load into Snowflake
    FROM dbo.tmp_binarytest;

    NOTE: this assumes that your load is using the default value of "HEX" for the load option BINARY_FORMAT. (See this for more: https://docs.snowflake.com/en/user-guide/binary-input-output.html#file-format-option-for-loading-unloading-binary-values). If you have that option explicitly set to one of the other values, UTF8 or BASE64, then the above approach won't work. (In the ADF Copy Activity, the BINARY_FORMAT option would be set on the "Sink" tab under the "Additional Snowflake format options" section, entered as a Key/Value pair).

    Option 2 Alternatively, if you can't modify the SELECT against the source system, you can load into a VARCHAR(16777216) column on the Snowflake side, and then manipulate the string to get rid of the leading "0x" and then convert to binary, like this:

    TRY_TO_BINARY(REPLACE(binarycol1,'0x',''), 'hex')

    or something similar, and save that resulting binary value somewhere else.

    0 comments No comments