SAP CDC connector doesn't work properly with special characters in TEXT fields

Dmytro Honcharuk 50 Reputation points
2023-06-21T11:05:24.5133333+00:00

We are trying to ingest data from an SAP system using the SAP CDC connector with SLT. Some of the SAP tables have TEXT columns that may contain special characters, in particular the newline character '\n'. When the data arrives in Azure, the input stream moves everything after this character to a new row, so we cannot replace the character or perform a similar operation even in a Derived Column transformation.

In the staging file that we receive, the data after the newline character has already been moved to a separate row.

When we run a Copy Activity from SAP HANA for the replicated table, we receive the data correctly, as a single string with the special character inside it. This is the behavior we expect from SAP CDC as well.

Is there any way to avoid this situation or to preprocess the data?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. QuantumCache 20,366 Reputation points Moderator
    2023-06-22T04:00:20.8966667+00:00

    Hello @Dmytro Honcharuk, Thanks for reaching out on this forum,

    Updated 7/14/2023: From Original Poster: Resolution: @Dmytro Honchar

    Was able to resolve it by adding 'enableMultiLineRow: true' in the mapping data flow code.
    The solution is to add the entry "enableMultiLineRow: true," to the data flow JSON code, in the source properties inside scriptLines. With this option set, the string is interpreted as expected.
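
The change described above can also be scripted against an exported data flow definition. The following is a minimal Python sketch, not an official tool: the helper name add_multiline_option, the choice of allowSchemaDrift as the anchor option, and the sample scriptLines are all assumptions made for illustration. A real SAP CDC source block will contain more options than shown here.

```python
def add_multiline_option(script_lines, anchor="allowSchemaDrift:"):
    """Return a copy of script_lines with 'enableMultiLineRow: true,' after the anchor line.

    Assumes the anchor option sits inside the source(...) block and is not
    its last option, so the inserted line can safely end with a comma.
    """
    lines = list(script_lines)
    if any("enableMultiLineRow" in line for line in lines):
        return lines  # option already present; nothing to do
    for i, line in enumerate(lines):
        if anchor in line:
            # Reuse the anchor line's indentation so the script stays tidy.
            indent = line[: len(line) - len(line.lstrip())]
            new_line = f"{indent}enableMultiLineRow: true,"
            return lines[: i + 1] + [new_line] + lines[i + 1 :]
    raise ValueError(f"anchor {anchor!r} not found in scriptLines")


# Hypothetical scriptLines, for illustration only.
example = [
    "source(output(",
    "          NOTE_ID as string,",
    "          NOTE_TEXT as string",
    "     ),",
    "     allowSchemaDrift: true,",
    "     validateSchema: false) ~> sapSource",
]
patched = add_multiline_option(example)
```

The helper only edits the list of strings; loading the data flow JSON, locating its scriptLines array, and writing the file back are left to the caller. Editing the same lines by hand in the data flow's JSON code view has the same effect.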

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I have reposted your solution in case you'd like to "Accept" the answer.

    You could also consider some of the suggestions below:

    1. Check the encoding of the special characters in the TEXT fields. Make sure that the encoding is supported by the SAP CDC connector. You may need to convert the encoding of the special characters before ingesting the data.
    2. Try using a different data type for the TEXT fields in SAP. For example, you could use a VARCHAR field instead of a TEXT field. This may help avoid issues with special characters.
    3. Consider using a pre-processing step to clean up the data before ingesting it into Azure. For example, you could use a script to replace the newline character with a different character that is supported by the SAP CDC connector.
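
As a rough illustration of suggestion 3, the sketch below replaces line breaks in a text value with a visible placeholder and restores them later. It assumes there is a step before ingestion where such a script can run, which may not be the case for every SLT setup. The function names and the placeholder are hypothetical.

```python
def escape_newlines(value, placeholder="\\n"):
    """Replace CR/LF line breaks with a placeholder (a literal backslash + 'n' by default)."""
    # Handle Windows-style "\r\n" first so it becomes one placeholder, not two.
    return (
        value.replace("\r\n", placeholder)
        .replace("\n", placeholder)
        .replace("\r", placeholder)
    )


def restore_newlines(value, placeholder="\\n"):
    """Turn the placeholder back into a newline after ingestion."""
    return value.replace(placeholder, "\n")


escaped = escape_newlines("first line\r\nsecond line\nthird line")
restored = restore_newlines(escaped)
```

The round trip is lossy if the original text already contains the placeholder sequence, so pick a placeholder that cannot occur in the source data.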

    You may also let us know if you need further help in this matter and we are more than happy to help further!


1 additional answer

  1. Dmytro Honcharuk 50 Reputation points
    2023-08-01T07:36:07.6166667+00:00

    The solution is to add the entry "enableMultiLineRow: true," to the data flow JSON code, in the source properties inside scriptLines. With this option set, the string is interpreted as expected.

