Inconsistency in Null Values While copying data from delta lake to AzSQL

Question


Rakesh Reddy 20

We built an ADF pipeline to copy data from Delta Lake to Azure SQL Database. Below is the sink configuration in the ADF copy activity:

Write Behaviour: Insert

Bulk Insert Table lock: No

Autocreate Table: None

Enable staging: Yes

When the data is loaded into the SQL table, empty values in columns are populated as blanks. When the same data is copied to Azure SQL in a different environment, empty values are populated as NULL.

In both environments, the table schema is exactly the same. This causes a problem when columns that are part of the primary key are empty. Please suggest what the cause might be and how to work around it.
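
To compare the two environments, it can help to count how many rows hold NULL and how many hold an empty string in each affected column. The following is a minimal sketch, not a definitive diagnostic: it uses an in-memory SQLite database as a stand-in for the Azure SQL table, and the names `demo_sink`, `region`, and `count_null_vs_blank` are hypothetical. The same `CASE` logic should carry over to a T-SQL query against the real table.

```python
import sqlite3


def count_null_vs_blank(conn, table, column):
    """Return (null_count, blank_count) for one text column."""
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    query = (
        f"SELECT "
        f"COALESCE(SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END), 0), "
        f"COALESCE(SUM(CASE WHEN {column} = '' THEN 1 ELSE 0 END), 0) "
        f"FROM {table}"
    )
    null_count, blank_count = conn.execute(query).fetchone()
    return null_count, blank_count


# In-memory SQLite stands in for the Azure SQL sink table (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_sink (id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO demo_sink VALUES (?, ?)",
    [(1, "EU"), (2, ""), (3, None), (4, "")],
)
print(count_null_vs_blank(conn, "demo_sink", "region"))
```

Running the equivalent query in each environment after the copy shows directly which columns differ, instead of relying on how a client tool renders blanks and NULLs.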

ShaikMaheer-MSFT 38,631 Reputation points Microsoft Employee Moderator

2024-01-31T06:58:28.3033333+00:00

Hi, just checking whether the answer below helps. If so, please consider clicking the Accept Answer button. Accepted answers help the community as well. Thank you.

Answer accepted by question author

Answer recommended by moderator


Answer 1

ShaikMaheer-MSFT 38,631 Microsoft Employee Moderator

Hi Rakesh Reddy, thank you for sharing the resolution details. This helps the whole community. Since you cannot accept your own answer, I am resharing it here. Kindly consider marking it as the accepted answer.

We observed that the issue was caused by the Databricks runtime version. After we used runtime version 12.2 in both environments and retried, NULLs were populated correctly in both environments.
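
Since the fix came down to the runtime version, a quick check on the cluster behind each environment's linked service may save time. This is a rough sketch under assumptions: it relies on the `DATABRICKS_RUNTIME_VERSION` environment variable being set on the cluster (as far as I know it is set on Databricks clusters and holds a value such as `12.2`), and `runtime_matches` is a hypothetical helper name.

```python
import os


def runtime_matches(expected, env=None):
    """Return True if the Databricks runtime version equals `expected`
    or is a more specific version under it (e.g. '12.2.x')."""
    env = os.environ if env is None else env
    # Assumption: this variable is set on Databricks clusters and absent elsewhere.
    version = env.get("DATABRICKS_RUNTIME_VERSION")
    if version is None:
        return False
    return version == expected or version.startswith(expected + ".")


# Run in a notebook cell on the cluster used by each environment.
print(runtime_matches("12.2"))
```

If the two environments print different results, the clusters are on different runtimes and are worth aligning before looking at the ADF settings.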

Please consider hitting Accept Answer button. Accepted answers help community as well.

ShaikMaheer-MSFT 38,631 Reputation points Microsoft Employee Moderator

2024-02-02T05:39:17.4833333+00:00

Hi Rakesh Reddy, just checking whether you have had a chance to accept the answer. Accepted answers help the community as well. Thank you.

Answer 2

Kanupuru Sai Rakesh 5

Hi Pinaki Ghatak/Maheer, thank you for the information. We observed that the issue was caused by the Databricks runtime version. After we used runtime version 12.2 in both environments and retried, NULLs were populated correctly in both environments.

Answer 3

Pinaki Ghatak 5,690 Microsoft Employee Volunteer Moderator

Hello Rakesh Reddy

The issue you’re experiencing might be due to how Azure Data Factory (ADF) handles empty values during the data transfer process. In some cases, ADF might interpret empty values as blanks rather than NULL. This behavior can vary based on the specific configurations of your ADF pipeline and the settings of your Azure SQL database. Here are a couple of suggestions that might help you overcome this issue:

Derived Column Transformation: You can add a Derived Column step in your data flow to replace empty values with NULL. Add a column pattern and use iif($$ == '', toString(null()), $$) to detect an empty value in each matched column and replace it with NULL. Note that iifNull($$, toString(null())) returns its first non-null argument, so it does not turn empty strings into NULL.
Expression in Mapping Data Flow: Another approach is to use an expression in your ADF mapping data flow to convert blank or empty strings to NULL. The expression iif(column1 == '', toString(null()), column1) checks whether a column is empty and, if so, converts it to NULL.
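
The blank-to-NULL rule in the two suggestions above can be prototyped outside ADF before building the data flow. Here is a minimal Python sketch of the same logic, with `None` standing in for SQL NULL; `blank_to_none` and `normalize_row` are hypothetical names, and rows are assumed to be plain dictionaries.

```python
def blank_to_none(value):
    """Mirror iif(col == '', null, col): empty strings become None."""
    if isinstance(value, str) and value == "":
        return None
    return value


def normalize_row(row):
    """Apply blank_to_none to every column of a dict-shaped row."""
    return {name: blank_to_none(value) for name, value in row.items()}


print(normalize_row({"id": 7, "region": "", "city": "Pune", "zip": None}))
```

Like the iif expression, this only treats the exact empty string as blank. Whitespace-only values such as `"  "` pass through unchanged, so trim first if those should also become NULL.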

Remember to test these changes in a controlled environment before applying them to your production pipeline to ensure they work as expected. Let us know if this helps, by tagging this as answered.

ShaikMaheer-MSFT 38,631 Reputation points Microsoft Employee Moderator

2024-01-22T05:57:58.42+00:00

Hi Rakesh Reddy, just checking whether the above answer helps. If so, please consider clicking the Accept Answer button. Accepted answers help the community as well. Please let me know if you have any further queries. Thank you.

Rakesh Reddy 20 Reputation points

2024-01-22T17:17:26.4966667+00:00

Hi Pinaki Ghatak, thank you for the information. The ADF configuration is the same in the two environments. Does ADF behave differently in different environments? Please let me know whether this issue can be handled in the copy activity itself, without data flows.

ShaikMaheer-MSFT 38,631 Reputation points Microsoft Employee Moderator

2024-01-30T10:00:34.06+00:00

Hi Rakesh Reddy, this can only be handled with data flows. Thank you.

Please consider hitting Accept Answer button. Accepted answers help community as well.
