Dataflow works fine in DEV Azure Data Factory, but fails with errors in PROD

Ulrike Pick 40 Reputation points
2025-12-05T09:35:30.86+00:00

I am working in an environment of two ADFs (PROD and DEV), which are supposed to be identical copies.

I have a bunch of copy activities running fine on both, but three (a,b,c) of them fail in PROD while running fine in DEV.

a.-c.: copy jobs from (SAP) SQL Server --> Azure SQL DB (Copy activity)

a. just over 250,000 rows

  • ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to create interim table, error message: The data type ntext cannot be used as an operand to the UNION, INTERSECT or EXCEPT operators because it is not comparable., target table has certain type which is not supported in Upsert.,Source=Microsoft.DataTransfer.ClientLibrary,'
  • The source and target datasets include one ntext field.
  • Works fine in DEV ADF.

b. 3.77 million rows / c. 3.85 million rows

  • ErrorCode=SqlOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A database operation failed. Please search error to get more details.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,... The statement has been terminated.,Source=.Net SqlClient Data Provider,SqlErrorNumber=-2,Class=11,ErrorCode=-2146232060,State=0,Errors=...

As far as I can tell from a manual comparison, there is no difference in timeouts or other settings.

I have absolutely no idea what's wrong. Any suggestions? Is there a "deep compare" feature between factories? I work in GitHub mode; the JSON comparison showed no differences.

Thanks in advance.

Azure Data Factory

3 answers

  1. Q&A Assist
    2025-12-05T09:35:51.0833333+00:00
    AI-generated content: this answer was created with AI from Q&A Assist.

    It seems you are facing issues with your Azure Data Factory (ADF) copy activities in the PROD environment that do not occur in the DEV environment. Here are some suggestions based on the error messages and the context provided:

    1. Data Type Issues: The error for job a says the ntext data type cannot be used with UNION, INTERSECT, or EXCEPT because it is not comparable; the upsert path builds an interim table and compares it against the target, which is where this surfaces. Consider changing the column in the target database to a supported type such as nvarchar(max) or varchar(max); ntext is deprecated in any case. A sketch of the change follows this list.
    2. Timeout Issues: For jobs b and c, SqlErrorNumber=-2 is the .NET SqlClient timeout code, so the database operation failure is most likely a timeout. Even if the timeout settings match between the factories, it is worth reviewing the load on the source SQL Server and the Azure SQL Database while the copy runs; resource constraints in PROD can cause timeouts that never show up in DEV (a diagnostic query at the end of this answer can help confirm this). You can also try increasing the query timeout and write batch timeout settings of the copy activity.
    3. Data Volume: The volume of data being copied (3.77 million and 3.85 million rows) could also be a contributing factor. Consider breaking down the data into smaller batches if possible, or optimizing the query to reduce the amount of data being transferred at once.
    4. Environment Differences: Since you mentioned that the environments are supposed to be identical, ensure that there are no differences in the underlying infrastructure, such as performance tiers, resource availability, or network latency that could affect the performance of the copy activities in PROD.
    5. Error Logging: Enable detailed logging in your ADF pipeline to capture more information about the errors. This can help in diagnosing the specific cause of the failures.
    6. Deep Comparison: There is no built-in "deep compare" feature in ADF, but you can check the configurations of datasets, linked services, and pipeline settings in both environments manually, or export the ARM template of each factory and diff the two files; that catches nested properties that are easy to miss in the UI.
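    Regarding item 1, here is a minimal sketch of the type change, assuming a hypothetical target table dbo.TargetTable with an ntext column named Notes (adjust both names to your schema). ntext has been deprecated since SQL Server 2005, and nvarchar(max) is comparable, which the upsert's interim-table logic requires:

    ```sql
    -- Hypothetical names (dbo.TargetTable, Notes); run against the Azure SQL target.
    ALTER TABLE dbo.TargetTable
        ALTER COLUMN Notes nvarchar(max) NULL;

    -- After converting from ntext, existing values keep the old LOB storage
    -- format until they are rewritten; this no-op update forces the conversion.
    UPDATE dbo.TargetTable
    SET Notes = Notes;
    ```

    The error message blames the target table, so the source column can stay ntext if changing the SAP-side schema is not an option.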

    If the problem persists, consider reaching out to Azure support for further assistance, as they can provide more targeted help based on your specific setup.
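    To check the timeout theory for b and c, one generic (not ADF-specific) option is to watch the Azure SQL target with the dynamic management views while the PROD copy runs:

    ```sql
    -- Run on the Azure SQL target during the copy. Long wait_time values or a
    -- non-zero blocking_session_id point at resource pressure or blocking
    -- rather than at an ADF configuration difference.
    SELECT r.session_id,
           r.status,
           r.command,
           r.wait_type,
           r.wait_time,
           r.blocking_session_id,
           r.total_elapsed_time,
           t.text AS running_sql
    FROM sys.dm_exec_requests AS r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
    WHERE r.session_id <> @@SPID;
    ```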



  2. Kalyani Kondavaradala 4,595 Reputation points Microsoft External Staff Moderator
    2025-12-05T10:45:59.1366667+00:00

    Hey Ulrike Pick,

    It looks like you're encountering issues with your Azure Data Factory (ADF) copy activities when running in your PROD environment, while they work just fine in DEV. Let's break down the issues you're facing.

    Please correct me if I misunderstood the issue.

    You have three copy activities (a, b, c) that are failing in PROD but succeed in DEV. Here's a look at the specific issues:

    1. Activity a: The error occurs while the copy activity creates the interim table for the upsert: the ntext data type cannot be used with the UNION, INTERSECT, or EXCEPT operators.
    2. Activities b and c: They are failing due to a SqlErrorNumber=-2, which typically indicates a timeout or resource issue. These activities process over 3 million rows, which could contribute to resource constraints.

    Here are some troubleshooting steps you can take:

    1. Check Integration Runtime Configurations:
      • Verify that the Integration Runtime (IR) configurations in both DEV and PROD are identical. A mismatch here can lead to performance issues.
      • Ensure that you're using the correct IR for these copy activities in both environments, as a difference in IR configuration can cause exactly this kind of divergence.
    2. Review Data Types:
      • Since activity a fails due to using ntext, consider changing the column data type in your SQL Server source to something more compatible, such as nvarchar(max), if feasible.
      • For activities b and c, inspect the data types of the source and destination tables to ensure compatibility.
    3. Evaluate Performance Settings:
      • Given the large dataset sizes, check if the cluster size/type selected for the Data Flows in PROD is sufficient. Sometimes increasing the size of the cluster can resolve timeout issues.
      • Review the partitioning of your data to ensure no single partition is too large for the memory allocated.
    4. Retry Mechanism:
      • Since you’re facing transient errors, consider setting retries in your pipeline configuration for these activities. It can help in scenarios where resources are temporarily insufficient.
    5. Compare Settings Deeply:
      • If you're unable to spot differences based on manual checks, compare the exported JSON/ARM definitions of both factories side by side, and check the target table schemas directly; a column-comparison query follows this list.
    6. Run Diagnostics:
      • Utilize ADF’s built-in diagnostics to check for known issues. This can provide insights into problems that you may not have anticipated.
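    As a concrete way to tackle points 2 and 5 together, you can list the column definitions of the target tables in both environments and diff the output; if PROD really is an identical copy of DEV, the result sets should match row for row. TargetTable is a placeholder for the tables behind a, b, and c:

    ```sql
    -- Run against both the DEV and the PROD Azure SQL targets, then compare.
    SELECT COLUMN_NAME,
           DATA_TYPE,
           CHARACTER_MAXIMUM_LENGTH,
           IS_NULLABLE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'dbo'           -- adjust if your tables use another schema
      AND TABLE_NAME = 'TargetTable'     -- placeholder table name
    ORDER BY ORDINAL_POSITION;
    ```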

    I hope these suggestions point you in the right direction! If you have any further details or follow-up questions, feel free to share.


    Thanks!

    Kalyani


  3. Ulrike Pick 40 Reputation points
    2025-12-05T13:19:36.11+00:00

    Hi Kalyani,

    thanks for your answer.

    Troubleshooting steps:

    1. Check Integration Runtime Configurations:
      • Only difference found: "parallel jobs" = 14 in PROD, 4 in DEV
      • Same IR used for the Copy jobs
    2. Review Data Types:
      • identical data types work fine on DEV.
    3. Evaluate Performance Settings:
      • How can I check if the cluster size/type selected for the Data Flows in PROD is sufficient? Where can I increase it?
      • I will try partitioning (a range-query sketch follows this list), but still: the same copy runs fine and fast in DEV ADF without partitions.
    4. Retry Mechanism:
      • I will consider this.
    5. Compare Settings Deeply:
    6. Run Diagnostics:
      • Copy Activity diagnostics doesn't show any problems.
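    For reference, a minimal sketch of the range-based source query I have in mind for point 3, assuming the source table has a monotonically increasing key (Id and dbo.SourceTable are placeholders):

    ```sql
    -- Placeholder names; @from/@to would be parameterized per pipeline run
    -- or per ForEach iteration so each copy reads one window of rows.
    DECLARE @from int = 0, @to int = 1000000;

    SELECT *
    FROM dbo.SourceTable
    WHERE Id >= @from
      AND Id < @to;
    ```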
