ADF Pipeline intermittently failing - why?

Anon101 51 Reputation points
2024-07-29T08:56:46.4466667+00:00

Hi,
Our organisation is new to Azure Data Factory (ADF) and we're facing an intermittent error with our first Pipeline. Being intermittent adds that little bit more complexity to resolving the error.

The Pipeline has two activities:

  1. Script activity which deletes the contents of the target Azure SQL Server database table that is located within our Azure cloud instance. This is the activity the error message (below) seems to comment on, however this may simply be due it being the first activity in the Pipeline and nothing more meaningful than that.
  2. Copy data activity which simply copies the entire contents from the external (outside of our domain) third-party source SQL View and loads it to our target Azure SQL Server database table.

With the source being external to our domain, we have used a Self-Hosted Integration Runtime.

The Pipeline executes once per 24 hours at 3am each morning. I have been informed that this timing shouldn't affect/or by affected by any other Azure processes we have. For the first nine days of Pipeline executions, the Pipeline successfully completed its executions. Then for the next nine days it only completed successfully four times. Now it seems to fail every other time. It's the same error message that is received on each failure - the received error message is below (I've replaced our sensitive internal names with Xs).

Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception:
''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''

To me, if this Pipeline was incorrectly configured then the Pipeline would never have successfully completed, not once. With it being intermittent, but becoming more frequent, suggests it's being caused by something other than its configuration, but I could be wrong - hence requesting help from you.
In addition to the above description, There is a bit of a trend in the first Pipeline of the day is the one failing. Let's say I take the above Pipeline and schedule it at 4am, 6am, 8am, and 10am. If the Pipelines were to fail that day then it would be the 4am execution that fails, the remaining Pipelines would complete successfully. It's all very odd. Just to add, each Trigger will execute successfully, it's just the Pipeline (that the trigger is kicking off) that doesn't complete successfully.

Is anyone able to shed light on what's happening here?
Thanks.

Azure SQL Database
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,690 questions
{count} votes

1 answer

Sort by: Most helpful
  1. AzureAce 102 Reputation points
    2024-07-29T10:06:33.8133333+00:00

    Hi,
    The fact that the pipeline worked initially and now fails intermittently suggests a configuration issue is less likely and some potential causes for the intermittent failures.

    Error Message Breakdown:
    Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException: This indicates an issue during data transfer using a Microsoft service.

    1. Message=Cannot connect to SQL Database: This is the core problem - the pipeline can't establish a connection to the target Azure SQL Server database.
    2. Server: This specifies the server name of your Azure SQL Database.
    3. Database: This indicates the specific database name within the server.
    4. User: This section is empty, suggesting a potential missing or incorrect username/password configuration.
    5. Check the linked service configuration is correct: This points to an issue with the connection information stored within your Azure Data Factory.
    6. Make sure the SQL Database firewall allows the integration runtime to access.: This suggests the Azure SQL Database firewall might be blocking access attempts.
      1. Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.: This nested error indicates a connection timeout issue, meaning the script couldn't reach the database within a set time frame.
      Causes for Intermittent Failures:
      1.Azure SQL Database Firewall:The firewall might be intermittently blocking access attempts from the Azure Data Factory integration runtime. Double-check your firewall rules to ensure the integration runtime IP address or Azure Data Factory service endpoint has access.
      2.Integration Runtime Issues: The integration runtime could be experiencing temporary resource limitations or errors. Try restarting the integration runtime or using a different integration runtime if available.
      3.Integration Runtime Issues: The integration runtime could be experiencing temporary resource limitations or errors. Try restarting the integration runtime or using a different integration runtime if available. Troubleshooting Steps:
      1.Review Linked Service Configuration:
      Verify the username, password, and server details stored in your Azure Data Factory linked service for the Azure SQL Database connection. Ensure they are correct and up-to-date.
      2.Check Azure SQL Database Firewall Rules: Review your Azure SQL Database firewall rules and confirm they allow access from the Azure Data Factory integration runtime.
      3.Monitor Network Logs and Azure Monitor: Use Azure Monitor or network logs to identify any connectivity issues during failure times. This can help pinpoint network-related problems.
      4.Monitor Azure Service Health: Keep an eye on the Azure Service Health dashboard for any reported issues affecting your Azure SQL Database service region.
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.