ADF - Activities against storage accounts get stuck in queue when run through Copy activity or Data Flow

Mikkel 0 Reputation points
2024-10-03T12:53:44.9366667+00:00

Hey there,

We are seeing timeouts in Azure Data Factory when exchanging data with our storage accounts, both Blob Storage and Data Lake Gen2. The timeouts occur in

  1. Copy activities (both read and write)
  2. Data flows that connect to the storage accounts.

The timeouts affect both an Azure Blob Storage account and a Data Lake Gen2 account. For Gen2, the error is "user configuration issue", while for Blob Storage we just get a timeout with no further information. The timeouts started around 3:44 AM UTC on Monday morning and have been consistent since then.

The errors...

Source of error: Data flow trying to read JSON files
Error message (bold emphasis by me to remove context): Job failed due to reason: at Source 'source1': Path /<path>/*.json does not resolve to any file(s). Please make sure the file/folder exists and is not hidden. At the same time, please ensure special character is not included in file/folder name, for example, name starting with _

Source of error: Copy activity trying to read Excel or CSV files
Error message: [screenshot of the error, not reproduced here]
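The data flow error above points at two possible causes: the wildcard matches nothing, or the only matches are treated as hidden. A minimal sketch for checking a folder listing offline follows. It assumes you already have a list of blob names from some listing tool, and the function name `diagnose_wildcard` and the sample paths are hypothetical. Treating a leading `_` as hidden comes from the error message; treating a leading `.` the same way is an assumption.

```python
from fnmatch import fnmatchcase


def diagnose_wildcard(blob_names, folder, pattern="*.json"):
    """Split direct children of `folder` that match `pattern` into
    usable matches and matches whose names look hidden."""
    prefix = folder.rstrip("/") + "/"
    matched, hidden = [], []
    for name in blob_names:
        if not name.startswith(prefix):
            continue  # outside the folder being checked
        rest = name[len(prefix):]
        if "/" in rest:
            continue  # nested deeper than the wildcard path looks
        if not fnmatchcase(rest, pattern):
            continue  # does not match the wildcard
        # Assumption: names starting with "_" or "." count as hidden.
        if rest.startswith(("_", ".")):
            hidden.append(name)
        else:
            matched.append(name)
    return matched, hidden
```

If `matched` comes back empty while `hidden` does not, the file names are the likely problem. If both are empty, the path itself is more likely wrong or unreachable.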

We have tried...

  1. To cancel all running pipelines, regardless of IR in use.
  2. To recreate the Linked Service using a different authentication method.
  3. To test the connection for the Linked Service.

All of this to no avail. Nothing has changed, and no useful information found. We checked the Azure Health Services, with no reported issues in our region (Norway East).

We have also tried some other ways of interacting with an Azure Storage account through ADF.

  1. Get Metadata activity. This works just fine.
  2. Delete activity. This works just fine, both with and without logging enabled, including writing the log to a storage account.
  3. Write data to a file using the Web activity. This works just fine.

All of the above using one of the datasets that get the timeout.

We found no pull requests within two days of when the issues started.

Any thoughts?

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Amira Bedhiafi 24,531 Reputation points
    2024-10-03T19:49:29.1533333+00:00

    Ensure that ADF has the correct permissions to access the storage accounts. Check the Access Control (IAM) settings for both the Blob Storage account and the Data Lake Gen2 account.

    Verify that the storage account's firewall settings allow access from ADF's integration runtime (IR). You might need to adjust the networking settings.
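One way to reason about the firewall check is to compare the integration runtime's outbound address against the storage account's allowed IP rules. The sketch below is only an illustration of that comparison: `is_allowed` is a hypothetical helper, and the addresses in the usage note are documentation-range placeholders, not real ADF or IR addresses. It also assumes the firewall uses IP rules rather than virtual network rules or private endpoints.

```python
import ipaddress


def is_allowed(source_ip, allowed_rules):
    """Return True if `source_ip` falls inside any allowed rule.

    Each rule is either a single address or a CIDR range, given as a string.
    """
    ip = ipaddress.ip_address(source_ip)
    # strict=False tolerates rules written with host bits set, e.g. 203.0.113.7/24.
    return any(
        ip in ipaddress.ip_network(rule, strict=False)
        for rule in allowed_rules
    )
```

For example, `is_allowed("203.0.113.7", ["203.0.113.0/24"])` is true, while an address outside every listed range would need a new rule before the IR could reach the account.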

    Check the ADF activity logs for any additional error messages or details related to the timeouts. This can sometimes provide more context about what's going wrong.

    Use the "Monitor" feature in ADF to see if there are any insights into the pipeline runs or activity executions.
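If you export the activity runs from the Monitor view, a small script can show which activity types fail and when each one first failed, which helps pin down the start of the incident. This is a sketch under assumptions: the record keys `type`, `status` and `start`, and the status value `"Failed"`, are hypothetical and should be mapped to whatever your export actually contains.

```python
from datetime import datetime


def first_failures(runs):
    """Map each activity type to the start time of its earliest failed run."""
    earliest = {}
    for run in runs:
        if run["status"] != "Failed":
            continue  # only failed runs matter here
        # Assumption: timestamps are ISO 8601 with an explicit UTC offset.
        started = datetime.fromisoformat(run["start"])
        kind = run["type"]
        if kind not in earliest or started < earliest[kind]:
            earliest[kind] = started
    return earliest
```

Activity types that never appear in the result (for example Get Metadata or Delete in the question) are the ones still working, which is useful detail to include in a support ticket.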

    If you're using a self-hosted integration runtime, try switching to the Azure integration runtime or vice versa. This can sometimes alleviate connectivity issues.

    If the issue persists after trying the above steps, it may be beneficial to raise a support ticket with Azure Support.

    Provide them with the error messages, logs, and a detailed description of the issue.
