Azure Data Factory Data Flow activities stuck in queued status

Colton 1 Reputation point
2021-10-07T19:48:35.32+00:00

All the activities in my ADF pipelines work correctly except for the Data Flow activities, all of which become stuck in the Queued status and never leave it (one sat in that state for 18 hours before I canceled it).

Each of the Data Flows reads from an Azure Application Insights Blob Storage account, and the sinks are all Azure SQL Server or Azure Synapse instances.

When I test the connections (both sink and source), they all connect successfully, and in Debug mode the Data Preview tab shows data at each step of each Data Flow. The problem occurs on both the Integration Runtime and a Debug cluster.


2 answers

Sort by: Most helpful
  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2021-10-15T05:35:07.55+00:00

    Hi All,

    From the support ticket analysis, we noticed that the issue was related to large-file processing. When a smaller number of files is processed at a time, the data flows through successfully, so the resolution here is to grab the data in smaller chunks and loop until all of it has been processed.
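
    The chunking pattern the answer describes can be sketched generically in Python (outside ADF, with a hypothetical `process_chunk` callback standing in for one Data Flow run):

    ```python
    # Minimal sketch of "grab data in smaller chunks and loop until done".
    # `process_chunk` is a hypothetical stand-in for whatever processes one
    # batch of files (in ADF this would be a single Data Flow execution).
    def process_in_chunks(files, chunk_size, process_chunk):
        """Process `files` in fixed-size chunks, looping until all are done."""
        processed = []
        for start in range(0, len(files), chunk_size):
            chunk = files[start:start + chunk_size]
            processed.extend(process_chunk(chunk))
        return processed

    chunk_sizes = []

    def record(chunk):
        # Record how many files each batch handled, then "process" them.
        chunk_sizes.append(len(chunk))
        return chunk

    # Example: 7 files processed 3 at a time runs three smaller batches.
    result = process_in_chunks(list(range(7)), 3, record)
    ```

    In ADF itself the equivalent is driving the Data Flow from a ForEach or Until activity over batches of files rather than passing the whole file set in one run.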

    Sharing this info as it can be beneficial to others who come across this thread.

    Thank you


  2. R. Kendall Glover 1 Reputation point
    2022-11-02T19:34:49.857+00:00

    One way to process data in smaller chunks:

    Assume you have one large data file that you read with a Data Factory source.

    You could turn Sampling on in that source and set the row limit from a passed pipeline parameter via Add dynamic content.

    You'd have to wrap your Data Flow in some sort of iteration activity (e.g., Until) in the calling pipeline. I'd probably check the output of your source activity and, if the row count is LESS than the value of the parameter you passed in, break out of the iteration loop.

    One caveat: this may NOT do well with "no records returned", so you may have to do some edge-case testing. (E.g., you process 1,000 rows at a time and there are exactly 1,000 rows in the last batch, so the NEXT batch processes zero rows. Does the source activity return an output value of zero, or does it not return any results at all?)
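
    The termination logic and its edge case can be sketched in plain Python (with a hypothetical `fetch_batch` standing in for one sampled Data Flow run):

    ```python
    # Sketch of the Until-loop condition described above: fetch batches of at
    # most `row_limit` rows and stop once a batch returns fewer than the limit.
    # `fetch_batch(offset, limit)` is a hypothetical stand-in for one run of
    # the sampled source.
    def run_until_done(fetch_batch, row_limit):
        total = 0
        batch_count = 0
        while True:
            rows = fetch_batch(total, row_limit)
            total += len(rows)
            batch_count += 1
            if len(rows) < row_limit:  # fewer rows than the limit -> last batch
                break
        return total, batch_count

    # The caveat's edge case: exactly 2,000 rows with a limit of 1,000 means
    # the first two batches each return a full 1,000 rows, so the loop runs a
    # THIRD time and sees an empty batch before it finally stops.
    data = list(range(2000))
    fetch = lambda offset, limit: data[offset:offset + limit]
    totals = run_until_done(fetch, 1000)
    ```

    This is why the "no records returned" behavior of the source activity matters: the loop's exit test has to behave sensibly when a batch comes back empty.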
