ADF managed Airflow tasks fail without running or logs

Peter Yacoub
2024-07-21T12:41:08.4033333+00:00

Hi,

I have a managed Airflow instance inside Azure Data Factory. On a semi-weekly basis I see scheduled tasks suddenly fail with no logs. Whenever I retry any task (regardless of how heavy or light it is), it instantly fails with no logs and not even a record of a run.

Hitting the "/health" endpoint shows that the metadata database, scheduler, and triggerer are all healthy and sending heartbeats.
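
For reference, this is roughly how the check can be scripted. It is a minimal sketch, assuming the webserver URL is reachable from the script without extra authentication (which may not hold for an ADF managed instance) and that the payload has the usual Airflow shape with a "status" field per component. The names `unhealthy_components` and `fetch_health` are made up for illustration. As far as I can tell, the "/health" payload has no entry for workers, so a fully healthy response would not rule out dead workers.

```python
import json
from urllib.request import urlopen

# Components mentioned in this post. The /health payload is assumed to
# carry one object per component with a "status" field.
COMPONENTS = ("metadatabase", "scheduler", "triggerer")


def unhealthy_components(payload: dict) -> list[str]:
    """Return the names of components whose status is not 'healthy'."""
    return [
        name
        for name in COMPONENTS
        # A missing or null component entry is treated as unhealthy.
        if (payload.get(name) or {}).get("status") != "healthy"
    ]


def fetch_health(base_url: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON body of <base_url>/health.

    Assumes no authentication is needed, which is an assumption and not
    something confirmed for ADF managed Airflow.
    """
    with urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
        return json.load(resp)
```

Calling `unhealthy_components(fetch_health(base_url))` with the instance's webserver URL returns an empty list when all three components report healthy, which matches what I see even while tasks are failing.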

My main guess is that the Airflow workers/executors died at some point and were not replaced.

I looked through the metrics: average CPU usage peaked at 50% and average memory usage peaked at 70%.

I am running an auto-scaling configuration with a minimum of 3 and a maximum of 8 "Small" nodes.

Right now the only option I seem to have is to restart the Airflow instance as a whole and then manually go through each Airflow task to restart it.
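
To avoid the manual pass, the failed tasks could in principle be cleared in bulk through Airflow's stable REST API (`POST /api/v1/dags/{dag_id}/clearTaskInstances`). The sketch below only builds the request path and JSON body. Whether the ADF managed instance exposes this API and how to authenticate against it are assumptions I have not verified, and `build_clear_request` and `example_dag` are hypothetical names used for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def build_clear_request(dag_id: str, start: datetime, end: datetime,
                        dry_run: bool = True) -> tuple[str, dict]:
    """Build the path and JSON body for clearing failed task instances.

    dry_run defaults to True so the API would only report what it
    would clear instead of clearing it.
    """
    path = f"/api/v1/dags/{quote(dag_id, safe='')}/clearTaskInstances"
    body = {
        "dry_run": dry_run,
        "only_failed": True,  # leave successful task instances alone
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    }
    return path, body


# Example: failed tasks of a hypothetical DAG within a one-day window.
path, body = build_clear_request(
    "example_dag",
    datetime(2024, 7, 20, tzinfo=timezone.utc),
    datetime(2024, 7, 21, tzinfo=timezone.utc),
)
```

The returned body would be sent as JSON to the returned path on the webserver, first with `dry_run` left on to review the affected task instances.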

I would appreciate any help or guidance on how to fix this issue.

I am running Airflow 2.6.3, which seems to be the only version available.

Azure Data Factory