How to automatically rerun a failed pipeline in ADF?

Anuganti Suresh 200 Reputation points
2024-03-20T17:08:30.3433333+00:00

I have a pipeline PL_Final like this:

Lookup --> Execute Pipeline1 (data ingestion) --> Execute Pipeline2 (data flow) --> Execute Pipeline3 (data flow) --> Execute Pipeline4 --> Execute Pipeline5 --> Stored Procedure

A schedule trigger is created for the pipeline on a daily basis.

  • The pipeline failed at Execute Pipeline3.
  • Lookup took 1 min and Execute Pipeline1 took 3.5 hrs to execute.

Query:

  1. How can I automatically rerun a failed pipeline from the point where it failed?
  2. I manually reran the pipeline from Monitor | Pipeline runs | PL_Final | Rerun from failed activity. Since the pipeline failed at Execute Pipeline3, it is supposed to start from that activity and skip the first two activities. Activity 1 (Lookup) was skipped, but unfortunately activity 2 (Execute Pipeline1) ran again. So the problem is that the first run took 3.5 hrs to complete that activity, and the second run is taking the same time again. Why is it running again, and how do I handle this situation?



2 answers

  1. Sina Salam 4,051 Reputation points
    2024-03-20T22:26:16.53+00:00

    Hello @Anuganti Suresh

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Regarding your questions on how to automatically rerun a failed pipeline in ADF, why it is running repeatedly, and how to handle this situation:

    To address this situation, you can use Azure Data Factory's built-in functionality to handle retries and resume execution from the failed activity. The answer to question one will resolve both questions.

    Technically, there are three things you can do:

    1. You can configure a retry policy for the failing activity (Execute Pipeline3) so that it automatically retries in case of failure, specifying the number of retry attempts and the interval between retries. In Azure Data Factory, a retry policy can be configured in a couple of ways, depending on your use case (a code sketch follows this list):
      1. Tumbling window trigger: if you're using a tumbling window trigger, retry capabilities are built into Data Factory, and you can set the retry policy directly in the trigger's configuration. Head over to the trigger's configuration page and set the Retry policy: count (number of retries) and Retry policy: interval in seconds properties according to your requirements. The trigger will then automatically retry pipeline runs that fail due to concurrency, server-limit, or throttling issues (status codes 400: User Error, 429: Too Many Requests, and 500: Internal Server Error).
      2. Activity retry logic: for individual activities within a pipeline, you can define retry behavior using activity policies. The activities section of a pipeline contains two main types of activities, execution activities and control activities, and activity policies affect the runtime behavior of execution activities. The relevant properties for retry configuration are: timeout, how long the activity may run (default is 7 days); retry, the maximum number of retry attempts (default is 0, meaning no retries); and retryIntervalInSeconds, the delay between retry attempts in seconds (default is 30 seconds). These settings control how activities handle retries. Remember that the retry logic depends on the type of trigger you're using and on whether you apply it at the pipeline level or to specific activities, so choose the approach that best fits your scenario.
    2. Ensure that the Execute Pipeline3 activity is required to complete before Execute Pipeline4 runs. This means that even if it fails, it won't trigger the subsequent activities until it completes successfully or exhausts its retry attempts.
      1. On the link from Execute Pipeline3 to the next activity, make sure the dependency condition is set appropriately (for example, Succeeded, or Completed if the next step should run regardless of outcome). This ensures that the subsequent activities won't trigger until Execute Pipeline3 successfully completes or exhausts its retry attempts.
    3. Implement error handling within the Execute Pipeline3 activity to catch and handle specific errors if needed. You can define actions to take based on the type of error encountered.
      1. Still within the Execute Pipeline3 activity settings, configure error-handling options such as logging errors, retrying on specific error types, or taking specific actions based on the error encountered, then save the changes.
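
    To make option 1 concrete, here is a minimal sketch using the Python SDK (azure-mgmt-datafactory). The resource group, factory, trigger name, and subscription ID are placeholders, and the retry counts, intervals, and timeout are illustrative values rather than recommendations. Note that the activity-level policy attaches to execution activities (for example, the data flow activities inside Pipeline3); Execute Pipeline itself is a control activity.

```python
# Sketch only: retry at the trigger level and at the activity level.
# Resource names and the subscription ID below are placeholders.
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ActivityPolicy,
    PipelineReference,
    RetryPolicy,
    TriggerPipelineReference,
    TriggerResource,
    TumblingWindowTrigger,
)

rg, factory = "my-rg", "my-adf"  # hypothetical names
adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# 1) Tumbling window trigger with built-in retries: a failed daily run
#    is retried up to 3 times, 2 minutes apart.
trigger = TumblingWindowTrigger(
    pipeline=TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="PL_Final")
    ),
    frequency="Hour",
    interval=24,  # one 24-hour window per day
    start_time=datetime(2024, 3, 21, tzinfo=timezone.utc),
    max_concurrency=1,
    retry_policy=RetryPolicy(count=3, interval_in_seconds=120),
)
adf.triggers.create_or_update(
    rg, factory, "TR_Daily_PL_Final", TriggerResource(properties=trigger)
)

# 2) Activity-level policy for an execution activity (e.g. the data flow
#    activity inside Pipeline3): two retries, one minute apart, 12 h timeout.
policy = ActivityPolicy(
    timeout="0.12:00:00",          # instead of the 7-day default
    retry=2,                       # default is 0 (no retries)
    retry_interval_in_seconds=60,  # default is 30
)
# Attach `policy` to the execution activity's definition and publish the
# pipeline with adf.pipelines.create_or_update(...).
```

    The trigger-level policy reruns the whole window, while the activity-level policy retries only the failing activity, which is usually the cheaper option when an upstream step like Execute Pipeline1 takes hours.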

    By following the steps above, you will have configured retry policies, completion conditions, error handling, and rerun settings to handle failures and automate recovery in Azure Data Factory. Also see the additional resources on the right side of this page.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    Please remember to "Accept Answer" if answer helped, so that others in the community facing similar issues can easily find the solution.

    Best Regards,

    Sina Salam


  2. Pinaki Ghatak 2,400 Reputation points Microsoft Employee
    2024-05-01T08:26:03.9233333+00:00

    Hello @Anuganti Suresh

    If you need to automatically rerun a failed pipeline from where it failed, you can use the Rerun from failed activity option in the ADF UI monitoring view, or do the same programmatically (a sketch follows below).

    This will start the pipeline from the failed activity and skip the activities that have already completed successfully.
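
    For the programmatic route, here is a minimal sketch with the Python SDK (azure-mgmt-datafactory), using the same placeholder resource names as noted above. create_run accepts reference_pipeline_run_id, is_recovery, and start_from_failure, which together are the API equivalent of the UI's Rerun from failed activity.

```python
# Sketch only: find recent failed runs of PL_Final and rerun each one in
# recovery mode, resuming from the failed activity. Names are placeholders.
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters, RunQueryFilter

rg, factory = "my-rg", "my-adf"
adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Query the last 24 hours of runs for PL_Final that ended in Failed.
now = datetime.now(timezone.utc)
failed_runs = adf.pipeline_runs.query_by_factory(
    rg, factory,
    RunFilterParameters(
        last_updated_after=now - timedelta(days=1),
        last_updated_before=now,
        filters=[
            RunQueryFilter(operand="PipelineName", operator="Equals",
                           values=["PL_Final"]),
            RunQueryFilter(operand="Status", operator="Equals",
                           values=["Failed"]),
        ],
    ),
)

# A recovery run skips activities that already succeeded and restarts at
# the failed activity, so the 3.5-hour Execute Pipeline1 is not redone.
for run in failed_runs.value:
    adf.pipelines.create_run(
        rg, factory, "PL_Final",
        reference_pipeline_run_id=run.run_id,
        is_recovery=True,
        start_from_failure=True,
    )
```

    One way to automate this, assuming your environment allows it, is to run the script on a schedule (for example, from an Azure Function or an Automation runbook) shortly after the daily trigger window.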

    However, in your case, it seems that the second activity (Execute Pipeline1) is running again even though it already completed successfully in the first run. This can happen when the pipeline is triggered again from the beginning instead of being rerun from the failed activity. To avoid this, you can try setting the pipeline's Concurrency property to 1, so that runs are queued and processed first in, first out (FIFO); a sketch follows below.

    This ensures that queued pipeline runs start in order. You can also set the activity-level retry property of the Execute Pipeline1 activity to a low value, so that it retries only a few times before the run moves on to the next activity.

    This will help in reducing the overall execution time of the pipeline.
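
    If you want to try the concurrency suggestion, here is a hedged sketch with the same placeholder names as the sketch above. The pipeline's Concurrency property is an integer cap on simultaneous runs; setting it to 1 queues additional runs, which then start in order.

```python
# Sketch only: cap PL_Final at one concurrent run. Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

rg, factory = "my-rg", "my-adf"
adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Fetch the current definition, set the integer Concurrency property,
# and publish it back; runs beyond the cap are queued, not failed.
pipeline = adf.pipelines.get(rg, factory, "PL_Final")
pipeline.concurrency = 1
adf.pipelines.create_or_update(rg, factory, "PL_Final", pipeline)
```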

    I hope this information helps you in your journey.
