Can we restart an ADF pipeline from the point of failure (record or file)?
Hi, we have a copy activity on which we set the retry option. There are 10 files in the container folder, and the copy failed partway through: 4 files were copied and it failed at the 5th. Will the retry re-copy from file 1 (start from the beginning), or resume from the failure and continue with the 5th file? Note: the dataset type is Binary.
Azure Data Factory
-
phemanth • 15,320 Reputation points • Microsoft External Staff
2024-01-24T10:05:24.85+00:00 Thanks for the question and for using MS Q&A.
Under the activity's general settings, there are options for Retry and Retry interval (in seconds). The default retry value is 0, which means that if your copy activity fails once, it is logged as a failure in the overall pipeline status. As the author of the pipeline, you can, for example, set the retry option to 3 and the retry interval to 60 seconds. This tells Azure Data Factory (ADF) to retry the failed activity after 60 seconds if the copy activity fails; if it keeps failing, ADF will re-execute it up to 3 times, waiting 60 seconds between attempts. A sketch of how this looks in the activity JSON follows.
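As an illustration (not your exact pipeline), the policy block of a Copy activity in the pipeline JSON would look roughly like this; the activity name and the 12-hour timeout are illustrative values:

```json
{
  "name": "CopyBinaryFiles",
  "type": "Copy",
  "policy": {
    "retry": 3,
    "retryIntervalInSeconds": 60,
    "timeout": "0.12:00:00",
    "secureInput": false,
    "secureOutput": false
  }
}
```

With these values, a transient failure triggers up to 3 retries, 60 seconds apart, before the activity is marked failed.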
If you use this setting inside a ForEach loop and the activity fails on the first attempt but succeeds on a retry, the overall status of the ForEach loop will be marked as a success.
You can use the resume from last failed run feature of Azure Data Factory's copy activity to continue copying files from where the last run failed. This feature is available when copying files between file-based data stores, including Amazon S3, Google Cloud Storage, Azure Blob, and Azure Data Lake Storage Gen2, among others. When you retry the copy activity, or manually rerun the failed activity from the pipeline, the copy will continue from where the last run failed. Therefore, in your case, the copy activity will resume from the 5th file. For more information, please refer to the following links:
- Azure Data Factory copy activity supports resume from last failed run: https://azure.microsoft.com/en-in/updates/data-factory-copy-activity-supports-resume-from-last-failed-run/
- Copy activity overview - Azure Data Factory: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview
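For illustration, a binary-to-binary copy activity that qualifies for resume might be defined as in the sketch below; the activity and dataset names are hypothetical, and no extra property is needed to turn resume on, since it applies automatically when the activity is retried or rerun:

```json
{
  "name": "CopyTenBinaryFiles",
  "type": "Copy",
  "policy": {
    "retry": 3,
    "retryIntervalInSeconds": 60
  },
  "inputs": [
    { "referenceName": "SourceBinaryDataset", "type": "DatasetReference" }
  ],
  "outputs": [
    { "referenceName": "SinkBinaryDataset", "type": "DatasetReference" }
  ],
  "typeProperties": {
    "source": {
      "type": "BinarySource",
      "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "recursive": true,
        "wildcardFileName": "*"
      }
    },
    "sink": {
      "type": "BinarySink",
      "storeSettings": {
        "type": "AzureBlobFSWriteSettings"
      }
    }
  }
}
```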
Hope this helps. Do let us know if you have any further queries.
-
phemanth • 15,320 Reputation points • Microsoft External Staff
2024-01-25T10:33:47.43+00:00 @Rajesh Gopisetti We haven't heard back from you on the last response and wanted to check whether you have a resolution yet. If you do, please share it with the community, as it can help others. Otherwise, we will respond with more details and try to help.
-
Rajesh Gopisetti • 5 Reputation points
2024-01-25T10:39:35.26+00:00 I just viewed the link below and found that a rerun will resume for binary files, and only under specific scenarios will the run restart from the point of failure; that is decided internally by Azure itself: https://learn.microsoft.com/en-in/azure/data-factory/copy-activity-overview#resume-from-last-failed-run
- When you copy data from Amazon S3, Azure Blob, Azure Data Lake Storage Gen2 and Google Cloud Storage, copy activity can resume from arbitrary number of copied files. While for the rest of file-based connectors as source, currently copy activity supports resume from a limited number of files, usually at the range of tens of thousands and varies depending on the length of the file paths; files beyond this number will be re-copied during reruns.
For other scenarios than binary file copy, copy activity rerun starts from the beginning.
-
phemanth • 15,320 Reputation points • Microsoft External Staff
2024-01-29T04:45:20.75+00:00 @Rajesh Gopisetti Did you get your issue resolved with the information above?
-
Rajesh Gopisetti • 5 Reputation points
2024-01-31T17:31:20.5933333+00:00 I wasn't able to open the first link; I got a "Bad Request" error when opening it.
-
Rajesh Gopisetti • 5 Reputation points
2024-01-31T17:42:14.88+00:00 - When you copy data from Amazon S3, Azure Blob, Azure Data Lake Storage Gen2 and Google Cloud Storage, copy activity can resume from arbitrary number of copied files. While for the rest of file-based connectors as source, currently copy activity supports resume from a limited number of files, usually at the range of tens of thousands and varies depending on the length of the file paths; files beyond this number will be re-copied during reruns.
I am currently running with an SFTP source.
The target is an ADLS container.
My copy activity has a retry option of 1 and a retry interval of 30 seconds.
Example: my SFTP source has 500 files, and the connection is lost after 10 files have been transferred. Since the retry option is enabled, will the retry start from the 11th file and copy the remaining ones?
I see the comment:
- copy activity can resume from arbitrary number of copied files. While for the rest of file-based connectors as source, currently copy activity supports resume from a limited number of files, usually at the range of tens of thousands and varies depending on the length of the file paths
-> "Limited number of files", "range of tens of thousands", "length of file paths": can you confirm the exact number of files up to which the copy activity can resume for the SFTP connector?
-
phemanth • 15,320 Reputation points • Microsoft External Staff
2024-02-01T18:29:43.72+00:00 @Rajesh Gopisetti
When you use the retry option on a copy activity, it will resume copying from the point of failure: if the copy activity failed after copying 10 files, it will resume from the 11th file.
Regarding the comment you mentioned: it is true that the copy activity can resume from an arbitrary number of already-copied files for some connectors, such as Amazon S3, Azure Blob, Azure Data Lake Storage Gen2, and Google Cloud Storage. For the rest of the file-based connectors as source, which includes SFTP, the copy activity supports resuming from a limited number of files, usually in the range of tens of thousands; the exact number is not fixed and varies depending on the length of the file paths, so there is no single exact figure to confirm for the SFTP connector. A sketch of your scenario follows.
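For your SFTP-to-ADLS scenario, a rough sketch of the copy activity is below, matching your retry of 1 and 30-second interval; the activity and dataset names are hypothetical:

```json
{
  "name": "CopySftpToAdls",
  "type": "Copy",
  "policy": {
    "retry": 1,
    "retryIntervalInSeconds": 30
  },
  "inputs": [
    { "referenceName": "SftpBinaryDataset", "type": "DatasetReference" }
  ],
  "outputs": [
    { "referenceName": "AdlsBinaryDataset", "type": "DatasetReference" }
  ],
  "typeProperties": {
    "source": {
      "type": "BinarySource",
      "storeSettings": {
        "type": "SftpReadSettings",
        "recursive": true,
        "wildcardFileName": "*"
      }
    },
    "sink": {
      "type": "BinarySink",
      "storeSettings": {
        "type": "AzureBlobFSWriteSettings"
      }
    }
  }
}
```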
Please go through:
https://winscp.net/eng/docs/resume
https://learn.microsoft.com/en-us/azure/data-factory/connector-sftp?tabs=data-factory
Please test this by running the copy activity with a small number of files and checking the logs to see which files were copied; a sketch of the log configuration follows below.
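To capture a per-file record, you can enable the copy activity's log settings. A sketch of the logSettings fragment, which goes inside the activity's typeProperties, is below; the linked service name and path here are placeholders you would replace with your own:

```json
"logSettings": {
  "enableCopyActivityLog": true,
  "copyActivityLogSettings": {
    "logLevel": "Info",
    "enableReliableLogging": false
  },
  "logLocationSettings": {
    "linkedServiceName": {
      "referenceName": "AdlsLinkedService",
      "type": "LinkedServiceReference"
    },
    "path": "copy-logs"
  }
}
```

With "logLevel": "Info", the log lists each copied file, so after a retry you can verify whether copying resumed at the 11th file or restarted from the 1st.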
Do let us know if you have any further questions.