How to delete huge files in the ADLS Gen 1 folder using Data Factory

M R, Saravanan SOMWIPRO-SOMWIPRO 5 Reputation points
2023-03-31T07:31:56.6933333+00:00

I'm trying to delete files in an ADLS Gen1 folder using the Data Factory Delete activity. The pipeline fails with the following error.

Failed to execute delete activity with data source 'AzureDataLakeStore' and error 'The request to 'Azure Data Lake Store' failed and the status code is 'BadRequest', request id is '858ea702-8d53-4e65-83ac-4cda95bca10b'. {"exception":"IllegalArgumentException","message":"Directory too large, Exceeded max enumerate entries 1000000. Please use pagination to list contents of this directory. [858ea702-8d53-4e65-83ac-4cda95bca10b][2023-03-31T00:14:28.5128250-07:00]","javaClassName":"java.lang.IllegalArgumentException"}} The remote server returned an error: (400) Bad Request.'. For details, please reference log file here:

Please recommend a solution for the issue.

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. Suba Balaji 11,206 Reputation points
    2023-04-01T11:14:54.2633333+00:00

    Hi @M R, Saravanan SOMWIPRO-SOMWIPRO

    You could try running the AzCopy remove command from the command line in the Azure portal.

    Please check the syntax and usage here:

    https://learn.microsoft.com/en-us/azure/storage/common/storage-ref-azcopy-remove

    If that works, you can run the same command from ADF using an Azure Batch (Custom) activity.

    Kindly write back if that helped or if you need further clarification on this.

    BR,

    Suba
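
The AzCopy suggestion above could be scripted roughly as follows. This is only a sketch: the target URL is a placeholder, the `azcopy` binary is assumed to be on the PATH and already authenticated, and this thread does not confirm that AzCopy can address an ADLS Gen1 account, so check that against the linked reference first. The helper names `build_azcopy_remove` and `run_azcopy_remove` are made up for illustration.

```python
import shlex
import subprocess


def build_azcopy_remove(target_url, recursive=True):
    """Build the argument list for an `azcopy remove` call (hypothetical helper)."""
    cmd = ["azcopy", "remove", target_url]
    if recursive:
        # --recursive tells AzCopy to descend into subdirectories.
        cmd.append("--recursive=true")
    return cmd


def run_azcopy_remove(target_url, recursive=True):
    """Run the command and return its exit code (0 means AzCopy reported success)."""
    cmd = build_azcopy_remove(target_url, recursive)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode


# Example call (placeholder URL, not a real account):
# run_azcopy_remove("https://<account>.blob.core.windows.net/<container>/<folder>")
```

Keeping the command construction separate from the execution makes it easy to log or review the exact command before a script or Batch task runs it.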


  2. AnnuKumari-MSFT 32,161 Reputation points Microsoft Employee
    2023-04-05T07:41:27.3633333+00:00

    @M R, Saravanan SOMWIPRO-SOMWIPRO, thank you for using the Microsoft Q&A platform and for posting your question here.

    In addition to the community suggestion above, you can try the approach in the post below, which uses the Python SDK to delete files in ADLS:

    https://stackoverflow.com/questions/63475269/how-do-you-delete-a-file-from-an-azure-data-lake-using-the-python-sdk

    Kindly check it and accept the answer if it's helpful.
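
A minimal sketch of the Python SDK route mentioned above, assuming the `azure-datalake-store` package (the Gen1 client library) and a service principal with delete permission on the folder. The tenant, client, secret, store name and folder path are placeholders, and `connect` and `delete_folder` are illustrative helper names. Whether a single recursive delete avoids the 1,000,000-entry enumeration limit from the error is an assumption, so try it on a small test folder first.

```python
def connect(tenant_id, client_id, client_secret, store_name):
    """Return an AzureDLFileSystem client for an ADLS Gen1 account."""
    # Imported lazily so delete_folder can be exercised without the SDK installed.
    from azure.datalake.store import core, lib

    token = lib.auth(
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret,
        resource="https://datalake.azure.net/",
    )
    return core.AzureDLFileSystem(token, store_name=store_name)


def delete_folder(fs, folder):
    """Delete `folder` and everything under it; return True on success."""
    try:
        # rm(path, recursive=True) asks the store to remove the whole tree.
        fs.rm(folder, recursive=True)
        return True
    except Exception as exc:  # Report the failure instead of crashing the caller.
        print(f"Delete of {folder} failed: {exc}")
        return False


# Example usage (all values are placeholders):
# fs = connect("<tenant-id>", "<client-id>", "<client-secret>", "<store-name>")
# delete_folder(fs, "/path/to/huge/folder")
```

`delete_folder` only needs an object with an `rm(path, recursive=...)` method, so it can be dry-run against a stub before pointing it at real data.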