running into error "Operation on target Get Metadata1 failed: The length of execution output is over limit (around 4MB currently)" in ADF

Tinashe Chinyati 221 Reputation points
2020-12-06T11:06:50.627+00:00

Greetings
I am new to Data Factory and need some assistance. I am copying files from on-premises storage to ADLS Gen2. I have created a copy pipeline with the following sequence:

  1. Get Metadata activity (returning childItems)
  2. ForEach activity
  3. Inside the ForEach, a Set Variable activity (which splits the filename and extracts the date) followed by a Copy activity

The files have this format: D_OGY_20200916_094812_00.CSV. I created a dummy data folder containing about 4,000 files for different dates, and I can copy from source to destination with this pipeline without any problem. The trouble comes when I point the dataset at the production storage, where there are far more files (new files arrive every minute; approximately 120k+ in total): I run into this error message: "Operation on target Get Metadata1 failed: The length of execution output is over limit (around 4MB currently)". The pipeline otherwise works as expected, since I copy and sink using the date-time stamp in the filename. Is there a way to work around this error? Thanks
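For reference, the date extraction the Set Variable activity performs amounts to splitting the filename on underscores and taking the third token. Here is a minimal Python sketch of that logic (the function name is illustrative; in ADF itself this would be a split()/substring() pipeline expression):

```python
def extract_date(filename: str) -> str:
    """Extract the yyyyMMdd stamp from names like D_OGY_20200916_094812_00.CSV.

    Splitting on '_' yields ['D', 'OGY', '20200916', '094812', '00.CSV'];
    the third token is the date portion used to route the copy.
    """
    return filename.split("_")[2]


assert extract_date("D_OGY_20200916_094812_00.CSV") == "20200916"
```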


Accepted answer

    HarithaMaddi-MSFT 10,136 Reputation points
    2020-12-08T07:51:54.003+00:00

    Hi @Tinashe Chinyati ,

    Welcome to Microsoft Q&A Platform. Thanks for posting the query.

    Yes, this is currently a limitation of the Get Metadata activity: it cannot return results larger than about 4 MB. One possible approach is to reorganize the source so that files land in separate folders of fewer than 5,000 files each, which separate Get Metadata activities can then read. Since that requires a change on the source side, another approach is to use Azure Functions to retrieve the list of files and pass it to the ForEach activity; alternatively, an Azure Function can implement the entire requirement.

    The maximum size of returned metadata is around 4 MB.

    Ref: storage-blobs-list, process-blob-files-automatically-using-an-azure-function-with-blob-trigger
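    To illustrate the Azure Functions approach: the sketch below lists blobs under a given name prefix and returns only their names, so the pipeline (calling it via an Azure Function activity that feeds the ForEach) never receives more output than it asks for. This is a minimal sketch, not a drop-in solution; the "landing" container name is an assumption, and AzureWebJobsStorage is the standard Functions storage connection setting:

    ```python
    import json
    import os

    import azure.functions as func
    from azure.storage.blob import ContainerClient


    def main(req: func.HttpRequest) -> func.HttpResponse:
        # Date prefix passed by the pipeline, e.g. "D_OGY_20200916"
        prefix = req.params.get("prefix", "")

        # "landing" is an assumed container name for the incoming CSV files.
        container = ContainerClient.from_connection_string(
            os.environ["AzureWebJobsStorage"], container_name="landing"
        )

        # list_blobs pages through results server-side, so 120k+ blobs are
        # fine; filtering by prefix keeps each response well under 4 MB.
        names = [b.name for b in container.list_blobs(name_starts_with=prefix)]

        return func.HttpResponse(
            json.dumps({"childItems": names}), mimetype="application/json"
        )
    ```

    The pipeline would call this once per date prefix (for example, built from the Set Variable output) and iterate the returned childItems in the ForEach, in place of the oversized Get Metadata output.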

    I would recommend submitting an idea in the feedback forum to remove this limitation; the forum is closely monitored by the Data Factory product team, and ideas are considered for future releases.

    Please let us know if you have further queries, and we will be glad to assist.

