Transferring a large payload to an API and reading the response

Hadi Chahine 0 Reputation points
2023-10-27T07:18:24.1166667+00:00

I have a large file (>150 MB, JSON lines) to transfer to an API. The API returns an "id" for tracking the processing of this file. I can't do this with a "Lookup" feeding a "Web activity", because Lookups cannot load such a large payload. "Copy Data" doesn't work either, because it doesn't output the response from the API (I need to access the returned "id"). Is there any way of getting this done purely in ADF?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

Sort by: Most helpful
  1. QuantumCache 20,366 Reputation points Moderator
    2023-10-27T17:12:27.4033333+00:00

    @Hadi Chahine Welcome to the Q&A forum.

    1. Do you want to post the entire 150 MB file to the API at once, or is it possible to post the payload in chunks?
    2. How large is the response from the API?
    3. Have you tried posting the payload to the API endpoint outside of ADF, for example with Postman or a traditional web app? (See the sketch after this list.)
    4. What is your motivation or requirement for using only ADF in this scenario?
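
    On point 3, something like the following minimal Python sketch can confirm, outside ADF, that the endpoint accepts a single large streamed upload. The URL, content type, and "id" field are assumptions, not confirmed details of your API:

    ```python
    import requests

    # Hypothetical endpoint; adjust to your API.
    URL = "https://example.com/api/ingest"

    # Passing the open file object streams the request body, so the full
    # 150 MB JSON-lines file never has to be loaded into memory at once.
    with open("payload.jsonl", "rb") as f:
        resp = requests.post(
            URL,
            data=f,
            headers={"Content-Type": "application/x-ndjson"},
            timeout=600,
        )

    resp.raise_for_status()
    print(resp.json()["id"])  # assumes the tracking id is returned as "id"
    ```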

    Web activity in Azure Data Factory and Azure Synapse Analytics


    Limitations and workarounds
    Here are some limitations of the Lookup activity and suggested workarounds; most relevant here, the Lookup activity output is capped at 5,000 rows and 4 MB, far below a 150 MB payload.

    Data Factory limits
    Some related info: "Payload is too large"
    Error message: "The payload including configurations on activity/dataSet/linked service is too large. Please check if you have settings with very large value and try to reduce its size."

    Cause: The payload for each activity run includes the activity configuration, the configurations of any associated datasets and linked services, and a small portion of system properties generated per activity type. This payload is limited to 896 KB, as noted in the Azure limits documentation for Data Factory and Azure Synapse Analytics.

    Recommendation: You likely hit this limit because you pass one or more large parameter values, either from an upstream activity's output or from an external source, especially if you pass actual data across activities in the control flow. Check whether you can reduce the size of those parameter values, or tune the pipeline logic to avoid passing such values across activities and instead handle the data inside the activity.
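
    To put the numbers in perspective, a quick back-of-the-envelope check (using only the figures from this thread) shows why the file content cannot travel as an activity parameter:

    ```python
    # ADF caps the per-activity-run payload (configs + parameters) at 896 KB.
    limit_kb = 896
    file_kb = 150 * 1024       # the ~150 MB file, expressed in KB
    print(file_kb / limit_kb)  # ~171 -> roughly 170x over the limit
    ```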


  2. Hadi Chahine 0 Reputation points
    2023-10-30T07:54:12.02+00:00

    Hi Satish,

    Thanks for taking the time to reply to my question. The API I'm calling has to receive the file in a single call (no chunking). The response is very small (it contains an id). The call is currently made inside a function app, but I would like to get rid of any code outside of ADF for deployment reasons, and I want to consolidate the process within ADF as much as possible. The function app we have now gets the data from blob storage, PUTs it to the API, and returns the response back to the pipeline. I'd like something that lives completely within ADF.
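
    For context, here is roughly what that function does today (a minimal sketch; the connection string, URL, container, and blob names are placeholders):

    ```python
    import azure.functions as func
    import requests
    from azure.storage.blob import BlobClient

    API_URL = "https://example.com/api/ingest"  # placeholder
    CONN_STR = "<storage-connection-string>"    # placeholder

    def main(req: func.HttpRequest) -> func.HttpResponse:
        blob = BlobClient.from_connection_string(
            CONN_STR, container_name="input", blob_name="payload.jsonl")
        # chunks() yields the blob piece by piece, so the 150 MB file is
        # streamed to the API rather than held fully in memory.
        resp = requests.put(
            API_URL,
            data=blob.download_blob().chunks(),
            headers={"Content-Type": "application/x-ndjson"},
            timeout=600,
        )
        resp.raise_for_status()
        # The API's response is small: just the tracking id.
        return func.HttpResponse(resp.json()["id"])
    ```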

    Thanks. Let me know if that's possible. It sounds like a "Copy Data" activity should do it, but the problem with "Copy Data" is that it cannot return the response from the sink.

