Azure Data Factory performance problem caused by managed private endpoints

Joao Ferreira 0 Reputation points
2023-07-12T10:42:56.6+00:00

Hello,

I have an architecture that uses a storage account as the file source (input), a data factory to process the files, and Cosmos DB to receive the processed files.

I also have an Azure pipeline that runs all of my Data Factory pipelines. On a public network the execution takes 1 hour, but on a private network using managed private endpoints it takes 5 hours or more.

I would like to know the reason for this and whether the problem can be mitigated.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Bhargava-MSFT 29,266 Reputation points Microsoft Employee
    2023-07-12T22:34:24.3233333+00:00

    Hello Joao Ferreira,

    Welcome to the Microsoft Q&A platform.

    One possible reason is that traffic over your private network has higher network latency than traffic over the public network.

    In addition, if you are using the copy activity, the following applies (copied from the documentation page):
    By default, every copy activity spins up a new compute based upon the configuration in copy activity. With managed virtual network enabled, cold computes start-up time takes a few minutes and data movement can't start until it's complete. If your pipelines contain multiple sequential copy activities or you have many copy activities in foreach loop and can’t run them all in parallel, you can enable a time to live (TTL) value in the Azure integration runtime configuration. Specifying a time to live value and DIU numbers required for the copy activity keeps the corresponding computes alive for a certain period of time after its execution completes. If a new copy activity starts during the TTL time, it will reuse the existing computes, and start-up time will be greatly reduced. After the second copy activity completes, the computes will again stay alive for the TTL time.

    Please see the below document for more details.

    https://learn.microsoft.com/en-us/azure/data-factory/managed-virtual-network-private-endpoint

    I hope this helps. Please let me know if you have any further questions.
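
To make the TTL point in the answer concrete, here is a rough back-of-the-envelope sketch of how cold starts add up when copy activities run one after another. It is only an illustration: the function name, the 3-minute start-up time, the 1-minute transfer time, and the count of 60 copies are all hypothetical and are not measurements from the pipelines in the question. It also assumes each activity starts before the TTL window of the previous one expires.

```python
def sequential_copy_minutes(num_copies, startup_min, transfer_min, ttl_enabled):
    """Estimate wall-clock minutes for copy activities that run one after another.

    Without TTL, every activity pays the cold start-up time.
    With TTL, only the first activity pays it, assuming each following
    activity starts while the compute is still being kept alive.
    """
    if num_copies <= 0:
        return 0.0
    cold_starts = 1 if ttl_enabled else num_copies
    return cold_starts * startup_min + num_copies * transfer_min


# Hypothetical workload: 60 sequential copies, 3 min cold start,
# 1 min of actual data movement per copy.
without_ttl = sequential_copy_minutes(60, 3.0, 1.0, ttl_enabled=False)
with_ttl = sequential_copy_minutes(60, 3.0, 1.0, ttl_enabled=True)
print(f"without TTL: {without_ttl:.0f} min, with TTL: {with_ttl:.0f} min")
```

With these made-up numbers, start-up time dominates the run when TTL is off, which is why the quoted documentation suggests enabling TTL for many sequential copy activities. Comparing the queue or start-up duration with the transfer duration in the copy activity run details should show whether this pattern applies to a given pipeline.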
