Azure Data Factory automatic pagination for results with more than 5K records

Roberto Castano 0 Reputation points
2023-11-09T13:42:52.66+00:00

Hi,

I'm working on a project that uses ADF as the ETL layer to copy data from Dataverse to an Azure SQL DB. I was reading about the limitations of the Dataverse connector (80 MB and 5K records for a single query). However, while running some tests, I noticed that with a Dataverse connection and a FetchXML query, a Dataflow can retrieve more than 5K records without any additional logic.

Is this behavior by design in ADF? Do the limitations above not apply to ADF?

Thanks,

Roberto


1 answer

  1. phemanth 12,400 Reputation points Microsoft Vendor
    2023-11-10T12:16:38.52+00:00

    @Roberto Castano

    Thanks for using MS Q&A

    The limitations you mentioned (80 MB and 5K records for a single query) generally apply to the Dataverse connector. With a FetchXML query in a Dataflow, however, they may not apply in the same way.

    As you observed, a FetchXML query can return more than 5K records without additional logic. This is because ADF pages through the Dataverse results, retrieving the data in batches of 5K records. ADF handles this paging automatically for FetchXML queries, so you do not need to build any pagination logic yourself.
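To illustrate what "automatic paging" means here, the following is a minimal sketch in Python, not ADF's actual implementation (which is not shown in this thread). It assumes the general FetchXML paging pattern, where the client requests one page at a time and passes the paging cookie from the previous response along with the next page number. The names `fetch_all`, `make_fake_dataverse`, and `PAGE_SIZE` are hypothetical, and the Dataverse side is simulated in memory so the example runs without a connection.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Per-request row cap discussed in the question (5K records per query).
PAGE_SIZE = 5000

# One page of results: (rows, paging cookie for the next request, more records?)
Page = Tuple[List[Dict], Optional[str], bool]


def fetch_all(fetch_page: Callable[[int, Optional[str]], Page]) -> Iterator[Dict]:
    """Yield every row by requesting pages until the source reports no more.

    `fetch_page` is a hypothetical callable standing in for one request
    against Dataverse; it receives the page number and the previous cookie.
    """
    page_number = 1
    cookie: Optional[str] = None
    while True:
        rows, cookie, more_records = fetch_page(page_number, cookie)
        yield from rows
        if not more_records:
            break
        page_number += 1


def make_fake_dataverse(total_rows: int) -> Callable[[int, Optional[str]], Page]:
    """Build an in-memory stand-in that returns at most PAGE_SIZE rows per call."""
    table = [{"id": i} for i in range(total_rows)]

    def fetch_page(page_number: int, cookie: Optional[str]) -> Page:
        start = (page_number - 1) * PAGE_SIZE
        rows = table[start:start + PAGE_SIZE]
        more_records = start + PAGE_SIZE < total_rows
        # Assumption: the cookie is treated as opaque and only echoed back.
        next_cookie = f"cookie-after-page-{page_number}" if more_records else None
        return rows, next_cookie, more_records

    return fetch_page


if __name__ == "__main__":
    rows = list(fetch_all(make_fake_dataverse(12_345)))
    print(len(rows))
```

Each individual request in this sketch still respects the 5K cap. Only the loop around it makes the total appear unlimited to the caller, which would be consistent with what the Dataflow shows in the tests described in the question.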

    Note that the behavior you are observing may depend on implementation details of the ADF Dataverse connector or on how ADF handles FetchXML queries in Dataflows.

    Reference: https://learn.microsoft.com/en-us/azure/data-factory/connector-dynamics-crm-office-365?tabs=data-factory

    I hope this helps! Please let us know if you have any further questions.

