SAP Table to Azure Data Lake - Partition Option

Satwik Kashyap 11 Reputation points
2022-03-18T23:25:06.747+00:00

Scenario:

  1. Copy data from SAP table to Azure Data Lake
  2. Use the 'Partition option' in the Copy data activity, because without it we get a memory error in SAP when pulling data

Questions:

=> How can I load bulk data if the SAP table has no partition key?

=> How are 'Partition upper bound' and 'Partition lower bound' used to partition the files internally? How is the number of partitions decided internally?

E.g., suppose I have 2 years of data in a table and it has a column storing the date as 'YYYYMM'. If I partition 'on Calendar Month' with that column, will these 2 years of data always produce 24 partitions (one per month)? Is my understanding correct?

=> Can Azure load the data from SAP tables in batches (of a given number of rows)?

=> Can Copy data automatically detect the key to partition on for SAP table as a source?

Any help on this will be appreciated. I have read the documentation but not a lot is available.

Azure Synapse Analytics
Azure Data Factory

1 answer

  1. ShaikMaheer-MSFT 38,126 Reputation points Microsoft Employee
    2022-03-22T15:36:06.627+00:00

    Hi @Satwik Kashyap ,

    Thank you for posting your query on the Microsoft Q&A platform.

    If your SAP table has a large volume of data, such as several billion rows, use partitionOption and partitionSettings to split the data into smaller partitions. In this case, the data is read per partition, and each data partition is retrieved from your SAP server via a single RFC call.

    Taking partitionOption set to PartitionOnInt as an example, the number of rows in each partition is calculated with this formula: (total rows falling between partitionLowerBound and partitionUpperBound) / maxPartitionsNumber.
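
To make the formula in the answer concrete, here is a small Python sketch of the arithmetic only. It does not call any Azure or SAP API. The function name `rows_per_partition` and the sample numbers are hypothetical, and rounding up to a whole row is an assumption, since the answer does not say how fractional results are handled.

```python
import math


def rows_per_partition(rows_in_range: int, max_partitions_number: int) -> int:
    """Estimate rows per partition using the formula quoted in the answer.

    rows_in_range: total rows whose partition-column value falls between
        partitionLowerBound and partitionUpperBound.
    max_partitions_number: the maxPartitionsNumber setting.
    """
    if rows_in_range < 0:
        raise ValueError("rows_in_range must not be negative")
    if max_partitions_number <= 0:
        raise ValueError("max_partitions_number must be positive")
    # Assumption: round up so that every row is covered by some partition.
    return math.ceil(rows_in_range / max_partitions_number)


# Hypothetical example: 1,000,000 rows in range, at most 500 partitions.
print(rows_per_partition(1_000_000, 500))
```

With these sample values each partition, and therefore each RFC call, would carry about 2,000 rows. Raising maxPartitionsNumber makes each partition smaller, which is the lever to try when a single call exhausts memory on the SAP side.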