Best way to ingest data from SAP BSEG table

Srinivas, Shreya 0 Reputation points
2025-06-25T10:00:20.27+00:00

I need to extract the following columns from the BSEG table: BUKRS, BELNR, SHKZG, HKONT, DMBTR, SGTXT, PRCTR, GJAHR

I’ve tried:

  • Filtering by HKONT values
  • Partitioning by GJAHR (fiscal year)
  • Combining both filters

However, every attempt results in memory errors or timeouts, most likely due to the sheer size of the BSEG table.

What is the most efficient and reliable way to extract this data from BSEG without hitting memory issues?

Azure Data Factory

1 answer

  1. Smaran Thoomu 24,260 Reputation points Microsoft External Staff Moderator
    2025-06-25T10:37:54.76+00:00

    Hi @Srinivas, Shreya
    Thanks for detailing your requirement and the challenges you're facing - BSEG is indeed one of the largest and most performance-sensitive tables in SAP, so your issue is a common one.

    Recommended Approach for Ingesting Data from BSEG

    To avoid memory issues and timeouts, consider the following strategies:

    Use SAP ODP (Operational Data Provisioning) via ADF SAP CDC Connector (Preferred)

    1. If you're using Azure Data Factory’s SAP CDC connector, use an ODP-enabled extractor (such as 0FI_GL_4 or a custom ODP view).
    2. These extractors are optimized for delta handling and for filtering at the source level.
    3. You can prefilter on HKONT and GJAHR before extraction, reducing memory load significantly (a dataset sketch follows the pros and cons below).

    Pros: Handles large volumes efficiently, supports delta loads, SAP-supported

    Cons: May require ABAP extractors if not already available
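    For reference, a minimal sketch of the corresponding dataset definition is below. Names such as BsegGlLineItems and SapCdcLinkedService are placeholders, and the example assumes the standard SAPI context for the 0FI_GL_4 DataSource - verify the exact context and object name against your own connection:

        {
            "name": "BsegGlLineItems",
            "properties": {
                "type": "SapOdpResource",
                "linkedServiceName": {
                    "referenceName": "SapCdcLinkedService",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "context": "SAPI",
                    "objectName": "0FI_GL_4"
                }
            }
        }

    This dataset is then used as the source of a mapping data flow, where you project only the eight columns you need and define source partitions on HKONT/GJAHR so the initial load is both parallelized and filtered on the SAP side.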

Use RFC Table-Based Extraction with Careful Pushdown Filters

    If you're using ADF’s SAP Table connector (RFC-based), apply strict filtering and row-based paging:

    • Restrict both the column list and the rows at the source. Conceptually, the extraction should be equivalent to:
        SELECT BUKRS, BELNR, SHKZG, HKONT, DMBTR, SGTXT, PRCTR, GJAHR
        FROM BSEG
        WHERE HKONT IN ('XXXX', 'YYYY') AND GJAHR >= '2024'
      (The SAP Table connector expresses this as a field list plus a WHERE clause - rfcTableFields and rfcTableOptions - rather than a full SELECT statement.)
    • Partition on a finer-grained key (e.g., BELNR number ranges, or fiscal year plus posting period where a period field is available) rather than GJAHR alone; a single fiscal year can still return a very large dataset per partition.
    • Limit each extraction chunk to a manageable size (e.g., 50,000–100,000 rows per call). A sketch of such a source configuration follows this list.
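    For illustration, a copy activity source along these lines (filter values and bounds are placeholders; the property names follow the SAP Table connector's documented options, but verify them against your connector version) might look like:

        "source": {
            "type": "SapTableSource",
            "rfcTableFields": "BUKRS,BELNR,SHKZG,HKONT,DMBTR,SGTXT,PRCTR,GJAHR",
            "rfcTableOptions": "HKONT IN ('XXXX', 'YYYY') AND GJAHR GE '2024'",
            "batchSize": 50000,
            "partitionOption": "PartitionOnCalendarYear",
            "partitionSettings": {
                "partitionColumnName": "GJAHR",
                "partitionLowerBound": "2024",
                "partitionUpperBound": "2025",
                "maxPartitionsNumber": 4
            }
        }

    Here rfcTableFields prunes the columns, rfcTableOptions is the WHERE clause pushed down to SAP, batchSize caps the rows returned per RFC call, and maxPartitionsNumber limits parallelism so the SAP application server isn't overloaded.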

    Use Intermediate Staging in SAP (optional but scalable)

    If the volume is too high even for partitioned ADF extraction:

    Ask the SAP team to create a custom Z-table or ABAP CDS view that:

    • Includes only the needed columns
    • Applies the HKONT and GJAHR filters
    • Flattens or joins related data if needed

    ADF can then extract from this smaller, filtered dataset more efficiently; a minimal view sketch follows.
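    For illustration only, such a CDS view could be sketched as follows (ZBSEGSLIM / Z_Bseg_Slim and the filter values are placeholders; the extraction annotation assumes an SAP release that supports ODP extraction of CDS views):

        @AbapCatalog.sqlViewName: 'ZBSEGSLIM'
        @AccessControl.authorizationCheck: #NOT_REQUIRED
        @Analytics.dataExtraction.enabled: true // exposes the view to ODP-based extraction
        define view Z_Bseg_Slim
          as select from bseg
        {
              // only the columns ADF actually needs
          key bukrs,
          key belnr,
          key gjahr,
          key buzei, // line-item number, kept so rows stay unique
              shkzg,
              hkont,
              dmbtr,
              sgtxt,
              prctr
        }
        where hkont in ('XXXX', 'YYYY')
          and gjahr >= '2024'

    If delta loads are needed later, the SAP team can add the corresponding @Analytics.dataExtraction.delta annotations, after which the same view can be consumed through the SAP CDC connector from the first option (context ABAP_CDS).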

    I hope this information helps. Please do let us know if you have any further queries.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

