What is the best way to Divide up a Table for access to ADF due to Look Up Limitation

Mike Kiser 1,536 Reputation points
2021-06-04T19:42:17.897+00:00

Hello!
@Nasreen Akter @MartinJaffer-MSFT @KranthiPakala-MSFT

I have a DB Table the size of the following. We must load ALL employees however my Lookups will only take about 3000 rows (huge number of columns). What is the best way for me to break this table of 8452 rows into batches of 3000 rows so I can run my pipelines.
Thanks!
Mike

102598-image.png

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,640 questions
{count} votes

1 answer

Sort by: Most helpful
  1. HimanshuSinha-msft 19,491 Reputation points Microsoft Employee Moderator
    2021-06-16T16:54:36.587+00:00

    Hello @Mike Kiser ,
    Thanks for the ask and using the Microsoft Q&A platform .
    As @MarkKromer-MSFT suggested you can use ADF mapping data flow , but in case if you want to use ADF pipeline you can try to use the the waterrmark logic .

    1. Create a watermark table at the Source . Let the table have the field may be a employeeid /somedate ( depends on which field you want to use )
    2. Read the value of the watermark table using a lookup .
    3. In your pipeline add a for loop .
    4. Add a lookup activity , *** The select statement should be dynamic something on the lines 'SELECT top 5000 fieldnames where watermark = LookupActivityinStep2.value"
    5. Your lookup should have all the values for 5000 records . Now you can perform your logic here . If you intend to use an inner loop here ( which is not supported ) please use a execute pipeline and implement the FE In the child pipeline
    6. Now get the max values of the watermark field and update the watermark table .
    7. In the next iteration the dynamic expression which only pick rows greater the value which is in the watermark table .
      Hope this helps

    Please do let me know how it goes .
    Thanks
    Himanshu
    Please do consider clicking on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.