How to get data from Azure Data lake to on premises SQL DB incrementally?

Palash Aich 21 Reputation points
2021-03-04T13:41:24.867+00:00

Hello there, I need to pull data from AZURE data lake to on premises SQL DB. I need to look for option to pull incremental data based on a date field in AZURE data lake. What is the best way we can pull data from data lake? Can we use SSIS? If SSIS, what configuration and connectivity we need please? Thanks, Palash

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,348 questions
SQL Server Integration Services
SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.
2,456 questions
0 comments No comments
{count} votes

4 answers

Sort by: Most helpful
  1. Monalv-MSFT 5,896 Reputation points
    2021-03-05T05:46:53.06+00:00

    Hi @Palash Aich ,

    A SQL Server Integration Services (SSIS) package can use the Azure Data Lake Store Connection Manager to connect to an Azure Data Lake Storage Gen1 account with one of the two following authentication types.

    Please refer to the following links:
    1.Azure Data Lake Store Connection Manager
    2.Azure Feature Pack for Integration Services (SSIS)

    Best regards,
    Mona

    ----------

    If the answer is helpful, please click "Accept Answer" and upvote it.

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.

  2. Vaibhav Chaudhari 38,606 Reputation points
    2021-03-04T15:15:19.737+00:00

    One option you can think about -

    In SQL Server - Create external table that points to your data in Azure data lake and run stored procedure to push the delta data to final table

    From the document - https://learn.microsoft.com/en-us/sql/relational-databases/polybase/polybase-guide?view=sql-server-ver15#polybase-uses

    Import data from Azure Blob Storage, or Azure Data Lake Store. Leverage the speed of Microsoft SQL's columnstore technology and analysis capabilities by importing data from Hadoop, Azure Blob Storage, or Azure Data Lake Store into relational tables. There is no need for a separate ETL or import tool.

    Create external table details - https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql?view=sql-server-ver15&tabs=dedicated

    ----------

    Please don't forget to Accept Answer and Up-vote if the response helped -- Vaibhav


  3. Palash Aich 21 Reputation points
    2021-03-30T13:59:54.243+00:00

    Hi anonymous userChaudhari @Saurabh Sharma @Monalv-MSFT

    Can you please help? We need to decide if the PolyBase will be a good option to connect from SQL Server On Prem to ADLS Gen2 CDM?

    Or will the ADF will be better option to pull data from ADLS Gen2 CDM to SQL Server On Prem, however, yet to get an option for incremental data pulling using ADF from ADLS Gen 2 CDM.

    0 comments No comments

  4. jiax 1 Reputation point
    2022-01-11T10:49:26.027+00:00

    Hi,

    Have you been able to find a solution to read with ADLS Gen2 CMD polybase to import into SQL on Prem?

    How is an example of ADLS Gen2 connection.

    Thanks!!

    0 comments No comments