Can we configure PolyBase to get data into on-premises SQL Server from Azure Data Lake Gen 2 Common Data Model (CDM)?

Palash Aich 21 Reputation points
2021-03-30T06:08:36.123+00:00

Hello there, I need to get data from Azure Data Lake Storage Gen 2 Common Data Model (CDM) into an on-premises SQL Server. It seems we could configure PolyBase and avoid any ETL. Can PolyBase be configured to bring Common Data Model data in Azure Data Lake Storage Gen 2 into an on-premises SQL Server? If yes, what are the steps? Thanks, Palash

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. MartinJaffer-MSFT 26,021 Reputation points
    2021-03-31T17:39:06.63+00:00

    Hello @Palash Aich and welcome to Microsoft Q&A.

    The main difficulty with your request is that CDM is read through Data Flow, but Data Flow does not work with on-premises data stores. Copy Activity does work with on-premises stores.

    This leads me to recommend doing this in two steps. First, use Data Flow to copy from the CDM to a staging location (such as delimited text in Data Lake). Then use Copy Activity to read from staging, and use PolyBase to load into your on-premises SQL Server.

    If you know exactly where the files are in the Data Lake, it may be possible to use a Copy Activity directly without using CDM. I'm not very familiar with CDM, so this is just a 'maybe'.
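
To illustrate the "know exactly where the files are" idea: a CDM folder written in the `model.json` style carries a metadata file that lists each entity's columns and the locations of its data files. The sketch below is only an illustration under that assumption (a CDM folder may instead use manifest files, which are laid out differently). The sample document, storage URL, entity names, and the helper `list_entity_files` are all hypothetical.

```python
import json

# Hypothetical, trimmed-down model.json content, for illustration only.
# A real file has more fields; the account and paths here are made up.
SAMPLE_MODEL_JSON = """
{
  "name": "SampleModel",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "Customer",
      "attributes": [
        {"name": "CustomerId", "dataType": "int64"},
        {"name": "CustomerName", "dataType": "string"}
      ],
      "partitions": [
        {
          "name": "Customer-part-0",
          "location": "https://example.dfs.core.windows.net/cdm/Customer/part-0.csv"
        }
      ]
    },
    {
      "$type": "LocalEntity",
      "name": "Order",
      "attributes": [{"name": "OrderId", "dataType": "int64"}],
      "partitions": []
    }
  ]
}
"""


def list_entity_files(model_text):
    """Map each entity name to its column names and data file locations."""
    model = json.loads(model_text)
    layout = {}
    for entity in model.get("entities", []):
        # Column names come from the metadata, not from the data files.
        columns = [attr["name"] for attr in entity.get("attributes", [])]
        # Each partition entry points at one data file for the entity.
        files = [
            part["location"]
            for part in entity.get("partitions", [])
            if "location" in part
        ]
        layout[entity["name"]] = {"columns": columns, "files": files}
    return layout


if __name__ == "__main__":
    for name, info in list_entity_files(SAMPLE_MODEL_JSON).items():
        print(name, info["columns"], info["files"])
```

With the file locations and column names in hand, a Copy Activity could in principle be pointed at those files directly. Because the column names live in the metadata file, do not assume the data files themselves carry a header row.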


  2. MartinJaffer-MSFT 26,021 Reputation points
    2021-04-02T16:41:24.993+00:00

    Okay, @PalashAich-7056, for going directly from on-premises SQL Server to Azure Storage, I found a relevant document.

    This article talks about Blob Storage. Data Lake Gen 2 is built on top of Blob Storage, but it is not the same thing. I think there is enough compatibility to use the blob protocol against ADLS Gen2 for reading. I have less confidence about writing.

    Upon further searching, I found documentation stating that ADLS Gen2 is not supported.