Copy Data activity from Synapse serverless SQL pool to ADLS Gen2

rajanisqldev-42 206 Reputation points
2024-03-06T17:28:48.3066667+00:00

Hi,

I have a Synapse serverless SQL pool configured against an ADLS Gen2 account whose container is populated with files from D365 via Synapse Link.

I need to build a data warehouse in Azure.

We have very tight budget restrictions and don't want to go the dedicated SQL pool route. Instead, we are interested in serverless and a data lakehouse.

I first need to copy each serverless SQL entity into ADLS as Parquet files for better performance.

I have created a pipeline that does this with a single Copy Data activity, configured to handle all entities in Synapse. It is working.

However, whenever I run the pipeline, new files are created every time instead of overwriting the existing ones, which will duplicate the data with each run. How can I tell the Copy Data activity's sink to overwrite before copying?

Thanks in advance

[Attachments: ADF_1.png, ADF_2.png]

Tags: Azure Data Lake Storage, Azure Synapse Analytics, Azure Data Factory

1 answer

  1. AnnuKumari-MSFT 30,751 Reputation points Microsoft Employee
    2024-03-07T17:48:07.77+00:00

    Hi rajanisqldev-42 ,

    Could you please share a screenshot of your sink dataset?

    It seems like you might not have provided a filename for the output file, so the sink is auto-generating a filename and creating a new file each time the pipeline runs.

    If you give the file a constant name, it will be overwritten on the next run, since a file with the same name will already be present in the folder.

    Hope it helps. Let me know if you have any queries.
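
    For reference, here is a minimal sketch of what such a Parquet sink dataset could look like in JSON, with the file name derived from a dataset parameter so every run resolves to the same path and overwrites the previous output. All names here (`AdlsParquetSink`, `AdlsGen2LinkedService`, the `warehouse` file system, the `entityName` parameter) are illustrative assumptions, not values from your pipeline:

    ```json
    {
      "name": "AdlsParquetSink",
      "properties": {
        "type": "Parquet",
        "linkedServiceName": {
          "referenceName": "AdlsGen2LinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "entityName": { "type": "string" }
        },
        "typeProperties": {
          "location": {
            "type": "AzureBlobFSLocation",
            "fileSystem": "warehouse",
            "folderPath": {
              "value": "@concat('parquet/', dataset().entityName)",
              "type": "Expression"
            },
            "fileName": {
              "value": "@concat(dataset().entityName, '.parquet')",
              "type": "Expression"
            }
          }
        }
      }
    }
    ```

    Because `fileName` resolves to the same value on every run for a given entity, the Copy activity replaces the existing file instead of auto-generating a new, uniquely named one.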