How to specify table distribution for auto create table

Lavanya Bhajanthri 200 Reputation points
2023-11-05T18:37:57.0833333+00:00

I am configuring a copy activity pipeline within Azure Data Factory to transfer data from a Parquet file in a blob storage to a table in Azure Synapse Analytics (formerly known as ADW). The destination table in Synapse is determined dynamically by a parameter in the pipeline, resulting in a new table with a name like ‘[schema].[some_prefix_(pipeline_string_parameter)]’. Since this table doesn’t exist beforehand, I’m utilizing the ‘auto create table’ option.

However, I also need to specify particular distribution and persistence settings. In my case, I aim to define a hash-distributed heap table, which differs from the default configurations of CCI and round-robin. How can I set these custom configurations when using the ‘auto create table’ feature?”

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,639 questions
0 comments No comments
{count} votes

Accepted answer
  1. Nandan Hegde 36,151 Reputation points MVP Volunteer Moderator
    2023-11-06T03:53:04.8566667+00:00

    Unfortunately,

    Auto create table option in copy activity sink only gives default distribution i,e. ROUND_ROBIN.

    Alternatives :

    1. If your sink table schema is different every time, then after copy activity, create a duplicate table for your table with your desired distribution using CTAS. After creating duplicate table, drop your original Auto table and rename it as original table
    2. If your sink table schema is same for every time and if you know the schema, then create table before copying the data. Give your desired distributions in the create table query. For executing the query before copy, you can use Script activity before copy activity (or) pre-copy script in the copy activity sink.

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.