Can we apply partitioning and distributions simultaneously to a big data table?

Bose, Dibyendu 1 Reputation point
2022-06-15T04:42:03.617+00:00

I need some clarity on the concepts of partitions and distributions. We have a fact table 'SalesInvoice' in a Synapse dedicated pool consisting of more than 100 million records. The table is hash distributed on the 'RegionID' column. Can we apply further partitioning to improve query performance? If yes, how can we do that?

Azure SQL Database
Azure Synapse Analytics

1 answer

  1. ShaikMaheer-MSFT 38,551 Reputation points Microsoft Employee Moderator
    2022-06-16T09:30:32.063+00:00

    Hi @Anonymous ,

    Thank you for posting your query on the Microsoft Q&A platform.

    Distribution is the concept of spreading your table's data across different nodes (small databases). You can think of it as a hardware-level concept, where the data is distributed to different machines (nodes). Data warehouses use an MPP (massively parallel processing) architecture, which is what allows table data to be distributed. Please check the link below to understand the MPP architecture.
    https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/overview-architecture
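
As a rough illustration of hash distribution, here is a minimal Python sketch. It assumes the 60 distributions that a dedicated SQL pool uses for every table; the CRC32 hash is only a stand-in, since the hash function Synapse actually uses is internal, and the 10 regions with 1,000 rows each are made-up sample data.

```python
import zlib
from collections import Counter

# A dedicated SQL pool splits every table into 60 distributions.
DISTRIBUTION_COUNT = 60


def assign_distribution(key: int) -> int:
    """Map a distribution-column value to a distribution id.

    CRC32 is a stand-in here; the real Synapse hash function is internal.
    """
    return zlib.crc32(str(key).encode("utf-8")) % DISTRIBUTION_COUNT


def rows_per_distribution(keys) -> Counter:
    """Count how many rows land in each distribution."""
    return Counter(assign_distribution(k) for k in keys)


# Hypothetical data: 10 distinct RegionID values, 1,000 rows each.
region_ids = [region for region in range(1, 11) for _ in range(1000)]
layout = rows_per_distribution(region_ids)

print(len(layout), "of", DISTRIBUTION_COUNT, "distributions hold data")
```

Every row with the same RegionID lands in the same distribution, so a column with only a few distinct values can fill at most that many of the 60 distributions. That is worth checking for a table hash distributed on 'RegionID'.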

    The video "Azure Synapse SQL Architecture" also helps with understanding the architecture, and it explains distributions further. The Azure Synapse documentation on distributions covers the topic in more detail.

    Now, coming to partitions: partitioning divides your data into smaller groups. You can think of it as a storage/software-level concept. Partitioning is supported on all dedicated SQL pool table types, including clustered columnstore, clustered index, and heap. It is also supported on all distribution types, including both hash and round-robin distributed tables. So yes, you can partition a table that is already hash distributed.
    See the documentation on table partitions in Azure Synapse for details.
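
Because partitions are created inside each of the 60 distributions, every partition you add makes each slice of data smaller. The sketch below works through that arithmetic for a table of about 100 million rows. It assumes rows are spread evenly across distributions and that the table is a clustered columnstore (the question does not say), for which the Microsoft guidance is at least 1 million rows per distribution and partition. The function names are hypothetical.

```python
# A dedicated SQL pool splits every table into 60 distributions.
DISTRIBUTION_COUNT = 60
# Guidance for clustered columnstore tables: at least 1 million rows
# per distribution and partition.
MIN_ROWS_PER_SLICE = 1_000_000


def rows_per_slice(total_rows: int, partition_count: int) -> float:
    """Average rows in one (distribution, partition) slice, assuming an even spread."""
    return total_rows / (DISTRIBUTION_COUNT * partition_count)


def max_partitions(total_rows: int) -> int:
    """Largest partition count that keeps every slice at or above the minimum."""
    return total_rows // (DISTRIBUTION_COUNT * MIN_ROWS_PER_SLICE)


total = 100_000_000  # roughly the size of the 'SalesInvoice' table
for partitions in (1, 2, 12):
    print(partitions, "partition(s):", round(rows_per_slice(total, partitions)), "rows per slice")

print("partitions that still meet the guidance:", max_partitions(total))
```

Under these assumptions, 100 million rows only leave room for a single partition before the slices drop below 1 million rows, so adding many partitions to a table of this size may hurt columnstore performance rather than help it.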

    The Stack Overflow post below also has a good discussion of distribution versus partitioning. Kindly have a look.
    https://stackoverflow.com/questions/51677471/what-is-a-difference-between-table-distribution-and-table-partition-in-sql

    Hope this helps. Please let us know if you have any further queries.

    ----------

    Please consider hitting Accept Answer. Accepted answers help the community as well.

