How does tempDB fit into the architecture?

JenP 50 Reputation points
2023-10-09T17:00:35.2433333+00:00

So tempDB is a shared resources set as 399GB per 100DWU. Does that mean that any of the pool's operation that need tempDB will write to a single tempDB somewhere, or that the Control + all Compute nodes have their own tempDB where their specific operations just dynamically request a piece of that 399GB pie?

I've really tried to puzzle out how this should work, and it sure seems like it'd just be one big tempDB that the nodes use as needed; that way, any data movement would be from a specific node to/from tempDB. However, the diagram of the architecture that everyone uses indicates that the DMS is something between the nodes themselves, so maybe it actually is smaller tempDBs?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,696 questions
0 comments No comments
{count} vote

2 answers

Sort by: Most helpful
  1. AnnuKumari-MSFT 32,161 Reputation points Microsoft Employee
    2023-10-12T08:05:45.41+00:00

    Hi JenP ,

    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    As per my understanding, you are investigating about how tempDB works in synapse SQL DW and how it fits into the architecture of synapse.

    In Synapse SQL DW, there is a distributed architecture that supports MPP (Massively Parallel Processing) which uses distributed TempDB. In SQL DW, each compute node has its own local storage, including a local version of tempDB. These local tempDBs are not shared across nodes. Each node uses its local tempDB for temporary storage during query execution.

    When a query is executed in a SQL pool, the compute nodes work together to process the query. Each compute node has its own local storage, but tempDB is a shared resource that is used by all compute nodes. This means that any operation that requires tempDB will write to the same tempDB, regardless of which compute node is executing the operation.

    The Distributed Management Service (DMS) is responsible for coordinating the activities of the compute nodes in a SQL pool. The DMS manages the distribution of data across the compute nodes and ensures that each node has access to the data it needs to execute queries. However, the DMS does not manage tempDB directly. Instead, tempDB is managed by the SQL Server engine running on each compute node.

    In summary, tempDB is a shared resource that is used by all compute nodes in a SQL pool. Each compute node has its own local storage, but tempDB is a shared resource that is used by all nodes. The DMS is responsible for coordinating the activities of the compute nodes, but does not manage tempDB directly.

    The scope of tempory tables is within the session. When a new session is created, no temporary tables would be automatically deleted. So, the space would be freed up in every new session .

    If you are using dedicated sql pool, max size for tempdb is 399 GB per DW100c.

    In case of serverless sql pool, The number of temporary tables is limited to 100, and their total size is limited to 100 MB.

    Hope it helps. Thankyou


  2. JenP 50 Reputation points
    2023-10-18T16:13:40.22+00:00

    @AnnuKumari-MSFT Here's an example of why I'm confused about Synapse just having 1 shared tempDB.

    In the sample below, you can see that during data movement in one of our queries, it says that the Compute nodes are creating a temp table in tempDB and the distribution_type is AllComputeNodes (example: step 5). Then the Compute nodes broadcast a query from All Distributions into that temp table just created (example: step 6)

    Unless I'm reading that wrong, it looks like the temp table is being created in the SQL Server engine running on each compute node, as opposed to some shared tempDB somewhere.

    (As indicated in the documentation for sys.dm_pdw_request_steps, I'm using the Analytics Platform System (PDW) product documentation to understand the various operation_types)

    User's image