Synapse dedicated sqlpool partition switch: error trying to get the partition number based on a value or boundary

Question

Synapse dedicated sqlpool partition switch: error trying to get the partition number based on a value or boundary

Miguel Buenestado 41

I am trying partition switching in a dedicated SQL pool in Synapse.

Both tables, staging (temp) and destiny (final), have in common the same partition distribution, based on monthly cuts (e.g., 20190101, 20190201, 20190301 and so on).
I have already got the data in the staging table, ready for the partition to be swapped.
The staging table will vary through each data load: today you could have a month worth of data, tomorrow you may get two months of data...

Which means that the switch criteria will be dynamic.

Thus, I must identify which partitions need to be switched, that is, the partition numbers based on the boundary values given in the staging table
To do so, I was trying the following query,

SELECT DISTINCT s.[name]                        AS      [schema_name]  
,           t.[name]                        AS      [table_name]  
,           i.[name]                        AS      [index_name]  
,           p.[partition_number]            AS      [partition_number]  
--,           SUM(a.[used_pages]*8.0)         AS      [partition_size_kb]  
--,           SUM(a.[used_pages]*8.0)/1024    AS      [partition_size_mb]  
--,           SUM(a.[used_pages]*8.0)/1048576 AS      [partition_size_gb]  
,           p.[rows]                        AS      [partition_row_count]  
,           rv.[value]                      AS      [partition_boundary_value]  
,           p.[data_compression_desc]       AS      [partition_compression_desc]  
FROM sys.schemas s  
INNER JOIN  sys.tables t                    ON      t.[schema_id]         = s.[schema_id]  
INNER JOIN  sys.partitions p                ON      p.[object_id]         = t.[object_id]  
--JOIN        sys.allocation_units a          ON      a.[container_id]      = p.[partition_id]  
INNER JOIN  sys.indexes i                   ON      i.[object_id]         = p.[object_id]  
AND     i.[index_id]          = p.[index_id]  
INNER JOIN  sys.data_spaces ds              ON      ds.[data_space_id]    = i.[data_space_id]  
INNER JOIN  sys.partition_schemes ps        ON      ps.[data_space_id]    = ds.[data_space_id]  
INNER JOIN  sys.partition_functions pf      ON      pf.[function_id]      = ps.[function_id]  
INNER JOIN  sys.partition_range_values rv   ON      rv.[function_id]      = pf.[function_id]  
AND     rv.[boundary_id]      = p.[partition_number]  
  
          
WHERE       p.[index_id] <=1 AND  
t.[name] = 'STAGING_XXXX' AND  
cast(rv.[value] as nvarchar(128)) IN (  
SELECT  DISTINCT d_boundary_key AS partition_key  
FROM DIM_DATE dat  
INNER JOIN STAGING_XXXX stg  
ON dat.c_date_id = stg.c_date_id  
)  
GROUP BY    s.[name]  
,           t.[name]  
,           i.[name]  
,           p.[partition_number]  
,           p.[rows]  
,           rv.[value]  
,           p.[data_compression_desc]  
ORDER BY 1

but it appears I cannot join the SYS information with my DATA information, as I am getting the following error:

Unsupported data type error. Statement references a data type that is unsupported in Parallel Data Warehouse, or there is an expression that yields an unsupported data type. Modify the statement and re-execute it.

I tried several join methods (INNER JOIN, IN clauses) and different CAST and CONVERT functions, none of which resulted.

Can you provide a proper method to achieve this dynamic partition switching this in Synapse please?

Thank you in advance.

Answer accepted by question author

1 additional answer

Your answer

Answer 1

Hello @Miguel Buenestado ,
Thanks for the question and using MS Q&A platform.
As we understand the ask here is "Unsupported data type error. Statement references a data type that is unsupported in Parallel Data Warehouse, or there is an expression that yields an unsupported data type" . Modify the statement and re-execute it., please do let us know if its not accurate.

I am assuming that the error is on the line 26 . When I checked rv.[value] the datatype is SQL_Variant .

and from the doc here i see that we do can convert SQL_variant to varchar : https://learn.microsoft.com/en-us/sql/t-sql/data-types/sql-variant-transact-sql?view=sql-server-ver16

Can you please try to create a table ( for testing ) and perfomr a INSERT with the below logic .

SELECT DISTINCT d_boundary_key AS partition_key
FROM DIM_DATE dat
INNER JOIN STAGING_XXXX stg
ON dat.c_date_id = stg.c_date_id

and then try to perform the JOIN .

Also can you please share the schema details for DIM_DATE & STAGING_XXXX tables ?

Please do let me if you have any queries.
Thanks
Himanshu

Please don't forget to click on or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

Answer 2

Miguel Buenestado 41

Hi @HimanshuSinha ,

Yes indeed, the problem lies with the SQLVariant rv.value field.

Your workaround seems satisfactory: I tried the INSERT you suggested (that is, a temp table just holding the partition details) and then I was able to perform the JOIN statement with this temporary table successfully.

Thanks,
Miguel

HimanshuSinha 19,547 Reputation points Microsoft Employee Moderator

2022-09-28T18:07:24.997+00:00

Hello @Miguel Buenestado ,
Great . Happy to help ;)
Request you to go ahead and "accept:" my answer .
If you want you can take the survey also .
Thanks
Himanshu

Share via

Synapse dedicated sqlpool partition switch: error trying to get the partition number based on a value or boundary

1 additional answer

Your answer