COUNT function not working in Azure Synapse

Nik - Shahriar Nikkhah 21 Reputation points
2022-01-19T00:14:39.987+00:00

Hi

we have a table in Synapse dedicated pool like

CREATE TABLE [dbo].[tlbInvoiceHeader]
(
[DimInvoiceID] [int] IDENTITY(1,1) NOT NULL,
[InvoiceID] [int] NOT NULL,
[InvoiceNumber] varchar NOT NULL,
[InvoiceDate] datetime2 NULL,
.
.
.
CONSTRAINT [tlbInvoiceHeader_PK] PRIMARY KEY NONCLUSTERED
(
[DimInvoiceID] ASC
) NOT ENFORCED
)
WITH
(
DISTRIBUTION = HASH ( [InvoiceNumber] ),
CLUSTERED COLUMNSTORE INDEX
)
GO

We have had a stored procedure (SP) that inserted data into the table, The developer had used "SET IDENTITY_INSERT" ON and OFF, why? please don't ask me, I know what you are thinking, anyways because of that we have duplicated records in the identity field ([DimInvoiceID]), and I have removed the "SET IDENTITY_INSERT" from the SP

To find and check for the duplicate I used
select [DimInvoiceID], count()
from [dbo].[tlbInvoiceHeader]
group by [DimInvoiceHeaderID]
having count(
) > 1
order by [DimInvoiceID] desc

the results are empty ? but we know that we have over 100 duplicated records, why doesn't COUNT work? this is a very very simple GROUP BY and COUNT statement

So I to wanted to make sure that I haven't missed anything I used CTE and it worked and showed me the duplicated records

the CTE was something like .....
;WITH CTE
AS (
SELECT [DuplicateCount] = ROW_NUMBER() OVER(PARTITION BY [Field1KeyRelated] ORDER BY [Field1KeyRelated])
, *
FROM [dbo].[tblXYZ]
)
SELECT *
FROM CTE
WHERE [DuplicateCount] > 1
;

it worked, which is fine

Question why is that the COUNT in a simple select statement does not work/display duplicated records, why and how?

Please help me to understand

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,373 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Saurabh Sharma 23,846 Reputation points Microsoft Employee Moderator
    2022-01-26T01:01:08.703+00:00

    Hi @Nik - Shahriar Nikkhah ,
    As per internal discussion in your case what’s skewing the operation is the unique constraint "not enforced" declaration in the schema. Since there is really no constraints that can be applied in you schema, you can remove it completely and just declare the necessary indexes for performance requirements. You can try to experiment that and see the results.
    Please check Primary, foreign, and unique keys - Azure Synapse Analytics documentation for details.

    Please let me know if you have any other questions.

    Thanks
    Saurabh


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.