Hi
we have a table in Synapse dedicated pool like
CREATE TABLE [dbo].[tlbInvoiceHeader]
(
[DimInvoiceID] [int] IDENTITY(1,1) NOT NULL,
[InvoiceID] [int] NOT NULL,
[InvoiceNumber] varchar NOT NULL,
[InvoiceDate] datetime2 NULL,
.
.
.
CONSTRAINT [tlbInvoiceHeader_PK] PRIMARY KEY NONCLUSTERED
(
[DimInvoiceID] ASC
) NOT ENFORCED
)
WITH
(
DISTRIBUTION = HASH ( [InvoiceNumber] ),
CLUSTERED COLUMNSTORE INDEX
)
GO
We have had a stored procedure (SP) that inserted data into the table, The developer had used "SET IDENTITY_INSERT" ON and OFF, why? please don't ask me, I know what you are thinking, anyways because of that we have duplicated records in the identity field ([DimInvoiceID]), and I have removed the "SET IDENTITY_INSERT" from the SP
To find and check for the duplicate I used
select [DimInvoiceID], count()
from [dbo].[tlbInvoiceHeader]
group by [DimInvoiceHeaderID]
having count() > 1
order by [DimInvoiceID] desc
the results are empty ? but we know that we have over 100 duplicated records, why doesn't COUNT work? this is a very very simple GROUP BY and COUNT statement
So I to wanted to make sure that I haven't missed anything I used CTE and it worked and showed me the duplicated records
the CTE was something like .....
;WITH CTE
AS (
SELECT [DuplicateCount] = ROW_NUMBER() OVER(PARTITION BY [Field1KeyRelated] ORDER BY [Field1KeyRelated])
, *
FROM [dbo].[tblXYZ]
)
SELECT *
FROM CTE
WHERE [DuplicateCount] > 1
;
it worked, which is fine
Question why is that the COUNT in a simple select statement does not work/display duplicated records, why and how?
Please help me to understand