SQL Server How to find columns names which has unique value

Question

SQL Server How to find columns names which has unique value

T.Zacks 3,996

Suppose a table has been given to me to create index. below is a table example where i know on which field i should create composite index. in the below table i know those fields name by which we can identify each row uniquely. those fields are
Ticker, ClientCode, Earnings, PrePost

CREATE TABLE [dbo].[tblLastCSMDelivered](
 [ID] [int] IDENTITY(1,1) NOT NULL,
 [Ticker] [varchar](10) NULL,
 [ClientCode] [varchar](10) NULL,
 [EarningTickerID] [int] NULL,
 [AllDateMapName] [varchar](max) NULL,
 [Earnings] [varchar](10) NULL,
 [PrePost] [varchar](10) NULL,
 [QC-ViewAllContent] [xml] NULL,
 [QCCommentsContent] [xml] NULL,
 [LastCSMDeliveredDate] [datetime] NULL,
 [InsertedOn] [datetime] NULL,
 [Action] [char](1) NULL,
 [UserName] [varchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

i have given a table which has many columns where i need to create a non-cluster composite index. so first of all i have to know those fields name in that table which uniquely identify each row.

i got these two url looks bit similar
https://stackoverflow.com/questions/26199765/sql-query-to-determine-that-values-in-a-column-are-unique
https://stackoverflow.com/questions/6941049/how-to-find-out-whether-a-table-has-some-unique-columns

so please tell me what sql i should issue to know those fields name where i have to create a non-cluster composite index. please guide me. thanks

Accepted answer

2 additional answers

Your answer

Answer 1

Hi @T.Zacks ,

Please check below T-SQL;

DECLARE @table varchar(100), @sql varchar(max);  
SET @table = 'some table name';  
  
SELECT  
  @sql = COALESCE(@sql + ', ', '') + ColumnExpression  
FROM (  
  SELECT  
    ColumnExpression =  
      'CASE COUNT(DISTINCT ' + COLUMN_NAME + ') ' +  
      'WHEN COUNT(*) THEN ''UNIQUE'' ' +  
      'ELSE '''' ' +  
      'END AS ' + COLUMN_NAME  
  FROM INFORMATION_SCHEMA.COLUMNS  
  WHERE TABLE_NAME = @table  
) s  
  
SET @sql = 'SELECT ' + @sql + ' FROM ' + @table;  
PRINT @sql;  /* in case you want to have a look at the resulting query */  
EXEC(@sql);

It simply compares COUNT(DISTINCT column) with COUNT(*) for every column. The result will be a table with a single row, where every column will contain the value UNIQUE for those columns that do not have duplicates, and empty string if duplicates are present.
But the above solution will work correctly only for those columns that do not have NULLs. It should be noted that SQL Server does not ignore NULLs when you want to create a unique constraint/index on a column. If a column contains just one NULL and all other values are unique, you can still create a unique constraint on the column (you cannot make it a primary key, though, which requires both uniquness of values and absence of NULLs).

Therefore you might need a more thorough analysis of the contents, which you could get with the following script:

DECLARE @table varchar(100), @sql varchar(max);  
SET @table = 'some table name';  
  
SELECT  
  @sql = COALESCE(@sql + ', ', '') + ColumnExpression  
FROM (  
  SELECT  
    ColumnExpression =  
      'CASE COUNT(DISTINCT ' + COLUMN_NAME + ') ' +  
      'WHEN COUNT(*) THEN ''UNIQUE'' ' +  
      'WHEN COUNT(*) - 1 THEN ' +  
        'CASE COUNT(DISTINCT ' + COLUMN_NAME + ') ' +  
        'WHEN COUNT(' + COLUMN_NAME + ') THEN ''UNIQUE WITH SINGLE NULL'' ' +  
        'ELSE '''' ' +  
        'END ' +  
      'WHEN COUNT(' + COLUMN_NAME + ') THEN ''UNIQUE with NULLs'' ' +  
      'ELSE '''' ' +  
      'END AS ' + COLUMN_NAME  
  FROM INFORMATION_SCHEMA.COLUMNS  
  WHERE TABLE_NAME = @table  
) s  
  
SET @sql = 'SELECT ' + @sql + ' FROM ' + @table;  
PRINT @sql;  /* in case you still want to have a look at the resulting query */  
EXEC(@sql);

This solution takes NULLs into account by checking three values: COUNT(DISTINCT column), COUNT(column) and COUNT(*). It displays the results similarly to the former solution, but the possible diagnoses for the columns are more diverse:

•UNIQUE--- means no duplicate values and no NULLs (can either be a PK or have a unique constraint/index);
•UNIQUE WITH SINGLE NULL --- as can be guessed, no duplicates, but there's one NULL (cannot be a PK, but can have a unique constraint/index);
•UNIQUE with NULLs ---no duplicates, two or more NULLs ;
•empty string --- there are duplicates, possibly NULLs too.

Refer to this similar thread.

If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".

T.Zacks 3,996 Reputation points

2022-02-01T07:48:52.65+00:00

@CathyJi-MSFT Thank you for your help.
T.Zacks 3,996 Reputation points

2022-02-01T17:29:55.02+00:00

Sir i have run your second set of script which return below error

when this line execute then it is giving error
Msg 102, Level 15, State 1, Line 1
Incorrect syntax near '-'.

this is my table structure for which i ran your script.

CREATE TABLE [dbo].[tblLastCSMDelivered]( [ID] [int] IDENTITY(1,1) NOT NULL, [Ticker] [varchar](10) NULL, [ClientCode] [varchar](10) NULL, [EarningTickerID] [int] NULL, [AllDateMapName] [varchar](max) NULL, [Earnings] [varchar](10) NULL, [PrePost] [varchar](10) NULL, [QC-ViewAllContent] [xml] NULL, [QCCommentsContent] [xml] NULL, [LastCSMDeliveredDate] [datetime] NULL, [InsertedOn] [datetime] NULL, [Action] [char](1) NULL, [UserName] [varchar](max) NULL ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO

please test your second script with my table given above then you can see the error.

please have a look at your 2nd script and tell me why i am getting this error when executing your second set of script. thanks
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2022-02-01T22:05:09.447+00:00

Sir i have run your second set of script which return below error.

Sir? I can't speak for Cathy, but myself I don't like being called Sir, if for a different reason why Cathy may object.

Anyway, there were a couple of issues in her script:

Concatenation relies an method which has no defined correct result and may not work. Since we have string_agg since SQL 2017, why not use that.

Apply quotename around column names, since some columns have characters not permitted in identifiers.

Filter out xml columns, which can't be indexed anyway.

Make sure the query is split up over multiple lines, so it's readable.

Now the query runs. However, I doubt that you will get anything useful out of it.

Answer 2

I'm sorry the question does not make any sense. The way to get information is to ask the stakeholders about what should be unique. Even if you find that column sets (a,b,c), (a, f, g), (c, g, h, i, j k) all have unique values, how would you that this is only by chance?

If you want to run queries, you can do:

SELECT a, b, c, COUNT(*)
FROM   tbl
GROUP BY a,b,c
HAVING COUNT(*) > 1

If that query return rows, it is not a candidate. Now, say that there are 20 columns, not count LOB columns that cannot be in an index anyway. That's 2^20-1 queries. And if the table is of any size...

You see where this is going. Into a dead end.

Answer 3

Tom Phillips 17,771

Indexes should be created to support queries. You should not "create a non-cluster composite index" when you don't know what fields should be in the index.

Share via

SQL Server How to find columns names which has unique value

2 additional answers

Your answer