CHECKSUM (Transact-SQL)
Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance Azure Synapse Analytics SQL analytics endpoint in Microsoft Fabric Warehouse in Microsoft Fabric
The CHECKSUM
function returns the checksum value computed over a table row, or over an expression list. Use CHECKSUM
to build hash indexes.
Transact-SQL syntax conventions
Syntax
CHECKSUM ( * | expression [ ,...n ] )
Note
This syntax is not supported by serverless SQL pool in Azure Synapse Analytics.
Arguments
*
This argument specifies that the checksum computation covers all table columns. CHECKSUM
returns an error if any column has a noncomparable data type. Noncomparable data types include:
- cursor
- image
- ntext
- text
- XML
Another noncomparable data type is sql_variant with any one of the preceding data types as its base type.
expression
An expression of any type, except a noncomparable data type.
Return types
int
Remarks
CHECKSUM
computes a hash value, called the checksum, over its argument list. Use this hash value to build hash indexes. A hash index will result if the CHECKSUM
function has column arguments, and an index is built over the computed CHECKSUM
value. This can be used for equality searches over the columns.
The CHECKSUM
function satisfies hash function properties: CHECKSUM
applied over any two lists of expressions will return the same value, if the corresponding elements of the two lists have the same data type, and if those corresponding elements have equality when compared using the equals (=) operator. Null values of a specified type are defined to compare as equal for CHECKSUM
function purposes. If at least one of the values in the expression list changes, the list checksum will probably change. However, this is not guaranteed.
Therefore, to detect whether values have changed, we recommend use of CHECKSUM
only if your application can tolerate an occasional missed change. Otherwise, consider using HASHBYTES
instead. With a specified MD5 hash algorithm, the probability that HASHBYTES
will return the same result, for two different inputs, is much lower compared to CHECKSUM
.
CHECKSUM
ignores the nchar and nvarchar dash character (N'-'
or nchar(45)
). Therefore, a hash collision is guaranteed for any two strings where the only differences are dashes. Put another way, Select checksum(nchar(45));
and Select checksum(N'-');
both return a value of 0
, so they have no effect on the hash of any additional characters in the string or any additional data in the checksum list.
Practical problems:
- Checksum ignores negative signature in numeric string
SELECT CHECKSUM(N'1'), CHECKSUM(N'-1');
- A checksum comparison cannot detect that code was commented-out in stored proc definition
CREATE PROCEDURE Checksum_Test AS
BEGIN
RAISERROR('Error Raised',18,1);
RETURN 1;
END
GO
-- get checksum for original proc definition.
SELECT
checksum(definition),
definition
FROM sys.sql_modules
WHERE object_id = object_id('Checksum_Test');
GO
-- comment out a line of code in the proc.
ALTER PROCEDURE Checksum_Test AS
BEGIN
--RAISERROR('Error Raised',18,1);
RETURN 1;
END
GO
-- get checksum for altered proc definition. Note the definition text now includes the -- comment dashes.
SELECT
checksum(definition),
definition
FROM sys.sql_modules
WHERE object_id = object_id('Checksum_Test');
DROP PROCEDURE Checksum_Test
CHECKSUM
trims trailing spaces from nchar and nvarchar strings. The effect is the same as the problem of ignored dashes.
The expression order affects the computed CHECKSUM
value. The order of columns used for CHECKSUM(*)
is the order of columns specified in the table or view definition. This includes computed columns.
The CHECKSUM
value depends on the collation. The same value stored with a different collation will return a different CHECKSUM
value.
CHECKSUM ()
does not guarantee unique results.
Examples
These examples show the use of CHECKSUM
to build hash indexes.
To build the hash index, the first example adds a computed checksum column to the table we want to index. It then builds an index on the checksum column.
-- Create a checksum index.
SET ARITHABORT ON;
USE AdventureWorks2022;
GO
ALTER TABLE Production.Product
ADD cs_Pname AS CHECKSUM(Name);
GO
CREATE INDEX Pname_index ON Production.Product (cs_Pname);
GO
This example shows the use of a checksum index as a hash index. This can help improve indexing speed when the column to index is a long character column. The checksum index can be used for equality searches.
/*Use the index in a SELECT query. Add a second search
condition to catch stray cases where checksums match,
but the values are not the same.*/
SELECT *
FROM Production.Product
WHERE CHECKSUM(N'Bearing Ball') = cs_Pname
AND Name = N'Bearing Ball';
GO
Index creation on the computed column materializes the checksum column, and any changes to the ProductName
value will propagate to the checksum column. Alternatively, we could build an index directly on the column we want to index. However, for long key values, a regular index will probably not perform as well as a checksum index.
See also
CHECKSUM_AGG (Transact-SQL)
HASHBYTES (Transact-SQL)
BINARY_CHECKSUM (Transact-SQL)