question

ahmedsalah-1628 avatar image
0 Votes"
ahmedsalah-1628 asked LiHongMSFT-3908 commented

are using checksum for hashing numbers with sticks will make issue ?

i work on sql server 2017 i need to use check sum for hashing but im afraid if i got stuck or issue when compare

my data all will be numbers only or numbers with stick

example of my data

121
121|145
155|233|2231

so are check sum hashing with these number will make issue for me

when comparing for that numbers from database ?

sql-server-generalsql-server-transact-sql
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

While asking a question you need to provide a minimal reproducible example:
(1) DDL and sample data population, i.e. CREATE table(s) plus INSERT, T-SQL statements.
(2) What you need to do, i.e. logic, and your attempt implementation of it in T-SQL.
(3) Desired output based on the sample data in the #1 above.
(4) Your SQL Server version (SELECT @@version;)

0 Votes 0 ·
OlafHelper-2800 avatar image
0 Votes"
OlafHelper-2800 answered LiHongMSFT-3908 commented

Not clear what your issue is, may can you explain it more detailed, pülease?
A hash value is nothing unique, if it's that what you mean.

· 2
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

suppose i have table have column

idsticks as varchar(200) has value as below
121|233|112

and i need to store id for this text

so i will use checksum
as
checksum('121|233|112')
so are checksum are gurantee
or will have different issue when generating numbers for text

i I only will hashing numbers or numbers with stick

0 Votes 0 ·

Hi @ahmedsalah-1628

so are checksum are gurantee or will have different issue when generating numbers for text

Please refer to the link offered by Olaf, as you can see from the doc, CHECKSUM () does not guarantee unique results.

Best regards,
LiHong

0 Votes 0 ·
OlafHelper-2800 avatar image
1 Vote"
OlafHelper-2800 answered ErlandSommarskog commented

so are checksum are gurantee

the function CHECKSUM returns an integer value, so max 4 G different values, so as I wrote it's not a unique value, different values can return the same checkum.
See
https://docs.microsoft.com/en-us/sql/t-sql/functions/checksum-transact-sql?view=sql-server-ver15
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Here some titbits that I've calculated:

If you take 100000 random 32-bit numbers, the probality for at least one duplicate is 0.3.

If you take a million random 32-bit, the probabilty that all are unique is 1E-50, that is, extremely small.

To add insult to injury, checksum uses a simplistic XOR algorithm, which may increase the likelihood for duplicates.

0 Votes 0 ·
TomPhillips-1744 avatar image
1 Vote"
TomPhillips-1744 answered

"155|233|2231" is not "numbers with sticks", it is a string. SQL Server, and checksum, has no knowledge there are numbers separated by |.

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.