How to identify similar numbers from database before insert

Abhinav Reddy Muthyala 1 Reputation point
2022-11-19T17:10:34.037+00:00

I have a table with 20 million rows of phone numbers, which is synced to Azure Cognitive Search. Before creating a new row, I need to check whether a similar number already exists. Let's say I have a record with the number "1234567890". If a request comes in with the number "1234567899" or "1234557890", it differs by only one digit from the record already in the database, and I want to detect that. Is there any way to do this in MSSQL, Cognitive Search, another Azure service, or any other tool?

Thanks in advance

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

1 answer

  1. SnehaAgrawal-MSFT 18,286 Reputation points
    2022-11-22T17:21:48.843+00:00

    Thanks for reaching out! Could you share more details on the structure of the data and the use case?

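    One thing you could do is run a fuzzy search with the edit distance set to 1 or 2 to find similar matches. An edit distance of 1 would catch both of your examples, since each differs from the stored number by a single digit. It gets tricky beyond that distance, though, because more and more unrelated numbers start to match. Also, running a fuzzy query before every insert would make ingestion pretty slow, because each new number has to be checked against the existing records first.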
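
    As a rough illustration of the idea above, here is a minimal Python sketch. It computes a plain Levenshtein edit distance for the example numbers from the question and builds a request body for a fuzzy query using the full Lucene syntax (a term followed by "~" and the distance). The field name "phone" and both function names are hypothetical, and the service's own distance metric may treat swapped adjacent digits differently from this plain Levenshtein version.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def build_fuzzy_query(number: str, max_distance: int = 1,
                      field: str = "phone") -> dict:
    """Build a search request body for a fuzzy match on one field.

    Assumption: the index has a searchable string field named `field`
    ("phone" is a made-up name) that holds the number as a single term.
    """
    if not 0 <= max_distance <= 2:
        raise ValueError("fuzzy edit distance must be 0, 1, or 2")
    digits = "".join(ch for ch in number if ch.isdigit())
    return {
        "search": f"{digits}~{max_distance}",  # e.g. "1234567899~1"
        "queryType": "full",                   # fuzzy needs full Lucene syntax
        "searchFields": field,
        "top": 10,
    }


existing = "1234567890"
for candidate in ("1234567899", "1234557890"):
    print(candidate, edit_distance(existing, candidate))

print(build_fuzzy_query("1234567899"))
```

    The dictionary would be sent as the body of a search request against the index. Any hits are candidates for "similar" numbers, which you could re-check locally with edit_distance before deciding whether to insert.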

    See this document on fuzzy search: Fuzzy search - Azure Cognitive Search | Microsoft Learn

    Checking for an exact phone number match might be easier if you make the number the ID (key) of the index. Otherwise, fuzzy search can help.

    Let us know.
