Accent sensitive search in fulltext catalog problem

Amila Samarakoon 1 Reputation point
2020-10-08T08:31:22.98+00:00

I am currently working on full text search functionality of web application and use accent sensitive search. I am using MSSQL server and create full text catalog for that. and Swedish is the language that use to word break. My problem is when i search "blodstatu", the result was "B-Blodstatus" but if i search "blodstatus" the result was empty. is there any solution for that

Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,651 questions
{count} votes

3 answers

Sort by: Most helpful
  1. EchoLiu-MSFT 14,581 Reputation points
    2020-10-08T08:59:33.763+00:00

    Hi @Amila Samarakoon ,

    If you use the CONTAINS keyword when searching, please refer to the related document CONTAINS (Transact-SQL). There are several examples at the end of the document, you can refer to.

    The following example may be useful to you:

    --returns all products that contain either the phrase Mountain or Road  
    SELECT Name    
    FROM Production.Product    
    WHERE CONTAINS(Name, ' Mountain OR Road ');   
      
    --The following example returns all product names with at least one word starting with the prefix chain in the Name column  
    SELECT Name    
    FROM Production.Product    
    WHERE CONTAINS(Name, ' "Chain*" ');    
    

    If you have any question, please feel free to let me know.
    If the response is helpful, please click "Accept Answer" and upvote it.

    Regards
    Echo


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentationto enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments

  2. Olaf Helper 44,936 Reputation points
    2020-10-08T10:19:00.073+00:00

    Is the FT index already completely populated and does it contain that word?
    The FTS parser returns a well result

    select * from sys.dm_fts_parser('blodstatu', 1053, 0, 1);
    select * from sys.dm_fts_parser('blodstatus', 1053, 0, 1);
    
    0 comments No comments

  3. Erland Sommarskog 112.6K Reputation points MVP
    2020-10-10T18:59:21.363+00:00

    I did some research, and I can't say that I know exactly what is going on.

    Searching for "blodstatu" with CONTAINS gave no hits at all, but searching with FREETEXT, I got a hit on "blodstatus" alone as well as on "B-Blodstatus". When I searched for "blodstatus" with CONTAINS, I only got a hit for "blodstatus", but not "B-Blodstatus". With FREETEXT, I got a hit on both. Interesting enough, when I set the language to English, I got a hit on both "B-Blodstatus" and "blodstatus" when searching for "blodstatus".

    The fact that "blodstatu" yields a hit may seem strange, but -s is a normal Swedish genitive form, so the word breaker goes with that. ("blodstatu" is not a word, "blodstatus" is the same as "blood status" in English.

    My script is below. It includes a few more words that I tested.

    CREATE TABLE MerTextÅtFolket(id int NOT NULL IDENTITY ,
                                 CONSTRAINT pk PRIMARY KEY(id),
                                 text nvarchar(MAX) NOT NULL)
    go
    INSERT MerTextÅtFolket(text)
      VALUES('Det finns de som talar om B-Blodstatus, men inte tittar åt vänster.'),
            ('Sedan finns det de som tittar åt höger, men talar om kaffesumpar.'),
            ('Ytterligare andra tittar rakt fram och ser en maskros flyga'),
            ('Sedan finns det ju annan blodstatus också'),
            ('Vi får nog överlägga om saken och återkomma'),
            ('Detta är något för getens vänner att fundera över.'),
            ('Vi ska inte förneka lingons förmåga att överleva i vildmarken'),
            ('Detta är en vedervärdigt dålig C-maskros! Det tycker alla getter.')
    go
    CREATE FULLTEXT CATALOG fulltext WITH ACCENT_SENSITIVITY = ON AS DEFAULT
    CREATE FULLTEXT INDEX ON MerTextÅtFolket(text LANGUAGE Swedish) KEY INDEX pk
    go
    WAITFOR DELAY '00:00:02'
    go
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'blodstatu')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'blodstatus')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'B-blodstatus')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'maskros')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'maskro')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'get')
    SELECT * FROM MerTextÅtFolket WHERE CONTAINS(text, 'lingon')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'blodstatu')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'blodstatus')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'B-blodstatus')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'maskros')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'maskro')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'get')
    SELECT * FROM MerTextÅtFolket WHERE FREETEXT(text, 'lingon')
    SELECT * FROM sys.dm_fts_index_keywords_by_document(db_id(), object_id('MerTextÅtFolket'))
    go
    DROP FULLTEXT INDEX ON MerTextÅtFolket
    DROP FULLTEXT CATALOG fulltext
    DROP TABLE MerTextÅtFolket
    
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.