Synonym doesn't work with complex field

chris.c.zhao 25 Reputation points
2024-03-06T22:50:10.77+00:00

Synonyms don't work with a complex field.

Here is my field definition:

fields = [
    SimpleField(name="Doc_ID", type=SearchFieldDataType.String, key=True),
    SearchableField(name="Title", type=SearchFieldDataType.String, analyzer_name="en.microsoft", sortable=True, facetable=True, synonym_map_names=[synonym_map_name]),
    SearchableField(name="Doc_Type_Cd", type=SearchFieldDataType.String, analyzer_name="en.microsoft", sortable=True, facetable=True, synonym_map_names=[synonym_map_name]),
    ComplexField(name="Ocr_Merged", collection=True, fields=[
        SearchableField(name="PageNum", type=SearchFieldDataType.String),
        SearchableField(name="PageText", type=SearchFieldDataType.String, analyzer_name=analyzer_name, synonymMaps=[synonym_map_name]),
    ]),
]

Here is the synonym map definition:

['AFFILIATED,pig', 'COR,CHK,hello kitty', 'COR,PRD=>COR,', 'USA, United States, United States of America', 'Department,dept,dept.', 'Department,dept,dept.=>Department', '']

The rules

'USA, United States, United States of America', 'Department,dept,dept.', 'Department,dept,dept.=>Department'

are for the complex field Ocr_Merged/PageText. They don't work.


'COR,CHK,hello kitty': hello kitty doesn't work for the field Doc_Type_Cd, but chk works as a synonym of cor.


'AFFILIATED,pig' works for the field Title.

Is there anything special needed to configure synonyms for a complex field? Also, why doesn't hello kitty work for a non-complex field?
 

Azure AI Search

1 answer

  1. Grmacjon-MSFT 19,151 Reputation points Moderator
    2024-04-09T20:33:21.22+00:00

    Hello @chris.c.zhao, without more details about your setup, here is a general breakdown of how to approach this for both complex and non-complex fields:

    Complex Field (Ocr_Merged/PageText):

    One possible explanation is that the synonyms are not being matched against the text in the complex field. "United States" and "United States of America" are multi-word terms, so if they are split into separate words during analysis, the synonym map may not be able to match them.

    Possible Solution:

    1. Pre-process text during indexing: If you have control over the indexing process, consider pre-processing the text before indexing the complex field. You can achieve this by:
       - Using a custom skill in your indexer to perform synonym replacements before storing the text in the "PageText" field.
       - Modifying your data source to combine these variations into a single term before sending it to Azure AI Search.

    2. Use hit highlighting (optional): If pre-processing isn't feasible, Azure AI Search offers hit highlighting during search. You can define "USA," "United States," and "United States of America" as synonyms and use highlighting to visually emphasize the matched terms in search results.
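
One more thing worth checking in the posted definition: the top-level fields pass synonym_map_names=, while the PageText sub-field passes synonymMaps=. In the REST API the field property is called synonymMaps, and in the Python SDK the SearchableField keyword is synonym_map_names. Whether the SDK silently drops an unrecognized keyword is an assumption here and has not been confirmed in this thread. The sketch below builds the index definition in REST JSON shape with plain dictionaries, so it is easy to see which fields and sub-fields actually carry a synonym map. The index name, the map name, and the "en.microsoft" analyzer on PageText are placeholders, and fields_with_synonyms is a hypothetical helper.

```python
import json

SYNONYM_MAP_NAME = "my-synonym-map"  # placeholder, not from the original post


def build_index_payload(index_name, synonym_map_name):
    """Return an index definition in REST JSON shape (sketch only)."""
    return {
        "name": index_name,
        "fields": [
            {"name": "Doc_ID", "type": "Edm.String", "key": True},
            {
                "name": "Title",
                "type": "Edm.String",
                "searchable": True,
                "analyzer": "en.microsoft",
                "synonymMaps": [synonym_map_name],
            },
            {
                "name": "Ocr_Merged",
                "type": "Collection(Edm.ComplexType)",
                "fields": [
                    {"name": "PageNum", "type": "Edm.String", "searchable": True},
                    {
                        "name": "PageText",
                        "type": "Edm.String",
                        "searchable": True,
                        "analyzer": "en.microsoft",  # assumed analyzer
                        "synonymMaps": [synonym_map_name],
                    },
                ],
            },
        ],
    }


def fields_with_synonyms(fields, prefix=""):
    """Yield the path of every field or sub-field that has a synonym map."""
    for field in fields:
        path = prefix + field["name"]
        if field.get("synonymMaps"):
            yield path
        # Recurse into sub-fields of complex types, using slash-separated paths.
        yield from fields_with_synonyms(field.get("fields", []), path + "/")


payload = build_index_payload("docs-index", SYNONYM_MAP_NAME)
print(json.dumps(payload, indent=2))
print(list(fields_with_synonyms(payload["fields"])))
```

Running the same check against the index definition fetched back from the service would show whether the map actually reached Ocr_Merged/PageText.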

    Non-Complex Field (Doc_Type_Cd):

    The issue with "hello kitty" not working as a synonym for "COR" in the "Doc_Type_Cd" field might be due to a few reasons:

    1. Typos: Double-check the synonym map definition for typos, stray characters, and empty rules. The list you posted ends with an empty string, and the rule 'COR,PRD=>COR,' has a trailing comma.
    2. Case sensitivity: Check whether casing affects matching in your setup. You report that chk already works as a synonym of cor, so casing is probably not the cause here.
    3. Analyzer impact: The "en.microsoft" analyzer might be splitting "hello kitty" into separate tokens ("hello" and "kitty"), which could prevent the multi-word synonym from matching. Try querying it as a quoted phrase, or consider a custom analyzer that preserves phrases like "hello kitty".
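
The typo check above can be sketched in a few lines. Azure AI Search synonym maps use the Apache Solr format, with one rule per line. The helpers below (clean_rule and build_synonym_string are hypothetical names) trim whitespace, drop empty terms and empty rules, and join what is left with newlines. The rule list is copied from the question; this is a sketch of the clean-up step only and does not upload anything.

```python
def clean_rule(rule):
    """Normalize one Solr-format rule; return None if nothing usable remains."""
    sides = rule.split("=>")
    if len(sides) > 2:
        return None  # more than one "=>" is not a valid rule
    cleaned_sides = []
    for side in sides:
        # Split on commas, trim each term, and drop empty terms
        # (this removes trailing commas such as in 'COR,PRD=>COR,').
        terms = [term.strip() for term in side.split(",")]
        terms = [term for term in terms if term]
        if not terms:
            return None
        cleaned_sides.append(", ".join(terms))
    return " => ".join(cleaned_sides)


def build_synonym_string(rules):
    """Join the cleaned rules with newlines, one rule per line."""
    cleaned = (clean_rule(rule) for rule in rules)
    return "\n".join(rule for rule in cleaned if rule is not None)


rules = [
    "AFFILIATED,pig",
    "COR,CHK,hello kitty",
    "COR,PRD=>COR,",
    "USA, United States, United States of America",
    "Department,dept,dept.",
    "Department,dept,dept.=>Department",
    "",
]

synonym_string = build_synonym_string(rules)
print(synonym_string)
```

Multi-word entries such as hello kitty are kept as written, since the Solr format allows them inside a rule.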

    Lastly, I recommend using the Azure AI Search REST API or a search client library to test your synonym maps with various queries and data samples. Also, review Azure AI Search query logs to identify any potential issues with synonym matching or unexpected behavior.
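
As a starting point for that testing, the sketch below builds Search Documents request bodies for the REST API without sending them. The service name, index name, and api-version value are placeholders, and build_query is a hypothetical helper. searchFields restricts each query to one field, and a sub-field of a complex collection is addressed with a slash path such as Ocr_Merged/PageText. A multi-word term is wrapped in quotes so that it is sent as a phrase.

```python
import json

API_VERSION = "2023-11-01"  # assumed API version; adjust to your service


def search_url(service, index):
    """Build the Search Documents (POST) endpoint for a service and index."""
    return (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/search?api-version={API_VERSION}"
    )


def build_query(term, field):
    """Build a request body that searches a single field for one term."""
    # Quote multi-word terms so they are treated as a phrase.
    text = f'"{term}"' if " " in term else term
    return {"search": text, "searchFields": field, "select": "Doc_ID", "count": True}


# One test query per synonym rule from the question.
cases = [
    ("USA", "Ocr_Merged/PageText"),
    ("dept", "Ocr_Merged/PageText"),
    ("hello kitty", "Doc_Type_Cd"),
    ("pig", "Title"),
]

for term, field in cases:
    print(search_url("my-service", "docs-index"))
    print(json.dumps(build_query(term, field)))
```

Comparing the document counts for each term against the counts for its expected synonyms shows which rules are taking effect on which field.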

    Hope that helps.

    -Grace

