Using Tags in Semantic Search

Hardono Arifanto 0 Reputation points
2024-11-26T01:40:13.14+00:00

Hi, we're trying to use Azure Open AI search on our own sharepoint index to search for our organizational data. We have used the default mapping for the fields and have created one new field called "tags" to improve the searchability of certain documents. However, after implementing the tags field, and calling the azure open AI chat completions api, the results are the same as they would be without this new field. However, if we search directly on the index the search results are different and do in fact seem to make use of the tags field. Are there any other properties that we need to set when calling the Azure Open AI search API? Below is the index definition:

{
    "@odata.context": "https://url/$metadata#indexes/$entity",
    "@odata.etag": "\"0x8DD052B245C13A2\"",
    "name": "sharepoint-index-cis",
    "defaultScoringProfile": null,
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "searchable": false,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": true,
            "facetable": true,
            "key": true,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_name",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_path",
            "type": "Edm.String",
            "searchable": false,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_weburi",
            "type": "Edm.String",
            "searchable": false,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_content_type",
            "type": "Edm.String",
            "searchable": false,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": true,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_last_modified",
            "type": "Edm.DateTimeOffset",
            "searchable": false,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": true,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "metadata_spo_item_size",
            "type": "Edm.Int64",
            "searchable": false,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "content",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        },
        {
            "name": "tags",
            "type": "Collection(Edm.String)",
            "searchable": true,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": "standard.lucene",
            "normalizer": null,
            "dimensions": null,
            "vectorSearchProfile": null,
            "vectorEncoding": null,
            "synonymMaps": []
        }
    ],
    "scoringProfiles": [],
    "corsOptions": null,
    "suggesters": [],
    "analyzers": [],
    "normalizers": [],
    "tokenizers": [],
    "tokenFilters": [],
    "charFilters": [],
    "encryptionKey": null,
    "similarity": {
        "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
        "k1": null,
        "b": null
    },
    "semantic": {
        "defaultConfiguration": null,
        "configurations": [
            {
                "name": "sp-semantic",
                "prioritizedFields": {
                    "titleField": {
                        "fieldName": "metadata_spo_item_name"
                    },
                    "prioritizedContentFields": [
                        {
                            "fieldName": "content"
                        }
                    ],
                    "prioritizedKeywordsFields": [
                        {
                            "fieldName": "tags"
                        }
                    ]
                }
            }
        ]
    },
    "vectorSearch": null
}

When we call the https://url/openai/deployments/oai-gpt4o/chat/completions?api-version=2024-06-01 end point, and include the all_retrieved_documents, it uses the search query "authorisation limits for bad debt write-off" and returns Document A. 

However, if we search directly on the index using the https://url/indexes/sharepoint-index-cis/docs/search?api-version=2024-05-01-preview end point, for the same search query "authorisation limits for bad debt write-off", it returns Document B (which is correct because it has the authorisation limits tag)

Why does the Azure Open AI chat completions endpoint also not return Document B?

Azure AI Search
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
1,339 questions
Microsoft 365 and Office SharePoint For business Windows
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Chakaravarthi Rangarajan Bhargavi 1,115 Reputation points MVP
    2025-04-21T04:18:53.2433333+00:00

    Hi Hardono Arifanto,

    Welcome to the Microsoft Q&A forum. Thanks for your question.

    You're using Azure AI Search on a SharePoint index to search for organizational data, and you've introduced a custom "tags" field to improve searchability. However, you're encountering an issue where the Azure OpenAI Chat Completions API does not return Document B (which has the "authorization limits" tag), while searching directly on the SharePoint index returns the correct document.

    Below could be the possible reasons and solutions:

    1. Semantic Configuration Not Fully Honored by OpenAI API:

    You’ve configured semantic search in your SharePoint index with the tags field under prioritizedKeywordsFields. This field should be prioritized in searches. However, the Azure OpenAI Chat Completions API may not be fully honoring the semantic configuration when it retrieves documents.

    To address this, ensure that your API request is correctly using the semantic configuration and prioritizing the tags field.

    Semantic Search Overview

    1. Scoring Profiles Not Applied:

    In your index definition, there are no scoring profiles defined. Scoring profiles can be used to enhance the ranking of documents based on fields like "tags."

    Consider defining a scoring profile where the tags field is given higher weight to improve the search relevance and ensure that documents like Document B are ranked appropriately.

    Add Scoring Profiles in Azure AI Search

    1. Field Indexing Configuration:

    In your index definition, the tags field is marked as searchable and retrievable, which should make it available in search queries. Ensure that it’s indexed properly to allow effective search ranking.

    Learn how to define and configure fields in Azure AI Search

    1. Search Parameters in the OpenAI API Call:

    Double-check the parameters you’re using when calling the Azure OpenAI Chat Completions API. Specifically, ensure you're passing the correct parameters to make use of semantic search and that it uses the tags field when ranking documents.

    If necessary, refine your query to ensure it's configured to prioritize the tags field in the OpenAI API call.

    Ensure that your Azure AI Search and Azure OpenAI services are correctly integrated. You might need to explicitly reference fields like tags in your API call for accurate results.

    Integrate Azure AI Search with OpenAI API

    Exploring Advanced Tagging with Syntex:

    If you’re leveraging Microsoft Syntex for automatic metadata tagging, ensure that your tags are being applied correctly at the document level and are indexed in SharePoint for use in search.

    Overview of Taxonomy Tagging in Microsoft Syntex

    Optimizing Search Ranking:

    To further optimize search ranking, consider using a scoring profile and boosting the relevance of tags, content type, and other fields that are critical to your search scenario.

    Scoring Profiles in Azure AI Search

    Use Retrieval-Augmented Generation:

    Consider incorporating Retrieval-Augmented Generation (RAG) techniques to further refine your search. This combines the power of AI search with a generative model, helping provide more accurate and contextually relevant completions.

    Retrieval-Augmented Generation - Azure AI Foundry

    To resolve your issue, please ensure your semantic search configuration is being fully respected in the OpenAI API call or defining a scoring profile that prioritizes the tags field. or consider reviewing and adjusting your API search parameters to ensure correct results based on your SharePoint index and tags field.

    Once these steps are applied, your OpenAI API search should return Document B as expected, using the tags field to rank results more accurately.

    Feel free to ask if you need more assistance or specific configuration steps.

    Regards,

    Chakravarthi Rangarajan Bhargavi

    - If this answer helped, please click 'Yes' and accept the answer to help others in the community. Thank you! 😊

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.