Azure OpenAI Response Citations Not Matching Indexed Content in Azure AI Search

Sree Krishna Suresh 0 Reputation points
2024-11-05T18:44:05.67+00:00

When using the Azure OpenAI On Your Data feature, the citations returned do not match the content of the source documents indexed in Azure AI Search. HTML tables are indexed along with the text, and the discrepancy shows up at query time: for chunks that contain HTML tags, the citation text differs from the indexed content.

For example:

Azure OpenAI response citations -

Company
Location
Products

Xyz, Inc.
Brea, California
Bondstrand 2000

ABC Chemicals, Inc.
Chicago,

Azure AI Search response -

<table>
<tr>
<th>Company</th>
<th>Location</th>
<th>Products</th>
</tr>
<tr>
<td>Xyz, Inc.</td>
<td>Brea, California</td>
<td>Bondstrand 2000</td>
</tr>
<tr>
<td>ABC Chemicals, Inc.</td>
<td>Chicago,

Both responses refer to the same chunk of text. What could cause this inconsistency?


1 answer

  1. Shree Hima Bindu Maganti 1,160 Reputation points Microsoft Vendor
    2024-11-25T08:43:32.0766667+00:00

    Hi @Sree Krishna Suresh ,
    Welcome to the Microsoft Q&A Platform!
    When using Azure OpenAI on your data, the citations can mismatch because of differences in how Azure AI Search and OpenAI process the same data.

    1. Azure AI Search may index data exactly as it appears (e.g., keeping <table> tags).
    2. OpenAI processes the data differently, often converting it to plain text for easier understanding. This causes the same content to "look" different in the two systems.
    3. Azure AI Search divides content into small chunks for storage.
    4. If the chunk sizes or boundaries differ between Azure AI Search and OpenAI’s processing, the systems may refer to slightly different parts of the document.
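
One way to check whether the two outputs in the question really are the same chunk is to strip the markup from the search result and compare the remaining text with the citation. The sketch below uses only Python's standard `html.parser` module. The class `TextExtractor`, the function `html_to_lines`, and the two sample strings are illustrative names and data taken from the question, not part of either Azure service, and this does not claim to reproduce how the service itself converts content.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Skip whitespace-only text such as the newlines between tags.
        text = data.strip()
        if text:
            self.parts.append(text)


def html_to_lines(fragment: str) -> list[str]:
    """Return the non-empty text pieces of an HTML fragment, in order."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text left at the end of a truncated chunk
    return parser.parts


# Chunk as returned by Azure AI Search (truncated exactly as in the question).
search_chunk = (
    "<table>\n<tr>\n<th>Company</th>\n<th>Location</th>\n<th>Products</th>\n</tr>\n"
    "<tr>\n<td>Xyz, Inc.</td>\n<td>Brea, California</td>\n<td>Bondstrand 2000</td>\n</tr>\n"
    "<tr>\n<td>ABC Chemicals, Inc.</td>\n<td>Chicago,"
)

# Citation text as returned by Azure OpenAI (from the question).
citation_text = (
    "Company\nLocation\nProducts\n\n"
    "Xyz, Inc.\nBrea, California\nBondstrand 2000\n\n"
    "ABC Chemicals, Inc.\nChicago,"
)

citation_lines = [line.strip() for line in citation_text.splitlines() if line.strip()]
same_content = html_to_lines(search_chunk) == citation_lines
print("Same text once tags are removed:", same_content)
```

If the comparison holds for your own chunks, the difference is only in markup, which is consistent with points 1 and 2 above rather than with a chunk-boundary problem.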

    Steps to resolve the citation mismatch:

    1. Make sure the same version of the data (e.g., plain text without HTML tags) is used by both Azure AI Search and OpenAI.
    2. Adjust how Azure AI Search chunks data, so it matches how OpenAI processes content.
      https://learn.microsoft.com/en-us/azure/ai-services/openai/azure-government
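
As a rough illustration of the two steps above, the following sketch flattens an HTML table into plain text before it is indexed and then groups the resulting rows into chunks without cutting a row in half (the sample in the question is cut off mid-row at "Chicago,"). It is a minimal sketch under stated assumptions: `TableFlattener`, `table_to_lines`, `chunk_lines`, the ` | ` separator, and the `max_chars` limit are all hypothetical choices for this example, and the code runs as a pre-processing step on your side, not inside Azure AI Search or Azure OpenAI.

```python
from html.parser import HTMLParser


class TableFlattener(HTMLParser):
    """Hypothetical helper: turns <tr>/<th>/<td> markup into rows of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so each cell is a single clean string.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_lines(html: str) -> list[str]:
    """Return one plain-text line per completed table row."""
    flattener = TableFlattener()
    flattener.feed(html)
    flattener.close()
    return [" | ".join(row) for row in flattener.rows]


def chunk_lines(lines: list[str], max_chars: int) -> list[str]:
    """Group whole lines into chunks of at most max_chars where possible."""
    chunks, current = [], ""
    for line in lines:
        candidate = line if not current else current + "\n" + line
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a row boundary
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


table_html = (
    "<table><tr><th>Company</th><th>Location</th><th>Products</th></tr>"
    "<tr><td>Xyz, Inc.</td><td>Brea, California</td><td>Bondstrand 2000</td></tr></table>"
)
plain_lines = table_to_lines(table_html)
plain_chunks = chunk_lines(plain_lines, max_chars=50)
```

Indexing the flattened text means the search result and the citation are built from the same plain-text representation, and splitting on row boundaries keeps each citation aligned with complete rows. A single row longer than `max_chars` is kept whole in its own chunk in this sketch.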