Azure Cognitive Search Still Indexes Deleted Blob Despite Using Native Soft Delete Method

Su Myat Hlaing 155 Reputation points
2024-07-17T12:34:17.9766667+00:00

I’m encountering an issue where a file I deleted from my Azure Storage account (using the Containers Explorer) is still appearing in my Azure Cognitive Search results. I’m using the native soft delete method as described in the Azure documentation.

Steps Taken:

  1. Deleted File: I deleted the file from Azure Blob Storage using the Containers Explorer.
  2. Indexer and Index Configuration: Followed the instructions to set up the native soft delete method. Recreated both the index and indexer as per the guidelines provided. Verified that the deletion detection strategy was applied from the initial indexer run.

Issue:

Despite these steps, the deleted file still appears in search results. According to the documentation, the deletion detection policy should be effective from the start of the indexer run. However, the file remains indexed and searchable.

Question:

  1. Retention Period: Could the retention period for soft delete be affecting the immediate removal of the file? How does this impact the deletion process and indexing?
  2. Other Considerations: Are there additional configurations or steps I might be missing to ensure that files are removed from the index immediately upon deletion?

Any insights or suggestions to resolve this issue would be greatly appreciated!

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

1 answer

  1. Nehruji R 7,801 Reputation points Microsoft Vendor
    2024-07-18T11:41:37.3766667+00:00

    Hello Su Myat Hlaing,

    Greetings! Welcome to Microsoft Q&A Platform.

    I understand that you have set up the native soft delete method for your Azure Cognitive Search. However, a few factors might be preventing the deleted file from being removed from your search results immediately:

    This can happen if no deletion detection policy is defined on the data source that your indexer reads from. Please refer to Changed and deleted blobs - Azure AI Search | Microsoft Learn.

    The deletion detection strategy should be applied from the very first indexer run. If you didn't establish the deletion policy prior to the initial run, any documents that were deleted before the policy was implemented will remain in your index, even if you add the policy to the indexer later and reset it. If this has occurred, it is suggested that you create a new index using a new indexer, ensuring the deletion policy is in place from the beginning.

    Native blob soft delete also has the following requirements:

    • Blobs must be in an Azure Blob Storage container. The Azure AI Search native blob soft delete policy isn't supported for blobs in ADLS Gen2 or Azure Files.
    • Soft delete for blobs must be enabled on the storage account.
    • Document keys for the documents in your index must be mapped to either a blob property or blob metadata, such as "metadata_storage_path".
    • You must use the REST API (api-version=2023-11-01 or newer), or the indexer Data Source configuration in the Azure portal, to configure support for soft delete.
    • Blob versioning must not be enabled in the storage account. Otherwise, native soft delete isn't supported by design.
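    As a rough illustration of the requirements above, the sketch below builds the JSON body of a blob data source that carries the native soft delete policy, together with the REST URL it would be sent to with PUT. The service name, data source name, container name, and connection string are placeholders, not values from this thread, and the snippet only builds and prints the request; it does not call the service.

```python
import json

API_VERSION = "2023-11-01"  # the version named in the requirements above


def build_blob_data_source(name, connection_string, container):
    """Return the JSON body for a blob data source with native soft delete."""
    return {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
        # The deletion detection policy lives on the data source definition.
        "dataDeletionDetectionPolicy": {
            "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
        },
    }


def data_source_url(service_name, data_source_name):
    """URL for creating or updating the data source (HTTP PUT)."""
    return (
        f"https://{service_name}.search.windows.net"
        f"/datasources/{data_source_name}?api-version={API_VERSION}"
    )


# All names below are hypothetical placeholders.
body = build_blob_data_source("my-blob-ds", "<storage-connection-string>", "my-container")
url = data_source_url("my-search-service", "my-blob-ds")
print("PUT", url)
print(json.dumps(body, indent=2))
```

    Comparing this shape against the data source definition shown in the portal (or returned by a GET on the same URL) is a quick way to confirm that the policy is really present.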

    By following this "soft delete" strategy, you can keep your search index synchronized with your data sources, ensuring that it only includes existing and relevant documents.

    Make sure your indexer is scheduled to run frequently enough to pick up changes, including deletions. A deleted blob stays in the index until the next indexer run processes it. If you need triggered (event-driven) indexing instead of a schedule, you will have to build it yourself, using either an Azure Logic App (refer to: Azure Logic App run trigger when Blob is added or modified - Stack Overflow) or an Azure Function (refer to: Azure Blob storage trigger for Azure Functions | Microsoft Learn).

    The trigger then calls the Push API (Data import and data ingestion - Azure Cognitive Search | Microsoft Learn) to update the index, so in effect you build your own indexer.
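    For the push approach, a minimal sketch of the request a trigger could send to remove documents is shown below. It assumes the index key field is called "id" and uses placeholder service, index, and key values; none of these come from this thread. It only builds the request and does not send it.

```python
import json


def build_delete_batch(key_field, keys):
    """Build a push-API batch that deletes the documents with the given keys."""
    return {"value": [{"@search.action": "delete", key_field: key} for key in keys]}


def docs_index_url(service_name, index_name, api_version="2023-11-01"):
    """URL the batch is POSTed to."""
    return (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/index?api-version={api_version}"
    )


# "id" and the key value are hypothetical; use your index's real key field and values.
batch = build_delete_batch("id", ["<document-key>"])
push_url = docs_index_url("my-search-service", "my-index")
print("POST", push_url)
print(json.dumps(batch, indent=2))
```

    The key sent in the batch has to match the key stored in the index exactly. If the key was originally produced by an indexer from an encoded "metadata_storage_path", the same encoded value would need to be sent.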

    Similar thread for reference - https://learn.microsoft.com/en-us/answers/questions/1388306/azure-cognitive-search-index-is-not-updating-the-d.

    The retention period for soft delete in Azure Blob Storage specifies how long a deleted blob remains recoverable before it is permanently deleted. It can range from 1 to 365 days. During this time the blob is marked as soft deleted but still exists in the storage account. The retention period does not by itself delay removal from the index: with the native soft delete policy, the document is removed the next time the indexer runs and processes the blob while it is in the soft-deleted state, so the file stays searchable until that run. For the same reason, the retention period should be comfortably longer than your indexer schedule interval; if the blob is permanently deleted before the indexer has processed it, the document will remain in the index.

    Check the indexer execution history for any errors or warnings that might indicate why the deleted file is still appearing in search results.
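    When checking the indexer for errors or warnings, the indexer status endpoint (GET /indexers/{name}/status) returns the last run's result as JSON. The sketch below condenses such a response into the fields worth looking at first. The sample response is hand-written for illustration, not real output, and the helper name is hypothetical.

```python
def summarize_indexer_status(status):
    """Condense an indexer status response into the fields relevant for troubleshooting."""
    last = status.get("lastResult") or {}
    return {
        "indexer_status": status.get("status"),
        "last_run_status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        "errors": [e.get("errorMessage") for e in last.get("errors") or []],
        "warnings": [w.get("message") for w in last.get("warnings") or []],
    }


# Illustrative sample only; a real response contains more fields.
sample_status = {
    "status": "running",
    "lastResult": {
        "status": "success",
        "itemsProcessed": 12,
        "itemsFailed": 0,
        "errors": [],
        "warnings": [{"key": "<document-key>", "message": "<warning text>"}],
    },
}

summary = summarize_indexer_status(sample_status)
for field, value in summary.items():
    print(f"{field}: {value}")
```

    A last run that reports success with zero items processed after a blob was deleted would suggest that the indexer did not see the deletion, which points back to the data source policy and the requirements listed earlier.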

    Hope this information helps! Please let us know if you have any further queries. I’m happy to assist you further.


    Please “Accept the answer” and “up-vote” wherever the information provided helps you; this can be beneficial to other community members.

