Deleting documents from Azure Cognitive Search didn't free vector index size of the instance

Jingshu Chen (US) 0 Reputation points
2023-09-12T08:22:17.2466667+00:00

I tried to delete documents from the index using the code below. However, even after deleting about 1.75M documents, the vector index size (usage) of the instance didn't decrease. Is this the right way to delete documents to free up the space?

Thanks!

search_client.delete_documents(documents=[{'id':document_id}])

1 answer

  1. VenkateshDodda-MSFT 18,696 Reputation points Microsoft Employee
    2023-09-14T08:11:03.0466667+00:00

    @Jingshu Chen (US) Thanks for your patience on this.

    Yes, you are right that documents are first “soft deleted”. However, there’s no “retention policy”.

    This behavior is described in the vector search documentation:

    When a document with a vector field is either deleted or updated (updates are internally represented as a delete and insert operation), the underlying document is marked as deleted and skipped during subsequent queries. As new documents are indexed and the internal vector index grows, the system cleans up these deleted documents and reclaims the resources. This means you'll likely observe a lag between deleting documents and the underlying resources being freed.

    Feel free to reach back to me if you have any further questions on this.
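
As a minimal sketch of how the lag described in the answer could be observed, the helpers below send deletes in batches and then read the index statistics. The names `delete_in_batches` and `read_vector_index_size` are hypothetical, `search_client` is assumed to be an `azure.search.documents.SearchClient`, and `index_client` an `azure.search.documents.indexes.SearchIndexClient`. The `vector_index_size` key is assumed to be present only on SDK and service versions that report it, so the code returns `None` when it is missing. The batch size of 1000 reflects the per-request document limit for indexing operations.

```python
def delete_in_batches(search_client, ids, key_field="id", batch_size=1000):
    """Send delete actions in batches and return how many were sent.

    search_client: assumed to expose delete_documents(documents=[...]),
    as azure.search.documents.SearchClient does.
    key_field: name of the index key field ('id' in the question).
    """
    ids = list(ids)
    sent = 0
    for start in range(0, len(ids), batch_size):
        # Each delete action only needs the document key.
        batch = [{key_field: doc_id} for doc_id in ids[start:start + batch_size]]
        search_client.delete_documents(documents=batch)
        sent += len(batch)
    return sent


def read_vector_index_size(index_client, index_name):
    """Return the reported vector index size in bytes, or None if absent.

    index_client: assumed to expose get_index_statistics(index_name)
    returning a mapping, as SearchIndexClient does. The 'vector_index_size'
    key is an assumption about the SDK version in use.
    """
    stats = index_client.get_index_statistics(index_name)
    return stats.get("vector_index_size")
```

Comparing the value from `read_vector_index_size` before and after the deletes may show no change at first. Per the documentation quoted above, the space is reclaimed as new documents are indexed, so a decrease is expected only after further indexing activity.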