Hi @Aravind Vijay,
Currently, Azure Cognitive Search returns only a limited number of top documents per query, ranked by relevance (50 by default, and at most 1,000 per request via the top parameter). There is no built-in way for your Streamlit chatbot to pull more than the top N documents in a single query. However, you can implement pagination to retrieve additional documents beyond the initial set. This involves making successive search queries with appropriate skip and top parameters to navigate through the result set. By aggregating these results, you can provide your chatbot with a broader context. For details, see https://learn.microsoft.com/en-us/azure/search/search-pagination-page-layout
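As a rough illustration of the pagination idea, here is a minimal sketch. It assumes client is a SearchClient (or any object with a compatible search method); the function name fetch_documents and the page_size and max_docs defaults are hypothetical choices for this example, not part of the SDK.

```python
from typing import Any, Dict, List


def fetch_documents(client: Any, query: str, page_size: int = 50,
                    max_docs: int = 200) -> List[Dict[str, Any]]:
    """Collect up to max_docs results by issuing successive skip/top queries.

    fetch_documents, page_size and max_docs are hypothetical names for this sketch.
    """
    collected: List[Dict[str, Any]] = []
    skip = 0
    while len(collected) < max_docs:
        # Never request more than what is still needed
        top = min(page_size, max_docs - len(collected))
        page = list(client.search(search_text=query, top=top, skip=skip))
        collected.extend(page)
        if len(page) < top:
            break  # Fewer results than requested: the result set is exhausted
        skip += top
    return collected
```

You could then pass the aggregated list (for example, fetch_documents(search_client, user_question)) to your chatbot as context. Keep max_docs modest, since every extra page adds latency and tokens to the prompt.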
To remove documents where the date_ymd field matches a specific date, you'll need to perform a two-step process. Since Azure Cognitive Search requires the document's key field (e.g., hardware_id) for deletion, you must first query the index to obtain the keys of documents matching your date_ymd criteria. Once you have the list of keys, you can issue delete operations for those specific documents.
Here's how you can implement this in Python using the Azure Search SDK:
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Initialize the SearchClient
service_endpoint = "https://<your-service-name>.search.windows.net"
index_name = "<your-index-name>"
api_key = "<your-api-key>"
search_client = SearchClient(service_endpoint, index_name, AzureKeyCredential(api_key))


def delete_documents_by_date(date_ymd):
    # Step 1: Retrieve the keys of documents with the specified date_ymd
    # (the date_ymd field must be filterable in the index)
    filter_expression = f"date_ymd eq '{date_ymd}'"
    results = search_client.search(search_text="", filter=filter_expression, select=["hardware_id"])

    # Step 2: Collect the keys of the documents to be deleted
    documents_to_delete = [{"hardware_id": doc["hardware_id"]} for doc in results]

    # Step 3: Delete the documents in batches
    if documents_to_delete:
        batch_size = 1000  # Maximum number of documents per indexing request
        for i in range(0, len(documents_to_delete), batch_size):
            batch = documents_to_delete[i:i + batch_size]
            # delete_documents sends a delete action for every key in the batch;
            # upload_documents would re-upload the documents instead of deleting them
            outcome = search_client.delete_documents(documents=batch)
            succeeded = sum(1 for item in outcome if item.succeeded)
            print(f"Deleted {succeeded} of {len(batch)} documents in this batch.")
    else:
        print("No documents found with the specified date.")
Since your date_ymd field is of string type, make sure that the format of the date in your query is identical to the format stored in your index (for example, 'YYYY-MM-DD'), because string filters only match exact values. Azure Search also imposes batch size limits: a single indexing request can contain at most 1,000 documents, so run large deletions in batches of that size or smaller. Finally, deletions may not be visible immediately. There can be a slight delay before the changes are reflected in query results and document counts.
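To guard against format mismatches, you could normalize the date before building the filter. The sketch below assumes your index stores dates as 'YYYY-MM-DD' strings; build_date_filter is a hypothetical helper name, not an SDK function, so adjust the format string if your index uses a different layout.

```python
from datetime import date, datetime
from typing import Union


def build_date_filter(value: Union[str, date], field: str = "date_ymd") -> str:
    """Return an OData filter matching `field` against a YYYY-MM-DD string.

    Assumes the index stores dates as 'YYYY-MM-DD'; hypothetical helper.
    """
    if isinstance(value, date):  # also covers datetime, a subclass of date
        normalized = value.strftime("%Y-%m-%d")
    else:
        # Raises ValueError if the string is not a valid year-month-day date
        normalized = datetime.strptime(value, "%Y-%m-%d").strftime("%Y-%m-%d")
    return f"{field} eq '{normalized}'"
```

For example, build_date_filter("2024-1-5") produces the zero-padded filter date_ymd eq '2024-01-05', and an input such as "05/01/2024" raises a ValueError instead of silently matching nothing.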
Refer to the Azure AI Search documentation on adding, updating, or deleting documents for better understanding.
I hope this information helps you resolve the issue. If you have any further concerns or questions, please feel free to reach out to us.