How to add the source document as a filter to a vector index

Question

Sunil Nagireddy 105

We have implemented a RAG framework in Azure AI Foundry and built a vector index on top of 250 source PDF files. We are using Prompt Flow in Azure AI Foundry to build some custom logic, for which we want to search the vector index with a filter on the source document name.

That is, we want the search to run only against the source file that is passed as a filter along with the user query.

We can see the index schema under the search service's index, but it does not let us mark the existing source file field as filterable; it only lets us add a new field as filterable. Because the schema is created automatically as part of index creation in Azure AI Foundry (Data + Indexes), we cannot select the source file name as a filter while creating the vector index.

Can you please provide detailed steps for creating a vector index with a filter on the source file name in Azure AI Foundry?

Shree Hima Bindu Maganti 4,775 Reputation points Microsoft External Staff Moderator

2025-06-09T17:55:03.37+00:00

Hi @Sunil Nagireddy,

To create a vector index with a filter based on the source file name in Azure AI Foundry:

Include a field for the source file name in your vector index. Define this field in your index schema as a filterable field. While the schema may be created automatically, you can manually define the fields during the index creation process.

If the source file name is not automatically included as a filterable field, add a new field specifically for this purpose. Set the field's attributes to allow filtering.

When querying the vector index, use the filter parameter to specify the source file name. This restricts the search results to documents that match the provided source file name.
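
As a minimal sketch of what such a filter expression might look like, the helper below builds an OData equality filter for one file name. The field name `sourceFileName` and the function name are assumptions for illustration; substitute whichever filterable field your index actually exposes. The only rule it relies on is that OData string literals are single-quoted and an embedded single quote is doubled.

```python
def source_file_filter(file_name: str, field: str = "sourceFileName") -> str:
    """Return an OData filter that matches one exact source file name.

    `field` is a hypothetical field name; it must be marked filterable
    in the index for the service to accept the filter.
    """
    # OData string literals use single quotes; an embedded quote is doubled.
    escaped = file_name.replace("'", "''")
    return f"{field} eq '{escaped}'"


if __name__ == "__main__":
    print(source_file_filter("MyDocument.pdf"))
    print(source_file_filter("Q1 'draft' report.pdf"))
```

Building the expression in one place keeps the escaping consistent when the file name comes from user input.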

After setting up your index and adding the necessary fields, perform a test query to ensure that the filtering works as expected. You should see results only from the specified source file.

For detailed instructions on creating and modifying indexes, refer to the Azure documentation on search indexes and filters.

References:

Add a filter in a vector query in Azure AI Search

Schema of a search index

Add search fields to an index

If you have any further queries, do let us know.

Sunil Nagireddy 105 Reputation points

2025-06-10T05:36:23.67+00:00

Thank you, Shree Hima Bindu Maganti. However, I do not see an option to select fields for filtering when I create a vector index in AI Foundry. Can you please tell me where this option is available when creating a vector index in AI Foundry?

Suresh Chikkam 2,135 Reputation points Microsoft External Staff Moderator

2025-06-11T07:06:01.29+00:00

Hi Sunil Nagireddy,

Following up to see if the above answer was helpful. If it answers your query, please click Accept Answer and Yes. If you have any further queries, do let us know.

Sunil Nagireddy 105 Reputation points

2025-06-12T03:11:13.49+00:00

Thank you, Suresh Chikkam, the answer is helpful. I was able to add a filter to the vector index; once I refreshed the ML job, it loaded the index successfully.

Accepted answer

Answer 1

Sunil Nagireddy, when you create a vector index in Azure AI Foundry, the schema is generated automatically. Metadata fields such as metadata_storage_name are not marked filterable, and Foundry’s UI doesn’t let you change that. To restrict searches to a particular PDF, you need to edit the underlying Azure AI Search (formerly Azure Cognitive Search) index schema so that it has a filterable file-name field, re-index your data, and then supply an OData filter in your vector query.

First, open your Azure AI Search resource in the portal, go to Indexes, find the index that Foundry created for your PDFs, and export its JSON definition. In that JSON, add a new field called sourceFileName (or edit the existing metadata_storage_name field) so that it reads:

{
  "name": "sourceFileName",
  "type": "Edm.String",
  "searchable": false,
  "filterable": true,
  "retrievable": true,
  "sortable": false,
  "facetable": false
}
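
If you would rather script the edit than change the JSON by hand, something along these lines could work. It is a sketch that assumes the exported definition has a top-level "fields" list; the helper name and the trimmed sample index are hypothetical. Note that, as far as I know, switching an existing field to filterable only takes effect once the index is recreated, whereas a brand-new field can be added in place.

```python
import copy
import json

# Field definition from the answer above.
FILE_NAME_FIELD = {
    "name": "sourceFileName",
    "type": "Edm.String",
    "searchable": False,
    "filterable": True,
    "retrievable": True,
    "sortable": False,
    "facetable": False,
}


def add_filterable_field(index_def: dict, field: dict = FILE_NAME_FIELD) -> dict:
    """Return a copy of index_def whose schema has a filterable `field`."""
    patched = copy.deepcopy(index_def)  # leave the caller's dict untouched
    for existing in patched["fields"]:
        if existing["name"] == field["name"]:
            # Field already present: just flip the attribute.
            existing["filterable"] = True
            return patched
    patched["fields"].append(dict(field))
    return patched


if __name__ == "__main__":
    # Hypothetical, heavily trimmed index definition for illustration only.
    sample = {
        "name": "your-index-name",
        "fields": [{"name": "id", "type": "Edm.String", "key": True}],
    }
    print(json.dumps(add_filterable_field(sample), indent=2))
```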

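If you prefer to reuse the built-in metadata field, simply set its "filterable" attribute to true. A new field can be added to the existing index in place, but changing an attribute on an existing field requires recreating the index: delete the old index, then create the new one under the same name. You can do this in the portal or through the search service's REST API, for example:

# Assumes SEARCH_API_KEY holds an admin key for your search service.
# Export the current index definition.
curl -s -H "api-key: $SEARCH_API_KEY" \
  "https://your-search-service.search.windows.net/indexes/your-index-name?api-version=2024-07-01" \
  > index.json
# edit index.json as above
# Create or update the index from the edited definition.
curl -s -X PUT \
  -H "api-key: $SEARCH_API_KEY" \
  -H "Content-Type: application/json" \
  --data @index.json \
  "https://your-search-service.search.windows.net/indexes/your-index-name?api-version=2024-07-01"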

Once the new schema is in place, rerun your indexer or re-execute the Foundry ingestion job so that every document chunk carries the sourceFileName value. You can confirm in Search Explorer that each document now shows the correct file name.
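
Besides eyeballing results in Search Explorer, you could run a quick check over a batch of returned documents to spot chunks that were not re-indexed. The sketch below assumes the documents arrive as plain dictionaries, as in a JSON search response; the helper name and the sample documents are hypothetical.

```python
def chunks_missing_file_name(documents, field="sourceFileName"):
    """Return the documents whose file-name field is absent or empty."""
    return [doc for doc in documents if not doc.get(field)]


if __name__ == "__main__":
    # Hypothetical sample results, for illustration only.
    results = [
        {"id": "1", "sourceFileName": "MyDocument.pdf"},
        {"id": "2", "sourceFileName": ""},
        {"id": "3"},
    ]
    for doc in chunks_missing_file_name(results):
        print("missing file name:", doc["id"])
```

Any document this reports would never match a file-name filter, so it is worth re-running ingestion until the list comes back empty.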

Finally, in your Prompt Flow’s vector-search (Index Lookup) step, paste an OData filter like:

sourceFileName eq 'MyDocument.pdf'

(or metadata_storage_name eq 'MyDocument.pdf' if you updated that field). This makes sure the search only matches embeddings from that specific PDF.
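
If you ever call the search service directly instead of going through the Index Lookup step, the same restriction can be expressed in the body of a Search Documents REST request. This is a sketch under assumptions: the vector field name `contentVector`, the helper name, and the toy query vector are hypothetical, and the body follows the request shape of the 2023-11-01 and later REST API versions (`vectorQueries`, `filter`, `vectorFilterMode`).

```python
import json


def build_filtered_vector_query(
    query_vector,
    file_name,
    vector_field="contentVector",  # assumed name of the embedding field
    name_field="sourceFileName",
    k=5,
):
    """Build a Search Documents request body limited to one source file."""
    # Double any single quote so the OData string literal stays valid.
    escaped = file_name.replace("'", "''")
    return {
        "filter": f"{name_field} eq '{escaped}'",
        # Apply the filter before the nearest-neighbour search runs.
        "vectorFilterMode": "preFilter",
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),
                "fields": vector_field,
                "k": k,
            }
        ],
    }


if __name__ == "__main__":
    # Toy 3-dimensional vector; a real one comes from your embedding model.
    body = build_filtered_vector_query([0.1, 0.2, 0.3], "MyDocument.pdf")
    print(json.dumps(body, indent=2))
```

The resulting body would be posted to the index's docs/search endpoint with the service's API key.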

Hope it helps!

Please do not forget to click "Accept the answer" and Yes wherever the information provided helps you; this can be beneficial to other community members.

If you have any other questions or are still running into issues, let me know in the comments and I would be happy to help you.

Suresh Chikkam 2,135 Reputation points Microsoft External Staff Moderator

2025-06-12T03:18:50.2+00:00

Sunil Nagireddy, following up to see if the above answer was helpful. If it answers your query, please click Accept Answer and Yes. If you have any further queries, do let us know.
