Best practices to handle RBAC or some form of control at document level for Azure AI Search Index

Siddhant Kumta 225 Reputation points
2025-12-22T16:05:44.27+00:00

Hello,

My current pipeline uploads documents to an Azure Storage container, sends them to Document Intelligence for extraction, and then indexes them into an AI Search index. What I want to learn is how to handle access control at the document level, so that I can set permissions for individual documents. The goal is that certain users have access only to certain documents in the index and can search only those. I want to know where in the process it would be best to implement these filters, or whether this is even possible. I know that some sort of RBAC is being added and is in its early stages.

Azure AI Search

An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.


Answer accepted by question author

  1. Marcin Policht 89,490 Reputation points MVP Volunteer Moderator
    2025-12-22T17:47:22.7466667+00:00

    What you’re describing is a very common requirement for enterprise search and RAG systems, and today it’s typically solved with application-level document security layered on top of Azure AI Search, rather than relying on native RBAC inside the search service.

    At a high level, the right place to enforce access control is not in Document Intelligence, and not solely at query time inside the index, but in how you enrich and index documents combined with how your app issues queries. Extraction can stay exactly as it is. The key is to carry security metadata from your source system all the way into the index, and then always filter on it when querying.

    The usual pattern is to treat access control as just more fields on each search document. When you ingest a file from Blob Storage, you determine who should be allowed to see it, for example Microsoft Entra ID (formerly Azure AD) object IDs for users and groups, tenant roles, department codes, or some custom ACL model. You then write those values into fields on the search document such as allowedUsers and allowedGroups. These fields are marked as filterable in the index schema. Document Intelligence extracts content, but your indexing step is where you attach the permissions.
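
    As a rough sketch of what such a schema could look like, the snippet below builds an index definition shaped like the body of the Azure AI Search "Create Index" REST call. The index name, the `id` and `content` fields, and the `security_fields` helper are assumptions made for illustration; only allowedUsers and allowedGroups come from the description above. Marking the permission fields as non-retrievable is an extra design choice here, so that ACLs are never returned in search results.

```python
import json

# Hypothetical index definition, shaped like a "Create Index" REST request body.
# Index name and the id/content fields are illustrative, not from the original answer.
index_definition = {
    "name": "documents-secured",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {
            "name": "allowedUsers",
            "type": "Collection(Edm.String)",
            "filterable": True,     # required so queries can filter on it
            "searchable": False,    # IDs should not take part in full-text search
            "retrievable": False,   # assumption: do not leak ACLs in results
        },
        {
            "name": "allowedGroups",
            "type": "Collection(Edm.String)",
            "filterable": True,
            "searchable": False,
            "retrievable": False,
        },
    ],
}


def security_fields(definition: dict) -> list:
    """Return the names of filterable string-collection fields (the ACL fields)."""
    return [
        field["name"]
        for field in definition["fields"]
        if field["type"] == "Collection(Edm.String)" and field.get("filterable")
    ]


if __name__ == "__main__":
    print(json.dumps(index_definition, indent=2))
    print(security_fields(index_definition))
```

    The resulting JSON could be sent to the service with whichever client you already use; nothing in this sketch calls Azure.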

    Once permissions are in the index, every query from your app includes a filter clause based on the caller’s identity. Your application authenticates the user with Entra ID, resolves their user ID and group memberships, and then builds a filter like allowedUsers/any(u: u eq '{userObjectId}') or allowedGroups/any(g: search.in(g, '{commaSeparatedGroupIds}')). Azure AI Search applies that filter before scoring, so users only ever see documents they are entitled to. This approach scales well and is how most production systems do per-document security today.
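
    A minimal sketch of building that filter string in application code might look like the following. The function name `build_security_filter` is hypothetical, and the sketch assumes object IDs are GUIDs stored in lowercase, hyphenated form in the index. Parsing each ID with `uuid.UUID` before interpolating it is one way to make sure nothing other than a GUID ends up inside the OData expression.

```python
import uuid
from typing import Iterable


def build_security_filter(user_object_id: str, group_ids: Iterable[str]) -> str:
    """Build an OData $filter restricting results to documents the caller may see.

    Assumes allowedUsers / allowedGroups hold lowercase, hyphenated GUIDs.
    Raises ValueError if any ID is not a valid GUID, which also blocks
    filter injection through a crafted ID string.
    """
    user = str(uuid.UUID(user_object_id))
    groups = [str(uuid.UUID(g)) for g in group_ids]

    clauses = [f"allowedUsers/any(u: u eq '{user}')"]
    if groups:
        # search.in splits the value list on commas (and spaces) by default.
        joined = ",".join(groups)
        clauses.append(f"allowedGroups/any(g: search.in(g, '{joined}'))")
    return " or ".join(clauses)


if __name__ == "__main__":
    print(
        build_security_filter(
            "11111111-1111-1111-1111-111111111111",
            ["22222222-2222-2222-2222-222222222222"],
        )
    )
```

    If the app also accepts its own filters (date ranges, categories), the security clause would typically be wrapped in parentheses and combined with `and`, so that a user-supplied filter can never widen access.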

    That means the best place in your pipeline to implement this is during indexing, after extraction but before pushing into the index. The blob container can also have its own ACLs, but those are not automatically enforced by search. Think of Blob as your source of truth and AI Search as a projection that must carry forward the same access rules.
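
    To illustrate that indexing step, here is a small sketch that combines extracted text with permissions into a single search document. The function name, the `id` and `content` field names, and the way the ACL is passed in are all assumptions; how you actually look up who may see a blob depends on your source system. The key is derived with URL-safe Base64 because blob paths usually contain characters that are not allowed in document keys.

```python
import base64
from typing import Iterable


def build_search_document(
    blob_path: str,
    extracted_text: str,
    allowed_users: Iterable[str] = (),
    allowed_groups: Iterable[str] = (),
) -> dict:
    """Combine extracted content with its ACL into one search document.

    Hypothetical helper: field names mirror the allowedUsers / allowedGroups
    convention. If no ACL is supplied, both lists are empty, so an any()
    filter matches nobody and the document stays hidden (fail closed).
    """
    # URL-safe Base64 yields only letters, digits, '-', '_' and '='.
    key = base64.urlsafe_b64encode(blob_path.encode("utf-8")).decode("ascii")
    return {
        "id": key,
        "content": extracted_text,
        # Lowercase and de-duplicate so query-time comparisons are consistent.
        "allowedUsers": sorted({u.lower() for u in allowed_users}),
        "allowedGroups": sorted({g.lower() for g in allowed_groups}),
    }


if __name__ == "__main__":
    doc = build_search_document(
        "contracts/2025/nda.pdf",
        "Text returned by the extraction step...",
        allowed_groups=["22222222-2222-2222-2222-222222222222"],
    )
    print(doc)
```

    Because the index is only a projection, a permission change on the source would need to trigger a re-index (or a merge update of just the ACL fields) for the affected documents.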

    Regarding document tagging, your app can map users to allowed tags and add them as filters at query time. This is simpler than full ACL lists if your org already uses classification labels. A custom database for user-to-role or user-to-tag mapping can also be used, especially when you need logic that doesn’t live in Entra ID groups.
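
    The tag-based variant could be sketched as below. Everything here is hypothetical: the `classification` field name, the in-memory `USER_TAGS` mapping (standing in for a custom database), and the `build_tag_filter` helper. It assumes each document carries a single filterable `classification` string, and it uses the three-argument form of `search.in` with `|` as the delimiter so that tags containing spaces or commas still match as whole values.

```python
from typing import Dict, List

# Hypothetical user-to-tag mapping; in practice this might live in a database.
USER_TAGS: Dict[str, List[str]] = {
    "alice@example.com": ["finance", "hr"],
    "bob@example.com": ["public"],
}


def build_tag_filter(user_id: str, user_tags: Dict[str, List[str]]) -> str:
    """Build an OData $filter limiting results to the user's allowed tags.

    Raises PermissionError when the user has no tags, so the caller cannot
    accidentally run an unfiltered query (fail closed).
    """
    tags = user_tags.get(user_id, [])
    if not tags:
        raise PermissionError(f"no tags assigned to {user_id!r}")
    for tag in tags:
        if "|" in tag:
            raise ValueError(f"tag {tag!r} contains the delimiter '|'")
    # OData string literals escape a single quote by doubling it.
    joined = "|".join(tag.replace("'", "''") for tag in tags)
    return f"search.in(classification, '{joined}', '|')"


if __name__ == "__main__":
    print(build_tag_filter("alice@example.com", USER_TAGS))
```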

    Regarding native RBAC in Azure AI Search, as far as I know it currently doesn’t provide per-document access control.


    hth

    Marcin
