Get started with chat document security for Python
When you build a chat application using the RAG pattern with your own data, make sure that each user receives an answer based on their permissions. Follow the process in this article to add document access control to your chat app.
An authorized user should have access to answers contained within the documents of the chat app.
An unauthorized user shouldn't have access to answers from secured documents they don't have authorization to see.
Note
This article uses one or more AI app templates as the basis for the examples and guidance in the article. AI app templates provide you with well-maintained, easy to deploy reference implementations that help to ensure a high-quality starting point for your AI apps.
Architectural overview
Without the document security feature, the enterprise chat app has a simple architecture using Azure AI Search and Azure OpenAI. An answer is determined from queries to Azure AI Search, where the documents are stored, in combination with a response from an Azure OpenAI GPT model. No user authentication is used in this simple flow.
To add security for the documents, you need to update the enterprise chat app:
- Add client authentication to the chat app with Microsoft Entra.
- Add server-side logic to populate a search index with user and group access.
Azure AI Search doesn't provide native document-level permissions and can't vary search results from within an index by user permissions. Instead, your application can use search filters to ensure a document is accessible to a specific user or by a specific group. Within your search index, each document should have a filterable field that stores user or group identity information.
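As a rough illustration of the filtering approach described above, the following Python sketch builds an OData filter string that restricts results to documents whose access control fields contain the signed-in user's ID or one of their group IDs. The `oids` field name comes from this sample. The `groups` field name and the `build_security_filter` helper are assumptions made for illustration, not the sample's actual implementation.

```python
from typing import Sequence


def _join(values: Sequence[str]) -> str:
    # search.in accepts one delimited string; double any single quotes for OData.
    return ", ".join(value.replace("'", "''") for value in values)


def build_security_filter(user_oid: str, group_ids: Sequence[str] = ()) -> str:
    """Return an OData filter that matches documents the user may see.

    Assumes the index has filterable collection fields named `oids` and
    `groups` (the second name is a hypothetical choice for this sketch).
    """
    if not user_oid:
        # Refuse to build a filter without an identity, so that a missing
        # user never results in an unfiltered query.
        raise ValueError("user_oid is required")
    clauses = [f"oids/any(g:search.in(g, '{_join([user_oid])}'))"]
    if group_ids:
        clauses.append(f"groups/any(g:search.in(g, '{_join(group_ids)}'))")
    return "(" + " or ".join(clauses) + ")"


print(build_security_filter("00000000-0000-0000-0000-000000000000", ["group-a", "group-b"]))
```

The resulting string would be passed as the filter of a search request, so documents that don't list the user or one of their groups are never returned to the chat app.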
Because the authorization isn't natively contained in Azure AI Search, you need to add a field to hold user or group information, then filter any documents that don't match. To implement this technique, you need to:
- Create a document access control field in your index dedicated to storing the details of users or groups with document access.
- Populate the document's access control field with the relevant user or group details.
- Update this access control field whenever there are changes in user or group access permissions.
- If your index updates are scheduled with an indexer, changes are picked up on the next indexer run. If you don't use an indexer, you need to manually reindex.
In this article, the process of securing documents in Azure AI Search is made possible with example scripts, which you as the search administrator would run. The scripts associate a single document with a single user identity. You can take these scripts and apply your own security and productionizing requirements to scale to your needs.
Determine security configuration
The solution provides boolean environment variables to turn on features necessary for document security in this sample.
Parameter | Purpose |
---|---|
AZURE_USE_AUTHENTICATION | When set to true, enables user sign-in to the chat app and App Service authentication. Enables Use oid security filter in the chat app Developer settings. |
AZURE_ENFORCE_ACCESS_CONTROL | When set to true, requires authentication for any document access. The Developer settings for oid and group security are turned on and locked, so they can't be turned off from the UI. |
AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS | When set to true, this setting allows authenticated users to search on documents that have no access controls assigned, even when access control is required. This parameter should only be used when AZURE_ENFORCE_ACCESS_CONTROL is enabled. |
AZURE_ENABLE_UNAUTHENTICATED_ACCESS | When set to true, this setting allows unauthenticated users to use the app, even when access control is enforced. This parameter should only be used when AZURE_ENFORCE_ACCESS_CONTROL is enabled. |
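The dependency between these variables can be checked before deployment. The following is a minimal sketch, not part of the sample, that parses the four variables from a mapping such as `os.environ` and reports any setting that the table says should only be used together with AZURE_ENFORCE_ACCESS_CONTROL. The helper names `parse_flags` and `check_flags` are hypothetical.

```python
from typing import Dict, List, Mapping

FLAGS = (
    "AZURE_USE_AUTHENTICATION",
    "AZURE_ENFORCE_ACCESS_CONTROL",
    "AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS",
    "AZURE_ENABLE_UNAUTHENTICATED_ACCESS",
)

# Settings that the table says should only be used with access control enforced.
DEPENDENT_FLAGS = (
    "AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS",
    "AZURE_ENABLE_UNAUTHENTICATED_ACCESS",
)


def parse_flags(env: Mapping[str, str]) -> Dict[str, bool]:
    """Treat a variable as enabled only when its value is 'true' (case-insensitive)."""
    return {name: env.get(name, "").strip().lower() == "true" for name in FLAGS}


def check_flags(flags: Mapping[str, bool]) -> List[str]:
    """Return a warning for each dependent flag set without access control enforced."""
    if flags["AZURE_ENFORCE_ACCESS_CONTROL"]:
        return []
    return [
        f"{name} should only be used when AZURE_ENFORCE_ACCESS_CONTROL is enabled"
        for name in DEPENDENT_FLAGS
        if flags[name]
    ]


flags = parse_flags({"AZURE_USE_AUTHENTICATION": "true", "AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS": "true"})
for warning in check_flags(flags):
    print(warning)
```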
Use the following sections to understand the security profiles supported in this sample. This article configures the Enterprise profile.
Enterprise: Required account + document filter
Each user of the site must sign in. The site contains content that is public to all users. The document-level security filter is applied to all requests.
Environment variables:
- AZURE_USE_AUTHENTICATION=true
- AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS=true
- AZURE_ENFORCE_ACCESS_CONTROL=true
Mixed use: Optional account + document filter
Signing in is optional for each user of the site. The site contains content that is public to all users. The document-level security filter is applied to all requests.
Environment variables:
- AZURE_USE_AUTHENTICATION=true
- AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS=true
- AZURE_ENFORCE_ACCESS_CONTROL=true
- AZURE_ENABLE_UNAUTHENTICATED_ACCESS=true
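The two profiles above differ only in AZURE_ENABLE_UNAUTHENTICATED_ACCESS. As an illustrative sketch, the profiles can be expressed as data and turned into the corresponding `azd env set` commands. The `PROFILES` dictionary, its keys, and the `azd_commands` helper are names chosen for this example and don't exist in the sample.

```python
from typing import Dict, List

# Profile definitions copied from the environment variable lists above.
PROFILES: Dict[str, Dict[str, bool]] = {
    "enterprise": {
        "AZURE_USE_AUTHENTICATION": True,
        "AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS": True,
        "AZURE_ENFORCE_ACCESS_CONTROL": True,
    },
    "mixed": {
        "AZURE_USE_AUTHENTICATION": True,
        "AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS": True,
        "AZURE_ENFORCE_ACCESS_CONTROL": True,
        "AZURE_ENABLE_UNAUTHENTICATED_ACCESS": True,
    },
}


def azd_commands(profile: str) -> List[str]:
    """Build one `azd env set` command per variable in the chosen profile."""
    settings = PROFILES[profile]
    return [f"azd env set {name} {str(value).lower()}" for name, value in settings.items()]


for command in azd_commands("enterprise"):
    print(command)
```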
Prerequisites
A development container environment is available with all dependencies required to complete this article. You can run the development container in GitHub Codespaces (in a browser) or locally using Visual Studio Code.
To use this article, you need the following prerequisites:
- Azure subscription. Create one for free
- Azure account permissions. Your Azure account must have:
  - Permission to manage applications in Microsoft Entra ID.
  - Microsoft.Authorization/roleAssignments/write permissions, such as User Access Administrator or Owner.
- Access granted to Azure OpenAI in the desired Azure subscription. Currently, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access.
You need more prerequisites depending on your preferred development environment.
Open development environment
Begin now with a development environment that has all the dependencies installed to complete this article.
GitHub Codespaces runs a development container managed by GitHub with Visual Studio Code for the Web as the user interface. For the most straightforward development environment, use GitHub Codespaces so that you have the correct developer tools and dependencies preinstalled to complete this article.
Important
All GitHub accounts can use Codespaces for up to 60 hours free each month with 2 core instances. For more information, see GitHub Codespaces monthly included storage and core hours.
Start the process to create a new GitHub Codespace on the main branch of the Azure-Samples/azure-search-openai-demo GitHub repository.
Right-click on the following button, and select Open link in new window in order to have both the development environment and the documentation available at the same time.
On the Create codespace page, review the codespace configuration settings and then select Create new codespace.
Wait for the codespace to start. This startup process can take a few minutes.
In the terminal at the bottom of the screen, sign in to Azure with the Azure Developer CLI.
azd auth login
Complete the authentication process.
The remaining tasks in this article take place in the context of this development container.
Get required information with Azure CLI
Get your subscription ID and tenant ID with the following Azure CLI command. Copy the tenant ID value to use as your AZURE_TENANT_ID.
az account list --query "[].{subscription_id:id, name:name, tenantId:tenantId}" -o table
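If you prefer to pick the tenant ID programmatically, the same command can be run with `-o json` instead of `-o table`, and the output parsed. The following sketch assumes JSON output with the `subscription_id`, `name`, and `tenantId` keys produced by the query above. The `tenant_for_subscription` helper and the sample values are hypothetical.

```python
import json


def tenant_for_subscription(accounts_json: str, subscription_id: str) -> str:
    """Return the tenantId of the account whose subscription_id matches."""
    for account in json.loads(accounts_json):
        if account["subscription_id"] == subscription_id:
            return account["tenantId"]
    raise LookupError(f"No account found for subscription {subscription_id}")


# Placeholder data in the shape the query above would produce with -o json.
sample_output = json.dumps([
    {"subscription_id": "11111111-1111-1111-1111-111111111111",
     "name": "Example subscription",
     "tenantId": "22222222-2222-2222-2222-222222222222"},
])
print(tenant_for_subscription(sample_output, "11111111-1111-1111-1111-111111111111"))
```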
If you get an error about your tenant's conditional access policy, you need a second tenant without a conditional access policy.
- Your first tenant, associated with your user account, is used for the AZURE_TENANT_ID environment variable.
- Your second tenant, without conditional access, is used for the AZURE_AUTH_TENANT_ID environment variable to access Microsoft Graph. For tenants with a conditional access policy, find the ID of a second tenant without a conditional access policy or create a new tenant.
Set environment variables
Run the following commands to configure the application for the Enterprise profile.
azd env set AZURE_USE_AUTHENTICATION true
azd env set AZURE_ENABLE_GLOBAL_DOCUMENTS_ACCESS true
azd env set AZURE_ENFORCE_ACCESS_CONTROL true
Run the following command to set the tenant, which authorizes the user sign-in to the hosted application environment. Replace <YOUR_TENANT_ID> with the tenant ID.
azd env set AZURE_TENANT_ID <YOUR_TENANT_ID>
Note
If you have a conditional access policy on your user tenant, you need to specify an authentication tenant.
Deploy chat app to Azure
Deployment includes creating the Azure resources, uploading the documents, creating the Microsoft Entra identity apps (client & server), and turning on identity for the hosting resource.
Run the following Azure Developer CLI command to provision the Azure resources and deploy the source code:
azd up
Use the following table to answer the AZD deployment prompts:
Prompt | Answer |
---|---|
Environment name | Use a short name with identifying information such as your alias and app: tjones-secure-chat. |
Subscription | Select a subscription to create the resources in. |
Location for Azure resources | Select a location near you. |
Location for documentIntelligentResourceGroupLocation | Select a location near you. |
Location for openAIResourceGroupLocation | Select a location near you. |
Wait 5 or 10 minutes after the app is deployed to allow the app to start up.
After the application has been successfully deployed, you see a URL displayed in the terminal.
Select the URL labeled (✓) Done: Deploying service webapp to open the chat application in a browser.
Agree to the app authentication pop-up.
When the chat app is displayed, notice in the top right corner that your user is signed in.
Open Developer settings and notice both these options are selected and greyed out (disabled for change).
- Use oid security filter
- Use groups security filter
Select the card with What does a product manager do?
You get an answer like:
The provided sources do not contain specific information about the role of a Product Manager at Contoso Electronics.
Open access to a document for a user
Turn on your permissions for the exact document so you can get the answer. This requires several pieces of information:
- Azure Storage
  - Account name
  - Container name
  - Blob/document URL for role_library.pdf
- User's ID in Microsoft Entra ID
Once this information is known, update the Azure AI Search index oids field for the role_library.pdf document.
Get the URL for a document in storage
In the .azure folder at the root of the project, find the environment directory, and open the .env file within that directory.
Search for the AZURE_STORAGE_ACCOUNT entry and copy its value.
Use the following Azure CLI command to get the URL of the role_library.pdf blob in the content container.
az storage blob url \
    --account-name <REPLACE_WITH_AZURE_STORAGE_ACCOUNT> \
    --container-name 'content' \
    --name 'role_library.pdf'
Parameter | Purpose |
---|---|
--account-name | Azure Storage account name |
--container-name | The container name in this sample is content |
--name | The blob name in this step is role_library.pdf |
Copy the blob URL to use later.
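The blob URL follows a predictable pattern, so it can also be assembled from the account, container, and blob names. This is a minimal sketch that assumes the default public-cloud endpoint (`blob.core.windows.net`) with no custom domain. The `blob_url` helper is a hypothetical name, and `mystorage` is a placeholder account.

```python
from urllib.parse import quote


def blob_url(account_name: str, container: str, blob_name: str) -> str:
    """Compose a blob URL; assumes the default public Azure Storage endpoint."""
    # quote() percent-encodes characters such as spaces while keeping "/" intact.
    return f"https://{account_name}.blob.core.windows.net/{quote(container)}/{quote(blob_name)}"


print(blob_url("mystorage", "content", "role_library.pdf"))
```

Compare the result with the output of the Azure CLI command before using it, because sovereign clouds and custom domains use different host names.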
Get your user ID
- In the chat app, select Developer settings.
- In the ID Token claims section, copy your objectidentifier. This is known in the next section as the USER_OBJECT_ID.
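For context, the object ID shown in the Developer settings originates from the ID token issued by Microsoft Entra ID, where it is carried in the `oid` claim. The following sketch decodes the payload of a JSON Web Token for inspection only. It doesn't verify the signature, so it must not be used for authorization decisions. The `read_id_token_claims` helper and the sample token are illustrative assumptions.

```python
import base64
import json


def read_id_token_claims(id_token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("Expected a token with three dot-separated segments")
    payload = parts[1]
    # JWT segments omit base64 padding, so restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


# Build an unsigned placeholder token to demonstrate the decoding step.
claims = {"oid": "00000000-0000-0000-0000-000000000000"}
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
print(read_id_token_claims(f"e30.{segment}.signature")["oid"])
```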
Provide user access to a document in Azure Search
Use the following script to change the oids field in Azure AI Search for role_library.pdf so you have access to it.
./scripts/manageacl.sh \
    -v \
    --acl-type oids \
    --acl-action add \
    --acl <REPLACE_WITH_YOUR_USER_OBJECT_ID> \
    --url <REPLACE_WITH_YOUR_DOCUMENT_URL>
Parameter | Purpose |
---|---|
-v | Verbose output. |
--acl-type | Group or user object IDs (OIDs): oids |
--acl-action | Add to a Search index field. Other options include remove, remove_all, list. |
--acl | Group or user's USER_OBJECT_ID |
--url | The file's location in Azure storage, such as https://MYSTORAGENAME.blob.core.windows.net/content/role_library.pdf. Don't surround URL with quotes in the CLI command. |
The console output for this command looks like:
Loading azd .env file from current environment...
Creating Python virtual environment "app/backend/.venv"...
Installing dependencies from "requirements.txt" into virtual environment (in quiet mode)...
Running manageacl.py. Arguments to script: -v --acl-type oids --acl-action add --acl 00000000-0000-0000-0000-000000000000 --url https://mystorage.blob.core.windows.net/content/role_library.pdf
Found 58 search documents with storageUrl https://mystorage.blob.core.windows.net/content/role_library.pdf
Adding acl 00000000-0000-0000-0000-000000000000 to 58 search documents
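The output shows that one source file maps to many search documents (chunks), and that each one receives the new ID. The following in-memory sketch models that behavior so the effect is easier to see. It isn't the sample's manageacl.py implementation and doesn't call Azure AI Search. The `add_acl` helper and the sample documents are hypothetical, while the `storageUrl` and `oids` field names come from the output above.

```python
from typing import Dict, List


def add_acl(documents: List[Dict], url: str, acl_type: str, acl: str) -> int:
    """Add `acl` to the `acl_type` field of every document stored at `url`.

    Returns the number of matching documents. A document that already lists
    the ID is left unchanged, so running the function twice is harmless.
    """
    matched = [doc for doc in documents if doc.get("storageUrl") == url]
    for doc in matched:
        values = doc.setdefault(acl_type, [])
        if acl not in values:
            values.append(acl)
    return len(matched)


url = "https://mystorage.blob.core.windows.net/content/role_library.pdf"
index = [
    {"id": "chunk-1", "storageUrl": url, "oids": []},
    {"id": "chunk-2", "storageUrl": url, "oids": []},
    {"id": "chunk-3", "storageUrl": "https://mystorage.blob.core.windows.net/content/other.pdf", "oids": []},
]
count = add_acl(index, url, "oids", "00000000-0000-0000-0000-000000000000")
print(f"Updated {count} search documents")
```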
Optionally, use the following command to verify your permission is listed for the file in Azure AI Search.
./scripts/manageacl.sh \
    -v \
    --acl-type oids \
    --acl-action list \
    --acl <REPLACE_WITH_YOUR_USER_OBJECT_ID> \
    --url <REPLACE_WITH_YOUR_DOCUMENT_URL>
Parameter | Purpose |
---|---|
-v | Verbose output. |
--acl-type | Group or user object IDs (OIDs): oids |
--acl-action | List the values of a Search index field, such as oids. Other options include add, remove, remove_all. |
--acl | Group or user's USER_OBJECT_ID |
--url | The file's location in Azure storage, such as https://MYSTORAGENAME.blob.core.windows.net/content/role_library.pdf. Don't surround URL with quotes in the CLI command. |
The console output for this command looks like:
Loading azd .env file from current environment...
Creating Python virtual environment "app/backend/.venv"...
Installing dependencies from "requirements.txt" into virtual environment (in quiet mode)...
Running manageacl.py. Arguments to script: -v --acl-type oids --acl-action view --acl 00000000-0000-0000-0000-000000000000 --url https://mystorage.blob.core.windows.net/content/role_library.pdf
Found 58 search documents with storageUrl https://mystorage.blob.core.windows.net/content/role_library.pdf
[00000000-0000-0000-0000-000000000000]
The array at the end of the output includes your USER_OBJECT_ID and is used to determine if the document is used in the answer with Azure OpenAI.
Verify Azure AI Search contains your USER_OBJECT_ID
Open the Azure portal and search for AI Search.
Select your search resource from the list.
Select Search management -> Indexes.
Select the gptkbindex.
Select View -> JSON view.
Replace the JSON with the following JSON.
{ "search": "*", "select": "sourcefile, oids", "filter": "oids/any()" }
This searches all documents where the oids field has any value and returns the sourcefile and oids fields.
If role_library.pdf doesn't have your oid, return to the Provide user access to a document in Azure Search section and complete the steps.
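If the result list is long, checking it by eye can be error-prone. The following sketch assumes you copied the result items from the JSON view into a Python list of dictionaries with `sourcefile` and `oids` keys, and checks that every chunk of a given source file lists your ID. The `has_access` helper and the sample results are hypothetical.

```python
from typing import Dict, List


def has_access(results: List[Dict], sourcefile: str, user_oid: str) -> bool:
    """Return True when every result for `sourcefile` lists `user_oid` in oids."""
    matching = [item for item in results if item.get("sourcefile") == sourcefile]
    # No matching chunks means the file has no ACL entries at all.
    return bool(matching) and all(user_oid in item.get("oids", []) for item in matching)


# Placeholder items in the shape returned by the query above.
results = [
    {"sourcefile": "role_library.pdf", "oids": ["00000000-0000-0000-0000-000000000000"]},
    {"sourcefile": "role_library.pdf", "oids": ["00000000-0000-0000-0000-000000000000"]},
]
print(has_access(results, "role_library.pdf", "00000000-0000-0000-0000-000000000000"))
```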
Verify user access to the document
If you completed the steps but didn't see the correct answer, verify your USER_OBJECT_ID is set correctly in Azure AI Search for role_library.pdf.
Return to the chat app. You may need to sign in again.
Enter the same query so that the role_library content is used in the Azure OpenAI answer: What does a product manager do?
View the result, which now includes the appropriate answer from the role library document.
Clean up resources
Clean up Azure resources
The Azure resources created in this article are billed to your Azure subscription. If you don't expect to need these resources in the future, delete them to avoid incurring more charges.
Run the following Azure Developer CLI command to delete the Azure resources and remove the source code:
azd down --purge
Clean up GitHub Codespaces
Deleting the GitHub Codespaces environment ensures that you can maximize the amount of free per-core hours entitlement you get for your account.
Important
For more information about your GitHub account's entitlements, see GitHub Codespaces monthly included storage and core hours.
Sign in to the GitHub Codespaces dashboard (https://github.com/codespaces).
Locate your currently running Codespaces sourced from the Azure-Samples/azure-search-openai-demo GitHub repository.
Open the context menu for the codespace and then select Delete.
Get help
This sample repository offers troubleshooting information.
Troubleshooting
This section offers troubleshooting for issues specific to this article.
Provide authentication tenant
When your authentication is in a separate tenant from your hosting application, you need to set that authentication tenant with the following process.
Run the following command to configure the sample to use a second tenant for the authentication tenant.
azd env set AZURE_AUTH_TENANT_ID <REPLACE-WITH-YOUR-TENANT-ID>
Parameter | Purpose |
---|---|
AZURE_AUTH_TENANT_ID | If AZURE_AUTH_TENANT_ID is set, it's the tenant that hosts the app. |
Redeploy the solution with the following command.
azd up
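The relationship between the two tenant variables can be summarized in a short sketch. It assumes that authentication falls back to AZURE_TENANT_ID when AZURE_AUTH_TENANT_ID isn't set. That fallback is an assumption made for illustration rather than behavior confirmed by this article, and the `resolve_tenants` helper is hypothetical.

```python
from typing import Dict, Mapping


def resolve_tenants(env: Mapping[str, str]) -> Dict[str, str]:
    """Pick the tenant for the user account and the tenant for authentication."""
    app_tenant = env["AZURE_TENANT_ID"]
    # Assumption: without a separate authentication tenant, one tenant serves both roles.
    auth_tenant = env.get("AZURE_AUTH_TENANT_ID") or app_tenant
    return {"app_tenant": app_tenant, "auth_tenant": auth_tenant}


print(resolve_tenants({"AZURE_TENANT_ID": "tenant-a", "AZURE_AUTH_TENANT_ID": "tenant-b"}))
```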