User your own data with Azure Open AI - How can we implement document level security?

Question

Dear Team,

Currently we have a bot built in Azure Open AI with Organization Data and it uses Azure Blob Storage as our data source reference.

As we are adding more and more documents to feed our bot, we need to control its responses based on the user who is asking the question. The goal is to have the response to be limited to the document that the user has access to. Is there a simpler way to do this? I am looking into document level security article but there was no step-by-step guidance on how to implement this.

Would appreciate if anyone can suggest a way to do this without having to create multiple agents or bots.

Thank you so much.

Answer

Start by creating the roles in AAD with specific permissions for each document or set of documents stored in Blob Storage, then assign them to the users who need access to the respective documents.

Organize your Blob Storage such that each document or set of documents is in a distinct container or directory that matches your access control needs.

Now comes the step to implement the access control in the chatbot :

Your bot should authenticate users using Azure AD. You can use OAuth 2.0 or another authentication mechanism supported by Azure AD.
Once authenticated, retrieve the user roles and permissions using Azure AD Graph API or Microsoft Graph API.
Maintain a mapping between roles and document paths in your Blob Storage. and you can store it in a secure database or configuration file.

For the query handling based on user permissions :

Pre-process User Query: When a user asks a question, pre-process the query to identify the relevant documents the user has access to.
Filter Documents: Based on the user's roles and the mapping created, filter the documents that the user is allowed to access.
Search within Permitted Documents: Perform the search or response generation only within the subset of documents the user has access to.

Here is how you can implement it :


from azure.identity import DefaultAzureCredential

from azure.storage.blob import BlobServiceClient

from azure.ai.openai import OpenAIClient

# Initialize Azure Blob Storage and OpenAI Client

credential = DefaultAzureCredential()

blob_service_client = BlobServiceClient(account_url="https://.blob.core.windows.net", credential=credential)

openai_client = OpenAIClient(endpoint="https://.openai.azure.com", credential=credential)

def get_user_permissions(user_id):

    # Fetch user roles from Azure AD

    # Implement the logic to fetch user roles

    user_roles = fetch_user_roles_from_aad(user_id)

    return user_roles

def filter_documents_based_on_permissions(user_roles):

    # Map user roles to document paths

    role_to_docs_map = {

        "role1": ["doc_path1", "doc_path2"],

        "role2": ["doc_path3", "doc_path4"],

    }

    accessible_docs = []

    for role in user_roles:

        accessible_docs.extend(role_to_docs_map.get(role, []))

    return accessible_docs

def search_documents(user_query, accessible_docs):

    results = []

    for doc_path in accessible_docs:

        blob_client = blob_service_client.get_blob_client(container="", blob=doc_path)

        blob_data = blob_client.download_blob().readall()

        # Implement your logic to search within the document

        if user_query in blob_data:

            results.append(blob_data)

    return results

def generate_response(user_query, user_id):

    user_roles = get_user_permissions(user_id)

    accessible_docs = filter_documents_based_on_permissions(user_roles)

    search_results = search_documents(user_query, accessible_docs)

    # Use OpenAI API to generate response based on search results

    response = openai_client.generate_response(search_results)

    return response

# Example usage

user_query = "What is the company's policy on data security?"

user_id = ""

response = generate_response(user_query, user_id)

print(response)

Share via

User your own data with Azure Open AI - How can we implement document level security?

1 answer

Your answer