User your own data with Azure Open AI - How can we implement document level security?

Abigail Joyce Cuadra 20 Reputation points
2024-07-29T12:54:42.67+00:00

Dear Team,

Currently we have a bot built in Azure Open AI with Organization Data and it uses Azure Blob Storage as our data source reference.

As we are adding more and more documents to feed our bot, we need to control its responses based on the user who is asking the question. The goal is to have the response to be limited to the document that the user has access to. Is there a simpler way to do this? I am looking into document level security article but there was no step-by-step guidance on how to implement this.

Would appreciate if anyone can suggest a way to do this without having to create multiple agents or bots.

Thank you so much.

Azure Blob Storage
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
2,855 questions
Azure Role-based access control
Azure Role-based access control
An Azure service that provides fine-grained access management for Azure resources, enabling you to grant users only the rights they need to perform their jobs.
809 questions
Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
3,066 questions
Azure AI services
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.
2,847 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 24,556 Reputation points
    2024-07-30T09:41:15.8166667+00:00

    Start by creating the roles in AAD with specific permissions for each document or set of documents stored in Blob Storage, then assign them to the users who need access to the respective documents.

    Organize your Blob Storage such that each document or set of documents is in a distinct container or directory that matches your access control needs.

    Now comes the step to implement the access control in the chatbot :

    1. Your bot should authenticate users using Azure AD. You can use OAuth 2.0 or another authentication mechanism supported by Azure AD.
    2. Once authenticated, retrieve the user roles and permissions using Azure AD Graph API or Microsoft Graph API.
    3. Maintain a mapping between roles and document paths in your Blob Storage. and you can store it in a secure database or configuration file.

    For the query handling based on user permissions :

    1. Pre-process User Query: When a user asks a question, pre-process the query to identify the relevant documents the user has access to.
    2. Filter Documents: Based on the user's roles and the mapping created, filter the documents that the user is allowed to access.
    3. Search within Permitted Documents: Perform the search or response generation only within the subset of documents the user has access to.

    Here is how you can implement it :

    
    from azure.identity import DefaultAzureCredential
    
    from azure.storage.blob import BlobServiceClient
    
    from azure.ai.openai import OpenAIClient
    
    # Initialize Azure Blob Storage and OpenAI Client
    
    credential = DefaultAzureCredential()
    
    blob_service_client = BlobServiceClient(account_url="https://<your_storage_account>.blob.core.windows.net", credential=credential)
    
    openai_client = OpenAIClient(endpoint="https://<your_openai_endpoint>.openai.azure.com", credential=credential)
    
    def get_user_permissions(user_id):
    
        # Fetch user roles from Azure AD
    
        # Implement the logic to fetch user roles
    
        user_roles = fetch_user_roles_from_aad(user_id)
    
        return user_roles
    
    def filter_documents_based_on_permissions(user_roles):
    
        # Map user roles to document paths
    
        role_to_docs_map = {
    
            "role1": ["doc_path1", "doc_path2"],
    
            "role2": ["doc_path3", "doc_path4"],
    
        }
    
        accessible_docs = []
    
        for role in user_roles:
    
            accessible_docs.extend(role_to_docs_map.get(role, []))
    
        return accessible_docs
    
    def search_documents(user_query, accessible_docs):
    
        results = []
    
        for doc_path in accessible_docs:
    
            blob_client = blob_service_client.get_blob_client(container="<your_container>", blob=doc_path)
    
            blob_data = blob_client.download_blob().readall()
    
            # Implement your logic to search within the document
    
            if user_query in blob_data:
    
                results.append(blob_data)
    
        return results
    
    def generate_response(user_query, user_id):
    
        user_roles = get_user_permissions(user_id)
    
        accessible_docs = filter_documents_based_on_permissions(user_roles)
    
        search_results = search_documents(user_query, accessible_docs)
    
        # Use OpenAI API to generate response based on search results
    
        response = openai_client.generate_response(search_results)
    
        return response
    
    # Example usage
    
    user_query = "What is the company's policy on data security?"
    
    user_id = "<user_id>"
    
    response = generate_response(user_query, user_id)
    
    print(response)
    
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.