Using RAG (Retrieval-Augmented Generation) in an Azure AI Pipeline over a Private Network

Question

ilyes sais 0

I want to deploy a RAG flow in Azure AI Pipeline entirely privately, without exposing the OpenAI endpoint to the public Internet. My plan is to:

1. Create an Azure Cognitive Search index behind a Private Endpoint in a VNet.

2. Deploy my OpenAI model (GPT-3.5/GPT-4) behind a Private Endpoint in the same VNet.

3. Configure the RAG pipeline so that it retrieves documents from the private index and calls the model via the private endpoint.

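from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

endpoint = "https://<my-private-openai>.openai.azure.com/"

# Authenticate with Microsoft Entra ID rather than an API key.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
# api_version is an assumed value; use one that your resource supports.
client = AzureOpenAI(azure_endpoint=endpoint, azure_ad_token_provider=token_provider, api_version="2024-02-01")

response = client.models.list()  # results in timeout or 403
print(response)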

What am I missing to run a fully private RAG pipeline?

Is there a special setting in the pipeline definition to enforce use of the Private Endpoint?

Are there any known limitations with RAG over a private network in Azure AI Pipeline?


Answer 1

Hi ilyes sais

To run a fully private Retrieval-Augmented Generation (RAG) pipeline in Azure AI Pipeline, with both Azure Cognitive Search (now called Azure AI Search) and Azure OpenAI behind private endpoints, check the following. First, confirm that your Azure OpenAI resource has a Private Endpoint in the Virtual Network (VNet) and that the Network Security Group (NSG) rules allow HTTPS traffic (port 443) from the subnet your pipeline runs in to that endpoint. Second, confirm that your Azure Cognitive Search service is reachable through its own Private Endpoint. Third, confirm that DNS resolution is set up with Azure Private DNS zones linked to the VNet, so that the service hostnames resolve to the private endpoint IP addresses rather than to public ones.
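As a quick check of the DNS part, here is a minimal sketch that resolves a hostname and reports whether every returned address is private. The function names are illustrative helpers, not part of any Azure SDK, and the result only means something when the script runs from a machine inside the VNet.

```python
import ipaddress
import socket
from typing import List


def resolve(hostname: str, port: int = 443) -> List[str]:
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def all_private(addresses: List[str]) -> bool:
    """True only if there is at least one address and all are in private ranges."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


def check_private_resolution(hostname: str) -> bool:
    """Resolve hostname and print whether it maps to private addresses only."""
    addresses = resolve(hostname)
    ok = all_private(addresses)
    print(f"{hostname} -> {addresses} ({'private' if ok else 'NOT private'})")
    return ok
```

For example, call check_private_resolution("<my-private-openai>.openai.azure.com") with your own resource name from a VM or compute instance in the VNet. If it reports a public address, the private DNS zone (privatelink.openai.azure.com for Azure OpenAI) is probably missing or not linked to the VNet.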

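Regarding the timeout or 403 error on the model-listing call: the two symptoms usually have different causes. A timeout generally points to a network problem, such as the hostname resolving to a public address or there being no route from the caller to the private endpoint. A 403 means the request reached the service and was refused, typically because it arrived over the public path while public network access is disabled, or because the calling identity lacks a role assignment. Ensure that the Managed Identity associated with your Azure AI Pipeline has permission to access Azure OpenAI; you may need to explicitly grant it the Cognitive Services OpenAI User role. Also verify that the subnet your code runs in allows outbound traffic to the private endpoint, as some configurations block outbound traffic by default.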
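To tell the network case from the permission case, a rough triage like the one below can help. It is only a heuristic sketch using the Python standard library: the labels and the classify and probe helpers are my own naming, and the probe sends a single unauthenticated GET, so any HTTP response at all (even a 401 or 404) shows that the network path works.

```python
import socket
import urllib.error
import urllib.request
from typing import Optional


def classify(status: Optional[int], timed_out: bool = False) -> str:
    """Map a probe outcome to a coarse label (heuristic, not authoritative)."""
    if timed_out or status is None:
        return "unreachable"      # look at DNS, routing, NSG rules
    if status == 403:
        return "forbidden"        # reached the service; check network access rules and roles
    if status == 401:
        return "unauthenticated"  # reached the service; token missing or not accepted
    return "reachable"            # any other HTTP status: the network path works


def probe(url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to url and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code of a non-2xx response.
        return classify(err.code)
    except (urllib.error.URLError, socket.timeout, TimeoutError):
        # DNS failures, refused connections and timeouts all end up here.
        return classify(None, timed_out=True)
```

Run probe("https://<my-private-openai>.openai.azure.com/") from the same compute the pipeline uses. If the result is "unreachable", fix networking first; role assignments only matter once the request actually reaches the service.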

To enforce the use of Private Endpoints, configure the workspace networking settings to restrict outbound traffic to approved destinations only. In Azure Machine Learning, you do this by enabling the Workspace Managed Virtual Network and defining user-defined outbound rules that allow communication with Azure OpenAI and Azure Cognitive Search. Additionally, make sure your pipeline components explicitly reference the endpoints of the resources that sit behind those private endpoints. The hostname itself does not change; private DNS is what makes it resolve to the private IP address.
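As an illustration of what those outbound rules could look like, the sketch below builds a managed network section and prints it as JSON (which is also valid YAML). The field names follow the managed network section of the workspace YAML as I understand it, the resource IDs are placeholders, and the subresource targets "account" and "searchService" are assumptions that you should verify against the current Azure Machine Learning documentation before using them.

```python
import json
from typing import Dict, List


def private_endpoint_rule(name: str, resource_id: str, subresource: str) -> Dict:
    """Build one outbound rule that targets a resource through a private endpoint."""
    return {
        "name": name,
        "type": "private_endpoint",
        "destination": {
            "service_resource_id": resource_id,
            "subresource_target": subresource,  # assumed value, verify per service
            "spark_enabled": False,
        },
    }


def managed_network(rules: List[Dict]) -> Dict:
    """Wrap the rules in a managed network section that blocks unapproved outbound traffic."""
    return {
        "managed_network": {
            "isolation_mode": "allow_only_approved_outbound",
            "outbound_rules": rules,
        }
    }


# Placeholder resource IDs; replace the <...> parts with your own values.
OPENAI_ID = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>"
    "/providers/Microsoft.CognitiveServices/accounts/<openai-name>"
)
SEARCH_ID = (
    "/subscriptions/<sub-id>/resourceGroups/<rg>"
    "/providers/Microsoft.Search/searchServices/<search-name>"
)

config = managed_network([
    private_endpoint_rule("allow-openai", OPENAI_ID, "account"),
    private_endpoint_rule("allow-search", SEARCH_ID, "searchService"),
])
print(json.dumps(config, indent=2))
```

The idea is that every destination the pipeline needs gets its own rule, and anything not listed is blocked, so a component cannot silently fall back to a public endpoint.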

There are known limitations when running RAG over a private network in Azure AI Pipeline. One key challenge is making sure that every dependency, including vector indexing and model inference, runs inside the private network without needing public internet access. Some Azure services may still require outbound connectivity for metadata retrieval or logging, and your firewall rules must account for that. Private endpoints may also add a small amount of latency compared with public endpoints because of the extra network routing.

For a detailed guide on securing RAG workflows with private networking, refer to Microsoft's documentation.
Thanks

Ravada Shivaprasad 630 Reputation points Microsoft External Staff Moderator

2025-05-27T23:02:21.7066667+00:00

Hi ilyes sais

Just checking in to see if the above answer helped. If it answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.

Thanks
Ravada Shivaprasad 630 Reputation points Microsoft External Staff Moderator

2025-05-28T20:08:55.7633333+00:00

Hi ilyes sais

Following up to see if the above answer was helpful. If it answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.

Thanks
