How do you use RAG in the REST API?

Bao, Jeremy (Cognizant) 80 Reputation points

I am having some trouble figuring out how to perform RAG using the REST API for Azure OpenAI. I saw this link, but am confused about the differences between this interface and the Python SDK. I currently have the following configuration for "data_sources" in my request body. It is probably wrong, though no errors are being returned.

"data_sources": [
    {
        "type": "AzureCognitiveSearch",
        "parameters": {
            "endpoint": ...,
            "key": "{{AZURE_SEARCH_KEY}}",
            "index_name": ...,
            "vector_profile": ...,
            "vector_field": ...
        }
    }
]

The authentication field is listed as required, but I do not have it, and the request seems to be working. I just have the key for the Azure AI Search service here, and include the key for the Azure OpenAI service in the header. Do I need to replace key with some ApiKeyAuthenticationOptions object there? Should it be like:

"authentication": {
    "key": "{{AZURE_SEARCH_KEY}}",
    "type": "api_key"
}


Also, am I supposed to use some embedding_dependency field instead of vector_profile and vector_field? What is the difference between DeploymentNameVectorizationSource and EndpointVectorizationSource? Just what is contained within them? If both the GPT deployment and the embedding deployment are in the same Azure OpenAI service, I can just use a DeploymentNameVectorizationSource, right?

Finally, what is the difference between the role_information field and the first message with "role": "system" in the array of messages?

Azure AI Search
Azure OpenAI Service

1 answer

  1. brtrach-MSFT 15,166 Reputation points Microsoft Employee

    @Bao, Jeremy (Cognizant) To use RAG with the REST API for Azure OpenAI, you can follow the steps below:

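    1. First, you need a chat model deployment (for example, GPT-3.5 Turbo or GPT-4) in your Azure OpenAI resource. There is no separate RAG model to deploy; RAG here means a chat completions request that is grounded on a data source such as an Azure AI Search index. You can create the deployment in the Azure portal or with the Azure CLI. Note the deployment name, because it is part of the request URL.
    2. To use RAG with the REST API, send a POST request to the chat completions endpoint of that deployment: https://<your-resource>.openai.azure.com/openai/deployments/<deployment-name>/chat/completions?api-version=<api-version>. In the request body, include the conversation in messages and the search index in data_sources, along with any additional parameters you want to use. Here is an example request body:
             {
                 "messages": [{"role": "user", "content": "What is the capital of France?"}],
                 "temperature": 0.5,
                 "max_tokens": 50,
                 "top_p": 1,
                 "data_sources": [{
                     "type": "azure_search",
                     "parameters": {
                         "endpoint": "<your-search-endpoint>",
                         "index_name": "<your-index-name>",
                         "authentication": {"type": "api_key", "key": "<your-search-key>"}
                     }
                 }]
             }
      In this example, the user message is answered using documents retrieved from the search index, and temperature, max_tokens, and top_p are ordinary chat completions parameters. In the API versions that document data_sources in this snake_case form, the type for an Azure AI Search index is azure_search. The value AzureCognitiveSearch comes from older preview versions, so check it against the reference for your api-version.
    3. In the request header, you need to include your Azure OpenAI API key. Here is an example header:
             "Content-Type": "application/json",
             "api-key": "<your-api-key>"
      Replace <your-api-key> with your actual API key. The Authorization: Bearer header is only for Microsoft Entra ID access tokens, not for API keys.
    4. Send the POST request to the chat completions endpoint. The generated answer is in the message content of the first choice, and the retrieved documents are returned alongside it in the message context.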
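
    A request of the kind the question describes (chat completions with data_sources) can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a definitive client: the resource URL, deployment name, API version, and the azure_search type are placeholder assumptions, and build_request and send are hypothetical helper names.

```python
import json
import urllib.request

# All values below are placeholders (assumptions), not real resources.
RESOURCE = "https://my-resource.openai.azure.com"  # hypothetical Azure OpenAI endpoint
DEPLOYMENT = "my-chat-deployment"                  # hypothetical chat deployment name
API_VERSION = "2024-02-01"                         # assumed; use the version you target


def build_request(question, aoai_key, search_endpoint, search_key, index_name):
    """Build (but do not send) a chat completions request with a data source."""
    url = (f"{RESOURCE}/openai/deployments/{DEPLOYMENT}"
           f"/chat/completions?api-version={API_VERSION}")
    body = {
        "messages": [{"role": "user", "content": question}],
        "temperature": 0.5,
        "max_tokens": 50,
        "data_sources": [{
            # "azure_search" is assumed for this API version; older preview
            # versions use "AzureCognitiveSearch" instead.
            "type": "azure_search",
            "parameters": {
                "endpoint": search_endpoint,
                "index_name": index_name,
                # The Azure AI Search key goes here, not in the HTTP header.
                "authentication": {"type": "api_key", "key": search_key},
            },
        }],
    }
    # The Azure OpenAI key goes in the api-key header.
    headers = {"Content-Type": "application/json", "api-key": aoai_key}
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST")


def send(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

    Calling send(build_request(...)) with real values performs the request. Keeping the two steps separate makes it easy to print and inspect the body before anything is sent.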

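    Regarding your questions about the request body, the data_sources field is what makes the request a RAG request; without it, the call is an ordinary chat completion. There is no model field to set for this, because the deployment name in the URL selects the model. As far as I can tell, vector_profile and vector_field are not among the documented parameters for this data source, and unrecognized fields may simply be ignored, which would explain why you see no errors. Vector search is instead configured with query_type, embedding_dependency, and the vector fields listed in fields_mapping, so yes, use embedding_dependency.

    When it comes to authentication, there are two separate keys. Your Azure OpenAI API key goes in the api-key request header. The key for the Azure AI Search service goes in the authentication object inside the data source parameters, in the form you proposed, with "type": "api_key" and "key". A key placed directly under parameters is the shape used by older preview API versions, which may be why it appears to work, but I would switch to the authentication object.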

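    The role_information field is not the same thing as the role of a message. It is a string in the data source parameters that gives the model instructions about how it should behave and what context to reference when answering from the retrieved data. The first message with "role": "system" is the standard system message of the chat completions API. How the two are combined has changed between API versions, so check the reference for the api-version you call; if you set both, keep them consistent.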
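
    To make the structural difference concrete, the sketch below only shows where each of the two values sits in the request body. It makes no claim about how the service combines them. The function name build_body, the placeholder endpoint and index strings, and the azure_search type are assumptions for illustration.

```python
def build_body(question, system_message=None, role_information=None):
    """Place a system message and role_information where each belongs."""
    messages = []
    if system_message is not None:
        # The system message is an ordinary entry in the messages array.
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": question})

    # Placeholder values; a real request needs a real endpoint, index,
    # and an authentication object.
    parameters = {"endpoint": "<search-endpoint>", "index_name": "<index-name>"}
    if role_information is not None:
        # role_information is a parameter of the data source, not a message.
        parameters["role_information"] = role_information

    return {
        "messages": messages,
        "data_sources": [{"type": "azure_search", "parameters": parameters}],
    }
```

    For example, build_body("What is our refund policy?", role_information="Answer only from the retrieved documents.") leaves the messages array with a single user message and puts the instruction inside the data source parameters.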

    Lastly, DeploymentNameVectorizationSource and EndpointVectorizationSource are the two kinds of value you can give the embedding_dependency parameter. Both tell the service which embedding model to use to vectorize the query. A DeploymentNameVectorizationSource has "type": "deployment_name" and a deployment_name, which is the name of an embedding deployment in the same Azure OpenAI resource that serves the chat request. An EndpointVectorizationSource has "type": "endpoint", the endpoint URL of the embedding deployment, and its own authentication, so it can point at a different resource. So yes, if the GPT deployment and the embedding deployment are in the same Azure OpenAI service, you can just use a DeploymentNameVectorizationSource.
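
    The two embedding_dependency shapes can be sketched as plain dictionaries. This is a sketch under assumptions: the field names follow the snake_case form used by newer API versions (older preview versions use different casing), the helper names are hypothetical, and "vector" as the query_type is just one possible choice.

```python
def deployment_name_source(deployment_name):
    """Embedding deployment in the same Azure OpenAI resource as the chat model."""
    return {"type": "deployment_name", "deployment_name": deployment_name}


def endpoint_source(endpoint, api_key):
    """Embedding deployment reached by URL, with its own credentials."""
    return {
        "type": "endpoint",
        "endpoint": endpoint,
        "authentication": {"type": "api_key", "key": api_key},
    }


def with_vector_search(parameters, embedding_dependency, query_type="vector"):
    """Return a copy of the data source parameters with vector search enabled."""
    updated = dict(parameters)  # shallow copy so the caller's dict is untouched
    updated["query_type"] = query_type
    updated["embedding_dependency"] = embedding_dependency
    return updated
```

    With both deployments in one resource, with_vector_search(parameters, deployment_name_source("my-embedding-deployment")) is enough, where my-embedding-deployment is a placeholder name.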