Azure OpenAI Service REST API preview reference

Article
10/16/2024

This article provides details on the inference REST API endpoints for Azure OpenAI.

API specs

Managing and interacting with Azure OpenAI models and resources is divided across three primary API surfaces:

Control plane
Data plane - authoring
Data plane - inference

Each API surface/specification encapsulates a different set of Azure OpenAI capabilities. Each API has its own unique set of preview and stable/generally available (GA) API releases. Preview releases currently tend to follow a monthly cadence.

API	Latest preview release	Latest GA release	Specifications	Description
Control plane	`2024-06-01-preview`	`2024-10-01`	Spec files	Azure OpenAI shares a common control plane with all other Azure AI Services. The control plane API is used for things like creating Azure OpenAI resources, model deployment, and other higher level resource management tasks. The control plane also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and Azure CLI.
Data plane - authoring	`2024-10-01-preview`	`2024-10-21`	Spec files	The data plane authoring API controls fine-tuning, file upload, ingestion jobs, batch, and certain model-level queries.
Data plane - inference	`2024-10-01-preview`	`2024-10-21`	Spec files	The data plane inference API provides the inference capabilities/endpoints for features like completions, chat completions, embeddings, speech/Whisper, On Your Data, DALL-E, assistants, etc.

Authentication

Azure OpenAI provides two methods for authentication. You can use either API Keys or Microsoft Entra ID.

API Key authentication: For this type of authentication, all API requests must include the API Key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.
Microsoft Entra ID authentication: You can authenticate an API call using a Microsoft Entra token. Authentication tokens are included in a request as the Authorization header. The token provided must be preceded by Bearer, for example Bearer YOUR_AUTH_TOKEN. You can read our how-to guide on authenticating with Microsoft Entra ID.
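
The two authentication methods differ only in which HTTP header carries the credential. The following Python sketch illustrates that difference; the function name `build_auth_headers` and its parameters are hypothetical and not part of any Azure SDK.

```python
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None,
                       entra_token: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for exactly one of the two authentication methods."""
    if (api_key is None) == (entra_token is None):
        raise ValueError("Provide exactly one of api_key or entra_token")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # API key authentication: the key goes in the api-key header.
        headers["api-key"] = api_key
    else:
        # Microsoft Entra ID authentication: the token is preceded by "Bearer".
        headers["Authorization"] = f"Bearer {entra_token}"
    return headers
```

For example, `build_auth_headers(entra_token="YOUR_AUTH_TOKEN")` produces an `Authorization` header with the value `Bearer YOUR_AUTH_TOKEN`.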

REST API versioning

The service APIs are versioned using the api-version query parameter. All versions follow the YYYY-MM-DD date structure; preview versions add a -preview suffix, as in 2024-10-01-preview. For example:

POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
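
As a rough illustration, the request URL can be assembled from the resource name, deployment name, operation path, and API version. The helper `build_url` below is a hypothetical name, and the format check is an assumption based on the version strings shown in this article (a date, optionally followed by `-preview`).

```python
import re
from urllib.parse import urlencode

# Assumed pattern: YYYY-MM-DD with an optional "-preview" suffix.
_API_VERSION = re.compile(r"^\d{4}-\d{2}-\d{2}(-preview)?$")


def build_url(resource_name: str, deployment_name: str,
              operation: str, api_version: str) -> str:
    """Build a data plane inference URL with the api-version query parameter."""
    if not _API_VERSION.match(api_version):
        raise ValueError(f"Unexpected api-version format: {api_version!r}")
    base = (f"https://{resource_name}.openai.azure.com"
            f"/openai/deployments/{deployment_name}/{operation}")
    return f"{base}?{urlencode({'api-version': api_version})}"
```

Calling `build_url("YOUR_RESOURCE_NAME", "YOUR_DEPLOYMENT_NAME", "chat/completions", "2024-06-01")` reproduces the URL shown above.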

Data plane inference

The rest of the article covers the latest preview release of the Azure OpenAI data plane inference specification, 2024-10-01-preview. This article includes documentation for the latest preview capabilities like assistants, threads, and vector stores.

If you're looking for documentation on the latest GA API release, refer to the latest GA data plane inference API.

Completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-10-01-preview

Creates a completion for the provided prompt, parameters and chosen model.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`; replace "aoairesource" with your Azure OpenAI resource name). Format: https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment id of the model which was deployed.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
prompt	string or array	The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.	Yes
best_of	integer	Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed. When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return; `best_of` must be greater than `n`. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
echo	boolean	Echo back the prompt in addition to the completion	No	False
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated.	No	None
logprobs	integer	Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the five most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response. The maximum value for `logprobs` is 5.	No	None
max_tokens	integer	The maximum number of tokens that can be generated in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length.	No	16
n	integer	How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
seed	integer	If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to four sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.	No
stream	boolean	Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
suffix	string	The suffix that comes after a completion of inserted text. This parameter is only supported for `gpt-3.5-turbo-instruct`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
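
Several of the constraints in the table above can be checked on the client before a request is sent. The following Python sketch is one way to do that; `build_completion_body` is a hypothetical helper, and it covers only a subset of the parameters. The table says `best_of` must be greater than `n`, yet both default to 1, so this sketch assumes equal values are accepted and rejects only `best_of < n`.

```python
from typing import Any, Dict, List, Optional, Union

Prompt = Union[str, List[str], List[int], List[List[int]]]


def build_completion_body(
    prompt: Prompt,
    max_tokens: int = 16,
    n: int = 1,
    best_of: Optional[int] = None,
    logprobs: Optional[int] = None,
    stop: Union[None, str, List[str]] = None,
    temperature: Optional[float] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Build a completions request body, enforcing limits from the table."""
    body: Dict[str, Any] = {"prompt": prompt, "max_tokens": max_tokens, "n": n}
    if best_of is not None:
        if stream:
            raise ValueError("best_of results can't be streamed")
        if best_of < n:  # assumption: equal values are allowed
            raise ValueError("best_of must not be smaller than n")
        body["best_of"] = best_of
    if logprobs is not None:
        if not 0 <= logprobs <= 5:
            raise ValueError("logprobs must be between 0 and 5")
        body["logprobs"] = logprobs
    if stop is not None:
        if isinstance(stop, list) and len(stop) > 4:
            raise ValueError("stop accepts up to four sequences")
        body["stop"] = stop
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        body["temperature"] = temperature
    if stream:
        body["stream"] = True
    return body
```

Failing early on the client avoids spending quota on a request that the service would reject or that would generate more candidates than intended.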

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	createCompletionResponse	Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2024-10-01-preview

{
 "prompt": [
  "tell me a joke about mango"
 ],
 "max_tokens": 32,
 "temperature": 1.0,
 "n": 1
}

Responses: Status Code: 200

{
  "body": {
    "id": "cmpl-7QmVI15qgYVllxK0FtxVGG6ywfzaq",
    "created": 1686617332,
    "choices": [
      {
        "text": "es\n\nWhat do you call a mango who's in charge?\n\nThe head mango.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": null
      }
    ],
    "usage": {
      "completion_tokens": 20,
      "prompt_tokens": 6,
      "total_tokens": 26
    }
  }
}
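
A response like the one above can be reduced to its generated text with a few lines of Python. This is a minimal sketch: `completion_texts` is a hypothetical helper, and it accepts the payload either wrapped in a `body` key, as in this example, or unwrapped, since the wrapper may be an artifact of how the example is presented.

```python
from typing import Any, Dict, List


def completion_texts(response: Dict[str, Any]) -> List[str]:
    """Return the text of each choice, ordered by its index."""
    # Accept both the wrapped ({"body": {...}}) and the unwrapped payload.
    payload = response.get("body", response)
    choices = sorted(payload["choices"], key=lambda choice: choice["index"])
    return [choice["text"] for choice in choices]
```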

Embeddings - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-10-01-preview

Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`; replace "aoairesource" with your Azure OpenAI resource name). Format: https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment id of the model which was deployed.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
input	string or array	Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8,192 tokens for `text-embedding-ada-002`), can't be an empty string, and any array must be 2,048 dimensions or less.	Yes
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
input_type	string	The input type of the embedding search to use.	No
encoding_format	string	The format to return the embeddings in. Can be either `float` or `base64`. Defaults to `float`.	No
dimensions	integer	The number of dimensions the resulting output embeddings should have. Only supported in `text-embedding-3` and later models.	No
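
The input limits in the table above lend themselves to a client-side check. The sketch below uses a hypothetical helper, `build_embeddings_body`, and reads "any array must be 2,048 dimensions or less" as a limit on array length, which is an interpretation rather than something this article spells out. Token counts are not checked because that requires a tokenizer.

```python
from typing import Any, Dict, List, Optional, Union

EmbeddingInput = Union[str, List[str], List[int], List[List[int]]]


def build_embeddings_body(
    input_data: EmbeddingInput,
    encoding_format: Optional[str] = None,
    dimensions: Optional[int] = None,
    user: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an embeddings request body with basic input validation."""
    items = input_data if isinstance(input_data, list) else [input_data]
    if len(items) > 2048:
        raise ValueError("input array must have 2,048 entries or fewer")
    for item in items:
        if item == "":
            raise ValueError("input can't be an empty string")
        if isinstance(item, list) and len(item) > 2048:
            raise ValueError("token array must have 2,048 entries or fewer")
    body: Dict[str, Any] = {"input": input_data}
    if encoding_format is not None:
        if encoding_format not in ("float", "base64"):
            raise ValueError("encoding_format must be 'float' or 'base64'")
        body["encoding_format"] = encoding_format
    if dimensions is not None:
        body["dimensions"] = dimensions
    if user is not None:
        body["user"] = user
    return body
```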

Responses

Name	Type	Required
object	string	Yes
model	string	Yes
data	array	Yes
usage	object	Yes

Properties for usage

prompt_tokens

Name	Type	Description	Default
prompt_tokens	integer

total_tokens

Name	Type	Description	Default
total_tokens	integer

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	object

Examples

Example

Return the embeddings for a given prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2024-10-01-preview

{
 "input": [
  "this is a test"
 ]
}

Responses: Status Code: 200

{
  "body": {
    "data": [
      {
        "index": 0,
        "embedding": [
          -0.012838088,
          -0.007421397,
          -0.017617522,
          -0.028278312,
          -0.018666342,
          0.01737855,
          -0.01821495,
          -0.006950092,
          -0.009937238,
          -0.038580645,
          0.010674067,
          0.02412286,
          -0.013647936,
          0.013189907,
          0.0021125758,
          0.012406612,
          0.020790534,
          0.00074595667,
          0.008397198,
          -0.00535031,
          0.008968075,
          0.014351576,
          -0.014086051,
          0.015055214,
          -0.022211088,
          -0.025198232,
          0.0065186154,
          -0.036350243,
          0.009180495,
          -0.009698266,
          0.009446018,
          -0.008463579,
          -0.0040426035,
          -0.03443847,
          -0.00091273896,
          -0.0019217303,
          0.002349888,
          -0.021560553,
          0.016515596,
          -0.015572986,
          0.0038666942,
          -8.432463e-05,
          0.0032178196
        ]
      }
    ],
    "usage": {
      "prompt_tokens": 4,
      "total_tokens": 4
    }
  }
}
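
Once embeddings are returned, a common way to compare two inputs is cosine similarity. This article doesn't prescribe a similarity measure, so the Python sketch below is only one option; the helper names `embedding_vectors` and `cosine_similarity` are hypothetical, and the `body` wrapper is handled as optional.

```python
import math
from typing import Any, Dict, List, Sequence


def embedding_vectors(response: Dict[str, Any]) -> List[List[float]]:
    """Return the embedding vectors from a response, ordered by index."""
    payload = response.get("body", response)
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

Sorting by `index` keeps each vector aligned with the position of its input when several inputs are embedded in one request.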

Chat completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

Creates a completion for the chat message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`; replace "aoairesource" with your Azure OpenAI resource name). Format: https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment id of the model which was deployed.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
stop	string or array	Up to four sequences where the API will stop generating further tokens.	No
max_tokens	integer	The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	No
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models; support will expand to other models in a future API release.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No	None
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
messages	array	A list of messages comprising the conversation so far.	Yes
data_sources	array	The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI.	No
logprobs	boolean	Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.	No	False
top_logprobs	integer	An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.	No
n	integer	How many chat completion choices to generate for each input message. Note that you'll be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.	No	1
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema	An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which guarantees the model will match your supplied JSON schema. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
seed	integer	This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
tools	array	A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.	No
tool_choice	chatCompletionToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.	No
function_call	string or chatCompletionFunctionCallOption	Deprecated in favor of `tool_choice`. Controls which (if any) function is called by the model. `none` means the model won't call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present.	No
functions	array	Deprecated in favor of `tools`. A list of functions the model may generate JSON inputs for.	No
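
Some parameters in the table above depend on one another: `top_logprobs` requires `logprobs` to be `true`, `stream_options` applies only when `stream` is `true`, and `tools` is capped at 128 functions. The Python sketch below checks those relationships before a request is sent; `build_chat_body` is a hypothetical helper that covers only a few of the parameters.

```python
from typing import Any, Dict, List, Optional


def build_chat_body(
    messages: List[Dict[str, Any]],
    *,
    stream: bool = False,
    stream_options: Optional[Dict[str, Any]] = None,
    logprobs: bool = False,
    top_logprobs: Optional[int] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a chat completions body, checking interdependent parameters."""
    if not messages:
        raise ValueError("messages is required")
    body: Dict[str, Any] = {"messages": messages}
    if top_logprobs is not None:
        if not logprobs:
            raise ValueError("top_logprobs requires logprobs to be true")
        if not 0 <= top_logprobs <= 20:
            raise ValueError("top_logprobs must be between 0 and 20")
        body["top_logprobs"] = top_logprobs
    if logprobs:
        body["logprobs"] = True
    if stream_options is not None and not stream:
        raise ValueError("stream_options is only valid when stream is true")
    if stream:
        body["stream"] = True
        if stream_options is not None:
            body["stream_options"] = stream_options
    if tools is not None:
        if len(tools) > 128:
            raise ValueError("a maximum of 128 functions is supported")
        body["tools"] = tools
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body
```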

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	createChatCompletionResponse or createChatCompletionStreamResponse
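
When `stream` is set, the response arrives as data-only server-sent events terminated by `data: [DONE]`. The Python sketch below shows one way to read such a stream from an iterable of text lines. The helper names are hypothetical, and the `choices[].delta.content` path used to collect text is an assumption about the shape of `createChatCompletionStreamResponse`, which this section doesn't list in detail.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield each JSON event from data-only SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # the stream is terminated by this sentinel
        yield json.loads(data)


def collect_content(lines: Iterable[str]) -> str:
    """Concatenate the partial message deltas into the full message text."""
    parts = []
    for event in iter_sse_events(lines):
        for choice in event.get("choices", []):
            # Assumed field path for the partial message delta.
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)
```

In practice the lines would come from an HTTP client that exposes the response body line by line; any iterable of strings works with this sketch.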

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "system",
   "content": "you are a helpful assistant that talks like a pirate"
  },
  {
   "role": "user",
   "content": "can you tell me how to care for a parrot?"
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Ahoy matey! So ye be wantin' to care for a fine squawkin' parrot, eh? Well, shiver me timbers, let ol' Cap'n Assistant share some wisdom with ye! Here be the steps to keepin' yer parrot happy 'n healthy:\n\n1. Secure a sturdy cage: Yer parrot be needin' a comfortable place to lay anchor! Be sure ye get a sturdy cage, at least double the size of the bird's wingspan, with enough space to spread their wings, yarrrr!\n\n2. Perches 'n toys: Aye, parrots need perches of different sizes, shapes, 'n textures to keep their feet healthy. Also, a few toys be helpin' to keep them entertained 'n their minds stimulated, arrrh!\n\n3. Proper grub: Feed yer feathered friend a balanced diet of high-quality pellets, fruits, 'n veggies to keep 'em strong 'n healthy. Give 'em fresh water every day, or ye'll have a scurvy bird on yer hands!\n\n4. Cleanliness: Swab their cage deck! Clean their cage on a regular basis: fresh water 'n food daily, the floor every couple of days, 'n a thorough scrubbing ev'ry few weeks, so the bird be livin' in a tidy haven, arrhh!\n\n5. Socialize 'n train: Parrots be a sociable lot, arrr! Exercise 'n interact with 'em daily to create a bond 'n maintain their mental 'n physical health. Train 'em with positive reinforcement, treat 'em kindly, yarrr!\n\n6. Proper rest: Yer parrot be needin' 'bout 10-12 hours o' sleep each night. Cover their cage 'n let them slumber in a dim, quiet quarter for a proper night's rest, ye scallywag!\n\n7. Keep a weather eye open for illness: Birds be hidin' their ailments, arrr! Be watchful for signs of sickness, such as lethargy, loss of appetite, puffin' up, or change in droppings, and make haste to a vet if need be.\n\n8. Provide fresh air 'n avoid toxins: Parrots be sensitive to draft and pollutants. Keep yer quarters well ventilated, but no drafts, arrr! Be mindful of toxins like Teflon fumes, candles, or air fresheners.\n\nSo there ye have it, me hearty! With proper care 'n commitment, yer parrot will be squawkin' \"Yo-ho-ho\" for many years to come! Good luck, sailor, and may the wind be at yer back!"
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search data and system-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "authentication": {
     "type": "system_assigned_managed_identity"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
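
In the response above, the message content carries a `[doc1]` marker and the message `context` carries a `citations` array. The Python sketch below resolves such markers to citation entries. It assumes that `[docN]` refers to the N-th entry of `citations` (1-based), which the example suggests but this section doesn't state; `cited_sources` is a hypothetical helper.

```python
import re
from typing import Any, Dict, List

_DOC_MARKER = re.compile(r"\[doc(\d+)\]")


def cited_sources(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the citation entries referenced by [docN] markers in the content."""
    payload = response.get("body", response)
    results: List[Dict[str, Any]] = []
    for choice in payload.get("choices", []):
        message = choice.get("message", {})
        citations = message.get("context", {}).get("citations", [])
        for match in _DOC_MARKER.finditer(message.get("content") or ""):
            # Assumption: [doc1] maps to citations[0], [doc2] to citations[1], ...
            position = int(match.group(1)) - 1
            if 0 <= position < len(citations):
                results.append(citations[position])
    return results
```

Markers that point past the end of the `citations` array are skipped rather than raising an error, so a response without a `context` simply yields an empty list.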

Example

Creates a completion based on Azure Search image vector data.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "query_type": "vector",
    "fields_mapping": {
     "image_vector_fields": [
      "image_vector"
     ]
    },
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion."
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a cat?"
  },
  {
   "role": "assistant",
   "content": "Content of the completion [doc1].",
   "context": {
    "intent": "cat care"
   }
  },
  {
   "role": "user",
   "content": "how about dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "authentication": {
     "type": "user_assigned_managed_identity",
     "managed_identity_resource_id": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{resource-name}"
    },
    "index_name": "{index name}",
    "query_type": "vector",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "in_scope": true,
    "top_n_documents": 5,
    "strictness": 3,
    "role_information": "You are an AI assistant that helps people find information.",
    "fields_mapping": {
     "content_fields_separator": "\\n",
     "content_fields": [
      "content"
     ],
     "filepath_field": "filepath",
     "title_field": "title",
     "url_field": "url",
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content 2.",
                "title": "Citation Title 2",
                "filepath": "contoso2.txt",
                "url": "https://contoso.blob.windows.net/container/contoso2.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
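
The Azure Search examples so far use three authentication shapes inside `data_sources`: `system_assigned_managed_identity`, `api_key`, and `user_assigned_managed_identity`. The Python sketch below builds a `data_sources` entry for each case; `azure_search_source` is a hypothetical helper, and it sets only the fields that appear in those examples.

```python
from typing import Any, Dict, Optional


def azure_search_source(
    endpoint: str,
    index_name: str,
    *,
    api_key: Optional[str] = None,
    managed_identity_resource_id: Optional[str] = None,
    query_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one azure_search entry for the data_sources array."""
    if api_key is not None and managed_identity_resource_id is not None:
        raise ValueError("Choose either api_key or a user-assigned identity")
    if api_key is not None:
        authentication: Dict[str, Any] = {"type": "api_key", "key": api_key}
    elif managed_identity_resource_id is not None:
        authentication = {
            "type": "user_assigned_managed_identity",
            "managed_identity_resource_id": managed_identity_resource_id,
        }
    else:
        # With no credential supplied, fall back to the system-assigned identity.
        authentication = {"type": "system_assigned_managed_identity"}
    parameters: Dict[str, Any] = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": authentication,
    }
    if query_type is not None:
        parameters["query_type"] = query_type
    return {"type": "azure_search", "parameters": parameters}
```

The returned dictionary is meant to be placed in the `data_sources` array of a chat completions request body alongside `messages`.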

Example

Creates a completion for the provided Azure Cosmos DB.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_cosmos_db",
   "parameters": {
    "authentication": {
     "type": "connection_string",
     "connection_string": "mongodb+srv://rawantest:{password}$@{cluster-name}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    },
    "database_name": "vectordb",
    "container_name": "azuredocs",
    "index_name": "azuredocindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Mongo DB.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "mongo_db",
   "parameters": {
    "authentication": {
     "type": "username_and_password",
     "username": "<username>",
     "password": "<password>"
    },
    "endpoint": "<endpoint_name>",
    "app_name": "<application name>",
    "database_name": "sampledb",
    "collection_name": "samplecollection",
    "index_name": "sampleindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Elasticsearch.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "elasticsearch",
   "parameters": {
    "endpoint": "https://your-elasticsearch-endpoint.eastus.azurecontainer.io",
    "index_name": "{index name}",
    "authentication": {
     "type": "key_and_key_id",
     "key": "{key}",
     "key_id": "{key id}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Pinecone resource.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2024-10-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "pinecone",
   "parameters": {
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    },
    "environment": "{environment name}",
    "index_name": "{index name}",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "title_field": "title",
     "url_field": "url",
     "filepath_field": "filepath",
     "content_fields": [
      "content"
     ],
     "content_fields_separator": "\n"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
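
The data source examples above all return the same response shape: the assistant message carries a `context` object with `citations` and `intent`. The following Python sketch shows one way to read those fields. The function name `extract_citations` is hypothetical, and the sketch assumes the payload looks like the samples above; the `body` wrapper shown in the samples is unwrapped when present.

```python
def extract_citations(payload):
    """Return (answer_text, citations) for the first choice of a chat completion."""
    # The documentation samples wrap the payload in "body"; accept both shapes.
    data = payload.get("body", payload)
    message = data["choices"][0]["message"]
    # "context" appears in the data-source samples, so treat it as optional here.
    citations = message.get("context", {}).get("citations", [])
    return message["content"], citations


if __name__ == "__main__":
    sample = {
        "body": {
            "choices": [
                {
                    "index": 0,
                    "finish_reason": "stop",
                    "message": {
                        "role": "assistant",
                        "content": "Content of the completion [doc1].",
                        "context": {
                            "citations": [
                                {"title": "Citation Title", "filepath": "contoso.txt", "chunk_id": "0"}
                            ],
                            "intent": "dog care",
                        },
                    },
                }
            ]
        }
    }
    text, citations = extract_citations(sample)
    print(text)
    for citation in citations:
        print(citation["title"], citation["filepath"])
```

Treating `context` as optional keeps the same helper usable for chat completions that were sent without `data_sources`.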

Transcriptions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-10-01-preview

Transcribes audio into the input language.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment ID of the Whisper model.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
file	string	The audio file object to transcribe.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0
language	string	The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.	No
timestamp_granularities[]	array	The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`. Note: There's no additional latency for segment timestamps, but generating word timestamps incurs additional latency.	No	['segment']

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	audioResponse or audioVerboseResponse
text/plain	string	Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2024-10-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}
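
The request body for this operation is `multipart/form-data`, which is easy to get wrong when built by hand. The following Python sketch assembles the URL, headers, and body for a transcription request without sending it. The function name `build_transcription_request` and its parameters are hypothetical; the sketch uses CRLF line endings and a random boundary, which is the usual multipart convention rather than something this reference specifies.

```python
import uuid

API_VERSION = "2024-10-01-preview"


def build_transcription_request(endpoint, deployment_id, api_key, audio_bytes,
                                filename="file.wav", response_format=None,
                                language=None, timestamp_granularities=None):
    """Return (url, headers, body) for a Transcriptions - Create call."""
    # The parameter table requires verbose_json whenever granularities are requested.
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities[] requires response_format=verbose_json")

    boundary = uuid.uuid4().hex
    parts = []

    def add_field(name, value):
        # One plain-text form field per call.
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )

    if response_format is not None:
        add_field("response_format", response_format)
    if language is not None:
        add_field("language", language)
    for granularity in timestamp_granularities or []:
        add_field("timestamp_granularities[]", granularity)

    # The audio itself goes in the required "file" part.
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + audio_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment_id}"
           f"/audio/transcriptions?api-version={API_VERSION}")
    headers = {
        "api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return url, headers, b"".join(parts)
```

The returned tuple can be handed to any HTTP client; keeping construction separate from sending makes the request easy to inspect in tests.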

Translations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-10-01-preview

Transcribes and translates input audio into English text.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment ID of the deployed Whisper model.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
file	string	The audio file to translate.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	audioResponse or audioVerboseResponse
text/plain	string	Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-10-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2024-10-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}
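
Both audio operations can answer with either `application/json` or `text/plain`, depending on `response_format`. The following Python sketch picks the text out of either form. The function name `parse_audio_response` is hypothetical, and the sketch assumes the JSON payload has a top-level `text` field, as the samples above suggest.

```python
import json


def parse_audio_response(content_type, raw):
    """Return the transcribed or translated text from either documented content type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    text = raw.decode("utf-8")
    if media_type == "application/json":
        # json and verbose_json responses: read the "text" field.
        return json.loads(text)["text"]
    if media_type == "text/plain":
        # text, vtt, and srt responses are returned as-is.
        return text
    raise ValueError(f"unexpected content type: {content_type}")


if __name__ == "__main__":
    print(parse_audio_response("application/json", b'{"text": "Hello there."}'))
    print(parse_audio_response("text/plain", b"Hello there."))
```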

Speech - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-10-01-preview

Generates audio from the input text.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment ID of the deployed TTS model.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
input	string	The text to synthesize audio for. The maximum length is 4,096 characters.	Yes
voice	enum	The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer	Yes
response_format	enum	The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm	No
speed	number	The speed of the synthesized audio. Select a value from `0.25` to `4.0`. `1.0` is the default.	No	1.0

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/octet-stream	string

Examples

Example

Synthesizes audio from the provided text.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2024-10-01-preview

{
 "input": "Hi! What are you going to make?",
 "voice": "fable",
 "response_format": "mp3"
}

Responses: Status Code: 200

{
  "body": "101010101"
}
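
The request body table above lists several constraints (a 4,096-character input limit, fixed voice and format values, and a speed range). The following Python sketch checks them on the client before a request is sent. The function name `build_speech_payload` is hypothetical; the limits and enum values are copied from the table.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_payload(text, voice, response_format=None, speed=1.0):
    """Validate the documented constraints and return the request fields as a dict."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input is limited to {MAX_INPUT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"voice must be one of {sorted(VOICES)}")
    if response_format is not None and response_format not in FORMATS:
        raise ValueError(f"response_format must be one of {sorted(FORMATS)}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    payload = {"input": text, "voice": voice, "speed": speed}
    # response_format is optional, so only send it when the caller chose one.
    if response_format is not None:
        payload["response_format"] = response_format
    return payload


if __name__ == "__main__":
    print(build_speech_payload("Hi! What are you going to make?", "fable", "mp3"))
```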

Image generations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-10-01-preview

Generates a batch of images from a text caption on a given DALL-E model deployment.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	Deployment ID of the deployed DALL-E model.
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
prompt	string	A text description of the desired image(s). The maximum length is 4,000 characters.	Yes
n	integer	The number of images to generate.	No	1
size	imageSize	The size of the generated images.	No	1024x1024
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
quality	imageQuality	The quality of the image that will be generated.	No	standard
style	imageStyle	The style of the generated images.	No	vivid

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	generateImagesResponse

Status Code: default

Description: An error occurred.

Content-Type	Type	Description
application/json	dalleErrorResponse

Examples

Example

Creates images given a prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2024-10-01-preview

{
 "prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
 "n": 1,
 "style": "natural",
 "quality": "standard"
}

Responses: Status Code: 200

{
  "body": {
    "created": 1698342300,
    "data": [
      {
        "revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
        "prompt_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          },
          "profanity": {
            "detected": false,
            "filtered": false
          },
          "custom_blocklists": {
            "filtered": false,
            "details": []
          }
        },
        "url": "https://dalletipusw2.blob.core.windows.net/private/images/e5451cc6-b1ad-4747-bd46-b89a3a3b8bc3/generated_00.png?se=2023-10-27T17%3A45%3A09Z&...",
        "content_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          }
        }
      }
    ]
  }
}
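
Each generated image in the sample response carries `prompt_filter_results` and `content_filter_results`, with a `filtered` flag per category. The following Python sketch summarizes which categories were flagged for one entry of `data`. The function names are hypothetical, and the sketch assumes each category is an object with a boolean `filtered` field, as in the sample above.

```python
def flagged_categories(filter_results):
    """Return the sorted category names whose `filtered` flag is true."""
    return sorted(
        name
        for name, result in filter_results.items()
        if isinstance(result, dict) and result.get("filtered")
    )


def summarize_image(item):
    """Collect the URL and any flagged categories for one entry of `data`."""
    return {
        "url": item.get("url"),
        "prompt_flags": flagged_categories(item.get("prompt_filter_results", {})),
        "content_flags": flagged_categories(item.get("content_filter_results", {})),
    }


if __name__ == "__main__":
    item = {
        "url": "https://example.invalid/generated_00.png",  # placeholder URL
        "prompt_filter_results": {
            "sexual": {"severity": "safe", "filtered": False},
            "profanity": {"detected": False, "filtered": False},
            "custom_blocklists": {"filtered": False, "details": []},
        },
        "content_filter_results": {
            "violence": {"severity": "safe", "filtered": False},
        },
    }
    print(summarize_image(item))
```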

List - Assistants

GET https://{endpoint}/openai/assistants?api-version=2024-10-01-preview

Returns a list of assistants.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listAssistantsResponse

Examples

Example

Returns a list of assistants.

GET https://{endpoint}/openai/assistants?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "asst_abc123",
        "object": "assistant",
        "created_at": 1707257477,
        "name": "Stock Analyst",
        "description": null,
        "model": "gpt-4-1106-preview",
        "instructions": "You are a financial analyst that analyzes stock market prices and other financial data present on user uploaded files or by calling external APIs.",
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc456",
        "object": "assistant",
        "created_at": 1698982718,
        "name": "My Assistant",
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": "You are a helpful assistant designed to make me better at coding!",
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc789",
        "object": "assistant",
        "created_at": 1698982643,
        "name": null,
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      }
    ],
    "first_id": "asst_abc123",
    "last_id": "asst_abc789",
    "has_more": false
  }
}
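
The list response includes `first_id`, `last_id`, and `has_more`, and the operation accepts `limit`, `order`, `after`, and `before` query parameters. The following Python sketch walks through all pages. It assumes `after` takes the `last_id` of the previous page (cursor-style paging), which the parameter table above does not spell out. The names `assistants_url`, `iter_assistants`, and the injected `fetch_page` callable are hypothetical; `fetch_page` stands in for whatever HTTP client performs the GET and returns the parsed JSON.

```python
from urllib.parse import urlencode

API_VERSION = "2024-10-01-preview"


def assistants_url(endpoint, limit=None, order=None, after=None, before=None):
    """Build the List - Assistants URL, skipping query parameters that are not set."""
    query = {"api-version": API_VERSION}
    for name, value in (("limit", limit), ("order", order), ("after", after), ("before", before)):
        if value is not None:
            query[name] = value
    return f"{endpoint.rstrip('/')}/openai/assistants?{urlencode(query)}"


def iter_assistants(fetch_page, endpoint, limit=20):
    """Yield every assistant, following `has_more` and `last_id` across pages."""
    after = None
    while True:
        page = fetch_page(assistants_url(endpoint, limit=limit, after=after))
        yield from page["data"]
        if not page.get("has_more"):
            break
        # Assumption: the next page starts after the last ID of this one.
        after = page["last_id"]
```

Passing the fetch function in as an argument keeps the paging logic independent of any particular HTTP library and makes it testable with canned pages.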

Create - Assistant

POST https://{endpoint}/openai/assistants?api-version=2024-10-01-preview

Create an assistant with a model and instructions.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
model			Yes
name	string	The name of the assistant. The maximum length is 256 characters.	No
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `retrieval`, or `function`.	No	[]
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.
vector_stores	array	A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of one vector store attached to the assistant.

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject	Represents an `assistant` that can call the model and use tools.

Examples

Example

Create an assistant with a model and instructions.

POST https://{endpoint}/openai/assistants?api-version=2024-10-01-preview

{
 "name": "Math Tutor",
 "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
 "tools": [
  {
   "type": "code_interpreter"
  }
 ],
 "model": "gpt-4-1106-preview"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant",
    "created_at": 1707295707,
    "name": "Math Tutor",
    "description": null,
    "model": "gpt-4-1106-preview",
    "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
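
The request body table for this operation sets length limits on `name`, `description`, and `instructions`, caps `tools` at 128 entries, and bounds `temperature` between 0 and 2. The following Python sketch applies those checks while assembling the JSON body. The function name `build_assistant_payload` is hypothetical; the limits are the ones listed for Create - Assistant.

```python
MAX_LENGTHS = {"name": 256, "description": 512, "instructions": 256000}
MAX_TOOLS = 128


def build_assistant_payload(model, name=None, description=None,
                            instructions=None, tools=None, temperature=None):
    """Validate the documented limits and return the Create - Assistant body."""
    values = {"name": name, "description": description, "instructions": instructions}
    for field, limit in MAX_LENGTHS.items():
        value = values[field]
        if value is not None and len(value) > limit:
            raise ValueError(f"{field} is limited to {limit} characters")

    tools = list(tools or [])
    if len(tools) > MAX_TOOLS:
        raise ValueError(f"an assistant can have at most {MAX_TOOLS} tools")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    # model is the only required field; optional fields are sent only when set.
    payload = {"model": model, "tools": tools}
    payload.update({k: v for k, v in values.items() if v is not None})
    if temperature is not None:
        payload["temperature"] = temperature
    return payload


if __name__ == "__main__":
    print(build_assistant_payload(
        "gpt-4-1106-preview",
        name="Math Tutor",
        instructions="When a customer asks about a specific math problem, use Python to evaluate their query.",
        tools=[{"type": "code_interpreter"}],
    ))
```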

Get - Assistant

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

Retrieves an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject	Represents an `assistant` that can call the model and use tools.

Examples

Example

Retrieves an assistant.

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_abc123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}

Modify - Assistant

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

Modifies an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
model			No
name	string	The name of the assistant. The maximum length is 256 characters.	No
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 32,768 characters.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `retrieval`, or `function`.	No	[]
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	Overrides the list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	Overrides the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject	Represents an `assistant` that can call the model and use tools.

Examples

Example

Modifies an assistant.

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

{
 "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always response with info from either of the files.",
 "tools": [
  {
   "type": "file_search"
  }
 ],
 "model": "gpt-4-turbo"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always response with info from either of the files.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": []
      }
    },
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
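
The `tool_resources` properties above limit `code_interpreter` to 20 file IDs and `file_search` to one vector store. The following Python sketch builds a `tool_resources` object under those limits. The function name `build_tool_resources` is hypothetical, as are the IDs in the usage example.

```python
MAX_CODE_INTERPRETER_FILES = 20
MAX_VECTOR_STORES = 1


def build_tool_resources(file_ids=None, vector_store_ids=None):
    """Return a tool_resources object that respects the documented maximums."""
    resources = {}
    if file_ids is not None:
        if len(file_ids) > MAX_CODE_INTERPRETER_FILES:
            raise ValueError(f"code_interpreter accepts at most {MAX_CODE_INTERPRETER_FILES} file IDs")
        resources["code_interpreter"] = {"file_ids": list(file_ids)}
    if vector_store_ids is not None:
        if len(vector_store_ids) > MAX_VECTOR_STORES:
            raise ValueError("only one vector store can be attached")
        resources["file_search"] = {"vector_store_ids": list(vector_store_ids)}
    return resources


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    print(build_tool_resources(file_ids=["file-abc"], vector_store_ids=["vs_abc"]))
```

Because the Modify operation describes these fields as overrides, passing an empty list is kept distinct from passing nothing: an empty list is sent, while `None` leaves the key out of the body.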

Delete - Assistant

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

Delete an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteAssistantResponse

Examples

Example

Deletes an assistant.

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant.deleted",
    "deleted": true
  }
}

Create - Thread

POST https://{endpoint}/openai/threads?api-version=2024-10-01-preview

Create a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
messages	array	A list of messages to start the thread with.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this thread. There can be a maximum of one vector store attached to the thread.
vector_stores	array	A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of one vector store attached to the thread.

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject	Represents a thread that contains messages.

Examples

Example

Creates a thread.

POST https://{endpoint}/openai/threads?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {}
  }
}

Get - Thread

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

Retrieves a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject	Represents a thread that contains messages.

Examples

Example

Retrieves a thread.

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {},
    "tool_resources": {
      "code_interpreter": {
        "file_ids": []
      }
    }
  }
}

Modify - Thread

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

Modifies a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The ID of the vector store attached to this thread. There can be a maximum of one vector store attached to the thread.

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject	Represents a thread that contains messages.

Examples

Example

Modifies a thread.

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {
      "modified": "true",
      "user": "abc123"
    },
    "tool_resources": {}
  }
}
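
The metadata limits described in the request body table can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `validate_metadata` is a hypothetical helper, and how the service itself reports violations is not covered here.

```python
def validate_metadata(metadata):
    """Return a list of problems found against the documented metadata limits."""
    problems = []
    # At most 16 key-value pairs per object.
    if len(metadata) > 16:
        problems.append(f"too many pairs: {len(metadata)} (maximum 16)")
    for key, value in metadata.items():
        # Keys can be at most 64 characters long.
        if len(key) > 64:
            problems.append(f"key longer than 64 characters: {key[:20]}...")
        # Values can be at most 512 characters long.
        if len(str(value)) > 512:
            problems.append(f"value longer than 512 characters for key: {key[:20]}")
    return problems


print(validate_metadata({"modified": "true", "user": "abc123"}))
```

An empty list means the metadata fits within the documented limits.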

Delete - Thread

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

Delete a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteThreadResponse

Examples

Example

Deletes a thread.

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread.deleted",
    "deleted": true
  }
}
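
A caller may want to confirm the deletion before discarding its own reference to the thread. The sketch below checks the three fields shown in the example response. The example shows the payload under a `body` key; whether a client receives that wrapper is an assumption here, so the hypothetical `confirm_deleted` helper accepts both shapes.

```python
def confirm_deleted(payload, thread_id):
    """Return True when the payload reports that thread_id was deleted."""
    # Accept the payload with or without the "body" wrapper used in the example.
    result = payload.get("body", payload)
    return (
        result.get("object") == "thread.deleted"
        and result.get("id") == thread_id
        and result.get("deleted") is True
    )


example = {
    "body": {
        "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
        "object": "thread.deleted",
        "deleted": True,
    }
}
print(confirm_deleted(example, "thread_v7V4csrNOxtNmgcwGg496Smx"))
```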

List - Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-10-01-preview

Returns a list of messages for a given thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
run_id	query	No	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listMessagesResponse

Examples

Example

List Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "msg_abc123",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "How does AI work? Explain it in simple terms.",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      },
      {
        "id": "msg_abc456",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "Hello, what is AI?",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      }
    ],
    "first_id": "msg_abc123",
    "last_id": "msg_abc456",
    "has_more": false
  }
}
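
The `limit` and `after` query parameters, together with the `has_more` and `last_id` fields in the response, suggest cursor-style pagination. The sketch below assumes that `after` takes the ID of the last item already seen, which this section does not state explicitly. `fetch_page` is a hypothetical callable that performs the GET with the given query parameters and returns the parsed list object.

```python
def iter_messages(fetch_page, limit=20):
    """Yield every message, following the `after` cursor until has_more is false."""
    after = None
    while True:
        params = {"limit": limit}
        if after is not None:
            params["after"] = after
        page = fetch_page(params)
        yield from page["data"]
        # Stop when the service reports no further pages (or returns an empty page).
        if not page.get("has_more") or not page["data"]:
            break
        after = page["last_id"]


# In-memory stand-in for the service, keyed by the `after` cursor.
pages = {
    None: {"data": [{"id": "msg_abc123"}], "last_id": "msg_abc123", "has_more": True},
    "msg_abc123": {"data": [{"id": "msg_abc456"}], "last_id": "msg_abc456", "has_more": False},
}
for message in iter_messages(lambda params: pages[params.get("after")], limit=1):
    print(message["id"])
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP client, so the same loop could be reused for other list endpoints that return the same fields.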

Create - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-10-01-preview

Create a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
role	string	The role of the entity that is creating the message. Allowed values include: - `user`: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. - `assistant`: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.	Yes
content	string	The content of the message.	Yes
attachments	array	A list of files attached to the message, and the tools they should be added to.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject	Represents a message within a thread.

Examples

Example

Create a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2024-10-01-preview

{
 "role": "user",
 "content": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces."
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "assistant_id": null,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "attachments": [],
    "metadata": {}
  }
}
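
The request body for this operation can be assembled and checked on the client before sending. The following sketch validates `role` against the two allowed values from the table and leaves optional fields out when they are not supplied. `build_message_body` is a hypothetical helper name.

```python
import json

ALLOWED_ROLES = {"user", "assistant"}


def build_message_body(role, content, attachments=None, metadata=None):
    """Return the JSON request body for creating a message."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {sorted(ALLOWED_ROLES)}, got {role!r}")
    body = {"role": role, "content": content}
    # Optional fields are only included when the caller provides them.
    if attachments:
        body["attachments"] = attachments
    if metadata:
        body["metadata"] = metadata
    return json.dumps(body)


print(build_message_body("user", "Hello, what is AI?"))
```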

Get - Message

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-10-01-preview

Retrieve a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
message_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject	Represents a message within a thread.

Examples

Example

Retrieve a message.

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "assistant_id": null,
    "run_id": null,
    "metadata": {}
  }
}
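
In a message object, `content` is a list of typed blocks rather than a plain string, so reading the text means walking that list. A minimal sketch, assuming only the `text` block shape shown in the example (other block types are skipped); `message_text` is a hypothetical helper name.

```python
def message_text(message):
    """Join the values of all text blocks in a message object."""
    parts = [
        block["text"]["value"]
        for block in message.get("content", [])
        if block.get("type") == "text"
    ]
    return "\n".join(parts)


message = {
    "id": "msg_abc456",
    "object": "thread.message",
    "role": "user",
    "content": [
        {"type": "text", "text": {"value": "Hello, what is AI?", "annotations": []}}
    ],
}
print(message_text(message))
```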

Modify - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-10-01-preview

Modifies a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
message_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject	Represents a message within a thread.

Examples

Example

Modify a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2024-10-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_abc123",
    "object": "thread.message",
    "created_at": 1699017614,
    "assistant_id": null,
    "thread_id": "thread_abc123",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "How does AI work? Explain it in simple terms.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "metadata": {
      "modified": "true",
      "user": "abc123"
    }
  }
}

Create - Thread And Run

POST https://{endpoint}/openai/threads/runs?api-version=2024-10-01-preview

Create a thread and run it in one request.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
thread	createThreadRequest		No
model	string	The deployment name of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of one vector store attached to the assistant.

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

Create a thread and run it in one request.

POST https://{endpoint}/openai/threads/runs?api-version=2024-10-01-preview

{
 "assistant_id": "asst_abc123",
 "thread": {
  "messages": [
   {
    "role": "user",
    "content": "Explain deep learning to a 5 year old."
   }
  ]
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076792,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": null,
    "expires_at": 1699077392,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "required_action": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You are a helpful assistant.",
    "tools": [],
    "tool_resources": {},
    "metadata": {},
    "temperature": 1.0,
    "top_p": 1.0,
    "max_completion_tokens": null,
    "max_prompt_tokens": null,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "incomplete_details": null,
    "usage": null,
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
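
Because most fields in this request are optional per-run overrides, one way to build the body is to start from the required `assistant_id` and add only the overrides that were actually set. The sketch below does that and checks `temperature` against the documented 0 to 2 range. `build_thread_and_run_body` is a hypothetical helper, not part of any SDK.

```python
def build_thread_and_run_body(assistant_id, user_messages, **overrides):
    """Return the request body for creating a thread and running it."""
    temperature = overrides.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    body = {
        "assistant_id": assistant_id,
        "thread": {
            "messages": [{"role": "user", "content": text} for text in user_messages]
        },
    }
    # Leave out unset overrides so the assistant's own settings apply.
    body.update({name: value for name, value in overrides.items() if value is not None})
    return body


body = build_thread_and_run_body(
    "asst_abc123",
    ["Explain deep learning to a 5 year old."],
    temperature=0.2,
    instructions=None,
)
print(sorted(body))
```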

List - Runs

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-10-01-preview

Returns a list of runs belonging to a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listRunsResponse

Examples

Example

Returns a list of runs belonging to a thread.

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "run_abc123",
        "object": "thread.run",
        "created_at": 1699075072,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699075072,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699075073,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      },
      {
        "id": "run_abc456",
        "object": "thread.run",
        "created_at": 1699063290,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699063290,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699063291,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      }
    ],
    "first_id": "run_abc123",
    "last_id": "run_abc456",
    "has_more": false
  }
}
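
Each run in the list carries a `usage` object, which can be `null` for runs that have not produced usage data yet. The following sketch adds up token counts across a list of runs and treats a missing or `null` usage as zero; `total_usage` is a hypothetical helper name.

```python
def total_usage(runs):
    """Sum prompt, completion, and total tokens across run objects."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for run in runs:
        usage = run.get("usage") or {}  # usage may be null
        for name in totals:
            totals[name] += usage.get(name, 0)
    return totals


runs = [
    {"id": "run_abc123", "usage": {"prompt_tokens": 123, "completion_tokens": 456, "total_tokens": 579}},
    {"id": "run_abc456", "usage": {"prompt_tokens": 123, "completion_tokens": 456, "total_tokens": 579}},
    {"id": "run_queued", "usage": None},
]
print(total_usage(runs))
```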

Create - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-10-01-preview

Create a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
model	string	The deployment name of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
additional_instructions	string	Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.	No
additional_messages	array	Adds additional messages to the thread before creating the run.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
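
When `stream` is `true`, the response arrives as server-sent events that end with a `data: [DONE]` message. The sketch below parses such a stream from an iterable of text lines. The event names and payloads in the sample are illustrative assumptions, not taken from this section, and `parse_sse` is a hypothetical helper.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from server-sent event lines; stop at data: [DONE]."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":
                return  # terminal marker: the run reached a terminal state
            data.append(payload)
        elif line == "" and data:
            # A blank line closes the current event.
            yield event, json.loads("\n".join(data))
            event, data = None, []


sample = [
    "event: thread.run.created",
    'data: {"id": "run_abc123", "status": "queued"}',
    "",
    "event: thread.run.completed",
    'data: {"id": "run_abc123", "status": "completed"}',
    "",
    "event: done",
    "data: [DONE]",
    "",
]
for name, payload in parse_sse(sample):
    print(name, payload["status"])
```

Taking an iterable of lines means the same function could be fed from a streaming HTTP response or, as here, from a fixed list.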

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

Create a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2024-10-01-preview

{
 "assistant_id": "asst_abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699063290,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": 1699063290,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699063291,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
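
A newly created run starts in the `queued` state, so a client that does not stream typically polls the run until it settles. The sketch below shows one way to do that. The set of terminal statuses is an assumption (this section only names `queued`, `in_progress`, `requires_action`, `completed`, and `incomplete` directly), and `get_run` is a hypothetical callable that retrieves the run and returns it as a dictionary.

```python
import time

# Assumed terminal statuses; adjust to match the run statuses your API version returns.
TERMINAL = {"completed", "failed", "cancelled", "expired", "incomplete"}


def wait_for_run(get_run, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll get_run() until the run settles or needs tool outputs."""
    for _ in range(max_polls):
        run = get_run()
        # requires_action is not terminal, but the caller must act before polling on.
        if run["status"] in TERMINAL or run["status"] == "requires_action":
            return run
        sleep(interval)
    raise TimeoutError("run did not settle within the polling budget")


# Simulated status sequence in place of real GET requests.
statuses = iter(["queued", "in_progress", "completed"])
final = wait_for_run(
    lambda: {"id": "run_abc123", "status": next(statuses)},
    sleep=lambda _: None,
)
print(final["status"])
```

The `sleep` parameter is injectable only so the example runs instantly; in real use the default `time.sleep` applies.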

Get - Run

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-10-01-preview

Retrieves a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

Gets a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_HsO8tYM4K5AAMAHgK0J3om8Q",
    "object": "thread.run",
    "created_at": 1707303196,
    "assistant_id": "asst_JtTwHk28cIocgFXZPCBxhOzl",
    "thread_id": "thread_eRNwflE3ncDYak1np6MdMHJh",
    "status": "completed",
    "started_at": 1707303197,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1707303201,
    "last_error": null,
    "model": "gpt-4-1106-preview",
    "instructions": "You are an AI model that empowers every person and every organization on the planet to achieve more.",
    "tools": [],
    "file_ids": [],
    "metadata": {}
  }
}
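
The `*_at` fields in a run object look like Unix timestamps in seconds; under that assumption, the elapsed time of a finished run is the difference between `completed_at` and `started_at`. A minimal sketch, with `run_timing` as a hypothetical helper:

```python
from datetime import datetime, timezone


def run_timing(run):
    """Return the start time (UTC, ISO 8601) and duration in seconds, or None."""
    started, completed = run.get("started_at"), run.get("completed_at")
    if started is None or completed is None:
        return None  # the run has not started or has not finished yet
    return {
        "started": datetime.fromtimestamp(started, tz=timezone.utc).isoformat(),
        "seconds": completed - started,
    }


run = {"status": "completed", "started_at": 1707303197, "completed_at": 1707303201}
print(run_timing(run))
```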

Modify - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-10-01-preview

Modifies a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

Modifies a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2024-10-01-preview

{
 "metadata": {
  "user_id": "user_abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699075072,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "completed",
    "started_at": 1699075072,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699075073,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "tool_resources": {
      "code_interpreter": {
        "file_ids": [
          "file-abc123",
          "file-abc456"
        ]
      }
    },
    "metadata": {
      "user_id": "user_abc123"
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Submit - Tool Outputs To Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-10-01-preview

When a run has the status `requires_action` and `required_action.type` is `submit_tool_outputs`, use this endpoint to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
tool_outputs	array	A list of tools for which the outputs are being submitted.	Yes
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2024-10-01-preview

{
 "tool_outputs": [
  {
   "tool_call_id": "call_001",
   "output": "70 degrees and sunny."
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_123",
    "object": "thread.run",
    "created_at": 1699075592,
    "assistant_id": "asst_123",
    "thread_id": "thread_123",
    "status": "queued",
    "started_at": 1699075592,
    "expires_at": 1699076192,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": [
                  "celsius",
                  "fahrenheit"
                ]
              }
            },
            "required": [
              "location"
            ]
          }
        }
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Cancel - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-10-01-preview

Cancels a run that is `in_progress`.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject	Represents an execution run on a thread.

Examples

Example

Cancels a run that is `in_progress`.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076126,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "cancelling",
    "started_at": 1699076126,
    "expires_at": 1699076726,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You summarize books.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": [
          "vs_123"
        ]
      }
    },
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "response_format": "auto"
  }
}

List - Run Steps

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-10-01-preview

Returns a list of run steps belonging to a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listRunStepsResponse

Examples

Example

Returns a list of run steps belonging to a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "step_abc123",
        "object": "thread.run.step",
        "created_at": 1699063291,
        "run_id": "run_abc123",
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "type": "message_creation",
        "status": "completed",
        "cancelled_at": null,
        "completed_at": 1699063291,
        "expired_at": null,
        "failed_at": null,
        "last_error": null,
        "step_details": {
          "type": "message_creation",
          "message_creation": {
            "message_id": "msg_abc123"
          }
        },
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        }
      }
    ],
    "first_id": "step_abc123",
    "last_id": "step_abc456",
    "has_more": false
  }
}
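
As a rough illustration of working with this payload, the sketch below sums the `usage` counters across the steps in a list response. The function name `total_usage` is hypothetical, and the handling of a null `usage` (treated as zero) is an assumption about steps that have not finished. The function expects the decoded response payload, that is, the object shown under `body` in the example.

```python
def total_usage(list_response):
    """Sum token usage over the `data` array of a run-steps list response.

    Steps whose `usage` is null (None after JSON decoding) are skipped.
    """
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for step in list_response.get("data", []):
        usage = step.get("usage") or {}
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


# Sample payload modeled on the example response; the second step is made up.
sample = {
    "object": "list",
    "data": [
        {"id": "step_abc123",
         "usage": {"prompt_tokens": 123, "completion_tokens": 456, "total_tokens": 579}},
        {"id": "step_abc456", "usage": None},
    ],
    "has_more": False,
}
usage_totals = total_usage(sample)
```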

Get - Run Step

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-10-01-preview

Retrieves a run step.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string
run_id	path	Yes	string
step_id	path	Yes	string
api-version	query	Yes	string	API version

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runStepObject	Represents a step in execution of a run.

Examples

Example

Retrieves a run step.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "step_abc123",
    "object": "thread.run.step",
    "created_at": 1699063291,
    "run_id": "run_abc123",
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "type": "message_creation",
    "status": "completed",
    "cancelled_at": null,
    "completed_at": 1699063291,
    "expired_at": null,
    "failed_at": null,
    "last_error": null,
    "step_details": {
      "type": "message_creation",
      "message_creation": {
        "message_id": "msg_abc123"
      }
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    }
  }
}

List - Vector Stores

GET https://{endpoint}/openai/vector_stores?api-version=2024-10-01-preview

Returns a list of vector stores.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoresResponse

Examples

Example

Returns a list of vector stores.

GET https://{endpoint}/openai/vector_stores?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "vs_abc123",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      },
      {
        "id": "vs_abc456",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ v2",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      }
    ],
    "first_id": "vs_abc123",
    "last_id": "vs_abc456",
    "has_more": false
  }
}
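
The `has_more` and `last_id` fields suggest cursor-style pagination. The sketch below walks all pages under the assumption that `after` accepts the ID of the last object on the previous page; this reference does not describe the parameter, so treat that as an assumption to verify. `iter_all` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever function performs the GET and returns the decoded list payload.

```python
def iter_all(fetch_page, limit=20):
    """Yield every item from a paginated list endpoint.

    `fetch_page(params)` is a caller-supplied function that issues the GET
    with the given query parameters and returns the decoded list payload.
    """
    params = {"limit": limit}
    while True:
        page = fetch_page(dict(params))
        yield from page.get("data", [])
        if not page.get("has_more") or not page.get("last_id"):
            break
        # Assumed cursor semantics: continue after the last ID seen.
        params["after"] = page["last_id"]


# Stand-in for real HTTP calls: two canned pages keyed by the `after` cursor.
pages = {
    None: {"data": [{"id": "vs_abc123"}], "last_id": "vs_abc123", "has_more": True},
    "vs_abc123": {"data": [{"id": "vs_abc456"}], "last_id": "vs_abc456", "has_more": False},
}


def fake_fetch(params):
    return pages[params.get("after")]


ids = [item["id"] for item in iter_all(fake_fetch)]
```

The same loop would apply to the other list endpoints in this reference that return `first_id`, `last_id`, and `has_more`.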

Create - Vector Store

POST https://{endpoint}/openai/vector_stores?api-version=2024-10-01-preview

Create a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
file_ids	array	A list of file IDs that the vector store should use. Useful for tools like `file_search` that can access files.	No
name	string	The name of the vector store.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is nonempty.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
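
The `metadata` limits can be checked on the client before a request is sent. This is a small sketch, assuming the limit of 16 is a maximum and that lengths are counted in characters as stated in the table. `validate_metadata` is a hypothetical helper, not part of any SDK.

```python
MAX_PAIRS = 16
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def validate_metadata(metadata):
    """Return a list of problems; an empty list means the limits are met."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_CHARS:
            problems.append(f"key too long: {key[:20]!r}...")
        if len(str(value)) > MAX_VALUE_CHARS:
            problems.append(f"value too long for key {key!r}")
    return problems


ok = validate_metadata({"team": "support", "tier": "faq"})
too_long = validate_metadata({"a": "v" * 513})
```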

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject	A vector store is a collection of processed files that can be used by the `file_search` tool.

Examples

Example

Creates a vector store.

POST https://{endpoint}/openai/vector_stores?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}

Get - Vector Store

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

Retrieves a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject	A vector store is a collection of processed files that can be used by the `file_search` tool.

Examples

Example

Retrieves a vector store.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776
  }
}

Modify - Vector Store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

Modifies a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
name	string	The name of the vector store.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject	A vector store is a collection of processed files that can be used by the `file_search` tool.

Examples

Example

Modifies a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

{
 "name": "Support FAQ"
}

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}

Delete - Vector Store

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

Delete a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteVectorStoreResponse

Examples

Example

Deletes a vector store.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store.deleted",
    "deleted": true
  }
}

List - Vector Store Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-10-01-preview

Returns a list of vector store files.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
filter	query	No	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}
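
A sketch of building the request URL with the optional query parameters, leaving out any that are unset. The helper name `vector_store_files_url` is hypothetical, and the value `desc` for `order` is an assumption; this reference does not list the accepted values for `order` or `filter`.

```python
from urllib.parse import urlencode


def vector_store_files_url(endpoint, vector_store_id,
                           api_version="2024-10-01-preview", **query):
    """Build the List - Vector Store Files URL, omitting unset parameters."""
    params = {name: value for name, value in query.items() if value is not None}
    params["api-version"] = api_version
    base = f"{endpoint.rstrip('/')}/openai/vector_stores/{vector_store_id}/files"
    return f"{base}?{urlencode(params)}"


url = vector_store_files_url(
    "https://aoairesource.openai.azure.com",  # placeholder resource
    "vs_abc123",
    limit=10,
    order="desc",   # assumed value, not documented in this section
    filter=None,    # unset, so it is left out of the query string
)
```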

Create - Vector Store File

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-10-01-preview

Create a vector store file by attaching a File to a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
file_id	string	A File ID that the vector store should use. Useful for tools like `file_search` that can access files.	Yes
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileObject	A file attached to a vector store.

Examples

Example

Create a vector store file by attaching a File to a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2024-10-01-preview

{
 "file_id": "file-abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "usage_bytes": 1234,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Get - Vector Store File

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-10-01-preview

Retrieves a vector store file.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
file_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileObject	A file attached to a vector store.

Examples

Example

Retrieves a vector store file.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Delete - Vector Store File

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-10-01-preview

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
file_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteVectorStoreFileResponse

Examples

Example

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file.deleted",
    "deleted": true
  }
}
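
Because detaching a file does not delete it, full cleanup takes two calls. The sketch below only constructs the two requests. The helper name is hypothetical, and the path used for the separate delete file endpoint (`/openai/files/{file_id}`) is an assumption that should be checked against the files section of the reference.

```python
import urllib.request


def build_detach_and_delete_requests(endpoint, vector_store_id, file_id, api_key,
                                     api_version="2024-10-01-preview"):
    """Return (detach_request, delete_file_request); neither is sent."""
    base = endpoint.rstrip("/")
    headers = {"api-key": api_key}
    # Step 1: remove the file from the vector store (documented above).
    detach = urllib.request.Request(
        f"{base}/openai/vector_stores/{vector_store_id}/files/{file_id}"
        f"?api-version={api_version}",
        headers=headers,
        method="DELETE",
    )
    # Step 2: delete the file itself. ASSUMPTION: path of the delete file endpoint.
    delete_file = urllib.request.Request(
        f"{base}/openai/files/{file_id}?api-version={api_version}",
        headers=headers,
        method="DELETE",
    )
    return detach, delete_file


detach, delete_file = build_detach_and_delete_requests(
    "https://aoairesource.openai.azure.com", "vs_abc123", "file-abc123", "YOUR_API_KEY"
)
```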

Create - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-10-01-preview

Create a vector store file batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
file_ids	array	A list of File IDs that the vector store should use. Useful for tools like `file_search` that can access files.	Yes
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject	A batch of files attached to a vector store.

Examples

Example

Create a vector store file batch.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2024-10-01-preview

{
 "file_ids": [
  "file-abc123",
  "file-abc456"
 ]
}

Responses: Status Code: 200

{
  "id": "vsfb_abc123",
  "object": "vector_store.file_batch",
  "created_at": 1699061776,
  "vector_store_id": "vs_abc123",
  "status": "in_progress",
  "file_counts": {
    "in_progress": 1,
    "completed": 1,
    "failed": 0,
    "cancelled": 0,
    "total": 2
  }
}

Get - Vector Store File Batch

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-10-01-preview

Retrieves a vector store file batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
batch_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject	A batch of files attached to a vector store.

Examples

Example

Retrieves a vector store file batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "in_progress",
    "file_counts": {
      "in_progress": 1,
      "completed": 1,
      "failed": 0,
      "cancelled": 0,
      "total": 2
    }
  }
}
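
Since a batch is processed asynchronously, a client would typically poll this endpoint until the batch settles. A minimal sketch follows. `wait_for_batch` and `fetch_batch` are hypothetical names, `fetch_batch` stands in for the GET call, and the status `completed` in the sample data is an assumption (this section only shows `in_progress` and `cancelling`).

```python
import time

# Statuses shown in this reference that indicate the batch is still changing.
PENDING_STATUSES = {"in_progress", "cancelling"}


def wait_for_batch(fetch_batch, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the batch leaves a pending status or polls are exhausted.

    `fetch_batch()` is a caller-supplied function returning the decoded
    vector store file batch object.
    """
    for attempt in range(max_polls):
        batch = fetch_batch()
        if batch["status"] not in PENDING_STATUSES:
            return batch
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError("batch did not settle within the polling budget")


# Canned responses standing in for successive GET calls.
responses = iter([
    {"status": "in_progress", "file_counts": {"in_progress": 1, "completed": 1}},
    {"status": "completed", "file_counts": {"in_progress": 0, "completed": 2}},
])
final = wait_for_batch(lambda: next(responses), sleep=lambda seconds: None)
```

Injecting `sleep` keeps the helper easy to test; production code would leave the default in place.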

Cancel - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-10-01-preview

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
batch_id	path	Yes	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject	A batch of files attached to a vector store.

Examples

Example

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "cancelling",
    "file_counts": {
      "in_progress": 12,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 15
    }
  }
}

List - Vector Store File Batch Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-10-01-preview

Returns a list of vector store files in a batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string
batch_id	path	Yes	string
limit	query	No	integer
order	query	No	string
after	query	No	string
before	query	No	string
filter	query	No	string
api-version	query	Yes	string

Request Header

Name	Required	Type	Description
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files in a batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2024-10-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}

Components

errorResponse

Name	Type	Description	Required	Default
error	error		No

errorBase

Name	Type	Description	Required	Default
code	string		No
message	string		No

error

Name	Type	Description	Required
param	string		No
type	string		No
inner_error	innerError	Inner error with additional details.	No

innerError

Inner error with additional details.

Name	Type	Description	Required	Default
code	innerErrorCode	Error codes for the inner error object.	No
content_filter_results	contentFilterPromptResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high; a scale that reflects the intensity and risk level of the harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.	No

innerErrorCode

Error codes for the inner error object.

Type: string

Default:

Enum Name: InnerErrorCode

Enum Values:

Value	Description
ResponsibleAIPolicyViolation	The prompt violated one or more content filter rules.
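
A short sketch of how a client might detect this case from an error payload shaped like `errorResponse`. The function name is hypothetical, and the sample `code` and `message` values on the outer error are made up for illustration.

```python
def is_content_filter_error(error_payload):
    """Return True when the inner error code is ResponsibleAIPolicyViolation."""
    error = error_payload.get("error") or {}
    inner = error.get("inner_error") or {}
    return inner.get("code") == "ResponsibleAIPolicyViolation"


# Illustrative payload; outer code and message are placeholders.
sample_error = {
    "error": {
        "code": "example_code",
        "message": "Example message.",
        "inner_error": {
            "code": "ResponsibleAIPolicyViolation",
            "content_filter_results": {
                "hate": {"filtered": True, "severity": "high"},
            },
        },
    }
}
blocked = is_content_filter_error(sample_error)
```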

dalleErrorResponse

Name	Type	Description	Required	Default
error	dalleError		No

dalleError

Name	Type	Description	Required
param	string		No
type	string		No
inner_error	dalleInnerError	Inner error with additional details.	No

dalleInnerError

Inner error with additional details.

Name	Type	Description	Required
code	innerErrorCode	Error codes for the inner error object.	No
content_filter_results	dalleFilterResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high; a scale that reflects the intensity and risk level of the harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.	No
revised_prompt	string	The prompt that was used to generate the image, if there was any revision to the prompt.	No

contentFilterCompletionTextSpan

Describes a span within generated completion text. Offset 0 is the first UTF32 code point of the completion text.

Name	Type	Description	Required	Default
completion_start_offset	integer	Offset of the UTF32 code point which begins the span.	Yes
completion_end_offset	integer	Offset of the first UTF32 code point which is excluded from the span. This field is always equal to completion_start_offset for empty spans. This field is always larger than completion_start_offset for nonempty spans.	Yes
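
Because the offsets count code points and the end offset is exclusive, they map directly onto Python string slicing (Python `str` is indexed by code point). A minimal sketch with a hypothetical helper and made-up sample text:

```python
def extract_span(completion_text, span):
    """Return the text covered by a contentFilterCompletionTextSpan."""
    start = span["completion_start_offset"]
    end = span["completion_end_offset"]  # first code point excluded from the span
    if not 0 <= start <= end <= len(completion_text):
        raise ValueError("span offsets fall outside the completion text")
    return completion_text[start:end]


# U+1F5FC lies outside the Basic Multilingual Plane but is one code point.
text = "Paris \U0001F5FC is the capital of France."
span = {"completion_start_offset": 8, "completion_end_offset": 32}
covered = extract_span(text, span)
empty = extract_span(text, {"completion_start_offset": 5, "completion_end_offset": 5})
```

In languages whose strings are indexed by UTF-16 code units, the offsets would need converting before slicing.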

contentFilterResultBase

Name	Type	Description	Required	Default
filtered	boolean		Yes

contentFilterSeverityResult

Name	Type	Description	Required	Default
filtered	boolean		Yes
severity	string		No

contentFilterDetectedResult

Name	Type	Description	Required	Default
filtered	boolean		Yes
detected	boolean		No

contentFilterDetectedWithCitationResult

Name	Type	Description	Required	Default
citation	object		No

Properties for citation

URL

Name	Type	Description	Default
URL	string

license

Name	Type	Description	Default
license	string

contentFilterDetectedWithCompletionTextSpansResult

Name	Type	Description	Required	Default
details	array		No

contentFilterIdResult

Name	Type	Description	Required	Default
filtered	boolean		Yes
id	string		No

contentFilterResultsBase

Information about the content filtering results.

Name	Type	Description	Required
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No
hate	contentFilterSeverityResult		No
self_harm	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No

contentFilterPromptResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high; a scale that reflects the intensity and risk level of the harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.

Name	Type	Description	Required
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No
hate	contentFilterSeverityResult		No
self_harm	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No
jailbreak	contentFilterDetectedResult		No
indirect_attack	contentFilterDetectedResult		No

contentFilterChoiceResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high; a scale that reflects the intensity and risk level of the harmful content) and whether it has been filtered. Information about third-party text and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.

Name	Type	Description	Required
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No
hate	contentFilterSeverityResult		No
self_harm	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No
protected_material_text	contentFilterDetectedResult		No
protected_material_code	contentFilterDetectedWithCitationResult		No
ungrounded_material	contentFilterDetectedWithCompletionTextSpansResult		No

contentFilterDetailedResults

Content filtering results with a detail of content filter ids for the filtered segments.

Name	Type	Description	Required	Default
filtered	boolean		Yes
details	array		No

promptFilterResult

Content filtering results for a single prompt in the request.

Name	Type	Description	Required	Default
prompt_index	integer		No
content_filter_results	contentFilterPromptResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high; a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer blocklist, if it has been filtered, and its id.	No

promptFilterResults

Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.

No properties defined for this component.
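
Because streamed results for different prompts may arrive out of order, a client may want to regroup them by `prompt_index`. The sketch below is one hedged way to do that in Python; the helper name is hypothetical, and treating a missing `prompt_index` as 0 is an assumption, since the schema marks the field as optional.

```python
from typing import Any, Dict, Iterable


def collect_prompt_filter_results(
    items: Iterable[Dict[str, Any]],
) -> Dict[int, Dict[str, Any]]:
    """Merge promptFilterResult items by prompt_index, ordered by index."""
    merged: Dict[int, Dict[str, Any]] = {}
    for item in items:
        # Assumption: a missing prompt_index refers to the first prompt.
        index = item.get("prompt_index", 0)
        merged.setdefault(index, {}).update(item.get("content_filter_results", {}))
    return dict(sorted(merged.items()))


# Hypothetical items, arriving with prompt 1 before prompt 0.
arrived = [
    {"prompt_index": 1, "content_filter_results": {"hate": {"filtered": False}}},
    {"prompt_index": 0, "content_filter_results": {"violence": {"filtered": True}}},
]
ordered = collect_prompt_filter_results(arrived)
print(list(ordered))
```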

dalleContentFilterResults

Information about the content filtering results.

Name	Type	Required
sexual	contentFilterSeverityResult	No
violence	contentFilterSeverityResult	No
hate	contentFilterSeverityResult	No
self_harm	contentFilterSeverityResult	No

dalleFilterResults

Name	Type	Description	Required
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No
hate	contentFilterSeverityResult		No
self_harm	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
jailbreak	contentFilterDetectedResult		No
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No

chatCompletionsRequestCommon

Name	Type	Description	Required	Default
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
stop	string or array	Up to 4 sequences where the API will stop generating further tokens.	No
max_tokens	integer	The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens). This isn't compatible with o1 series models.	No	4096
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a json object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No
user	string	A unique identifier representing your end-user, which can help Azure OpenAI to monitor and detect abuse.	No

createCompletionRequest

Name	Type	Description	Required	Default
prompt	string or array	The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.	Yes
best_of	integer	Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed. When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return; `best_of` must be greater than `n`. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
echo	boolean	Echo back the prompt in addition to the completion	No	False
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated.	No	None
logprobs	integer	Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response. The maximum value for `logprobs` is 5.	No	None
max_tokens	integer	The maximum number of tokens that can be generated in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length.	No	16
n	integer	How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
seed	integer	If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to 4 sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.	No
stream	boolean	Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
suffix	string	The suffix that comes after a completion of inserted text. This parameter is only supported for `gpt-3.5-turbo-instruct`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
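
The constraints in the table above can be checked on the client before a request is sent. The following Python sketch is illustrative only: `check_completion_request` is a hypothetical helper, it covers only the limits stated in this table, and the service remains the authority on what it accepts.

```python
from typing import Any, Dict, List


def check_completion_request(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a createCompletionRequest body."""
    problems: List[str] = []
    if "prompt" not in body:
        problems.append("prompt is required")
    best_of = body.get("best_of")
    if best_of is not None:
        # The table says best_of must be greater than n when both are used.
        if "n" in body and best_of <= body["n"]:
            problems.append("best_of must be greater than n")
        if body.get("stream"):
            problems.append("best_of results can't be streamed")
    logprobs = body.get("logprobs")
    if logprobs is not None and logprobs > 5:
        problems.append("logprobs can't exceed 5")
    stop = body.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to 4 sequences")
    ranges = (
        ("temperature", 0, 2),
        ("presence_penalty", -2, 2),
        ("frequency_penalty", -2, 2),
    )
    for name, low, high in ranges:
        value = body.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems


ok_body = {"prompt": "Hello", "max_tokens": 16, "n": 1, "best_of": 2}
bad_body = {"prompt": "Hello", "n": 3, "best_of": 2, "logprobs": 6}
print(check_completion_request(ok_body))
print(check_completion_request(bad_body))
```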

createCompletionResponse

Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

Name	Type	Description	Required
id	string	A unique identifier for the completion.	Yes
choices	array	The list of completion choices the model generated for the input prompt.	Yes
created	integer	The Unix timestamp (in seconds) of when the completion was created.	Yes
model	string	The model used for completion.	Yes
prompt_filter_results	promptFilterResults	Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.	No
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No
object	enum	The object type, which is always "text_completion". Possible values: text_completion	Yes
usage	completionUsage	Usage statistics for the completion request.	No

createChatCompletionRequest

Name	Type	Description	Required	Default
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
stop	string or array	Up to 4 sequences where the API will stop generating further tokens.	No
max_tokens	integer	The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	No
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No	None
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
messages	array	A list of messages comprising the conversation so far.	Yes
data_sources	array	The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI.	No
logprobs	boolean	Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.	No	False
top_logprobs	integer	An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.	No
n	integer	How many chat completion choices to generate for each input message. Note that you'll be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.	No	1
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema	An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which guarantees the model will match your supplied JSON schema. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
seed	integer	This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
tools	array	A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.	No
tool_choice	chatCompletionToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.	No
function_call	string or chatCompletionFunctionCallOption	Deprecated in favor of `tool_choice`. Controls which (if any) function is called by the model. `none` means the model won't call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present.	No
functions	array	Deprecated in favor of `tools`. A list of functions the model may generate JSON inputs for.	No
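
Several parameters in this table depend on one another (`top_logprobs` needs `logprobs`, `stream_options` needs `stream`). The Python sketch below builds a JSON-mode request body and checks those dependencies locally. The helper `check_chat_request` and the message text are hypothetical, and the checks cover only rules stated in the table.

```python
from typing import Any, Dict, List


def check_chat_request(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a createChatCompletionRequest body."""
    problems: List[str] = []
    if not body.get("messages"):
        problems.append("messages is required")
    top_logprobs = body.get("top_logprobs")
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 20:
            problems.append("top_logprobs must be between 0 and 20")
        if body.get("logprobs") is not True:
            problems.append("top_logprobs requires logprobs to be true")
    if body.get("stream_options") is not None and body.get("stream") is not True:
        problems.append("stream_options requires stream to be true")
    if len(body.get("tools", [])) > 128:
        problems.append("at most 128 functions are supported")
    return problems


request_body = {
    "messages": [
        # JSON mode requires instructing the model to produce JSON.
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three colors."},
    ],
    "response_format": {"type": "json_object"},
    "logprobs": True,
    "top_logprobs": 2,
    "max_tokens": 100,
}
print(check_chat_request(request_body))
```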

chatCompletionFunctions

Name	Type	Description	Required
description	string	A description of what the function does, used by the model to choose when and how to call the function.	No
name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	Yes
parameters	FunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide (/azure/ai-services/openai/how-to/function-calling) for examples, and the JSON Schema reference for documentation about the format. Omitting `parameters` defines a function with an empty parameter list.	No
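
The naming rule for `name` can be expressed as a regular expression. This is a small Python sketch under that reading of the rule; the `get_weather` function definition and its schema are hypothetical examples, not part of the API.

```python
import re

# Letters, digits, underscores and dashes, 1 to 64 characters.
FUNCTION_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_function_name(name: str) -> bool:
    """Check a function name against the documented character and length rule."""
    return FUNCTION_NAME.fullmatch(name) is not None


# Hypothetical function definition shaped like chatCompletionFunctions.
get_weather = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
print(is_valid_function_name(get_weather["name"]))
```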

chatCompletionFunctionCallOption

Specifying a particular function via {"name": "my_function"} forces the model to call that function.

Name	Type	Description	Required	Default
name	string	The name of the function to call.	Yes

chatCompletionFunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

No properties defined for this component.

chatCompletionRequestMessage

This component can be one of the following:

chatCompletionRequestSystemMessage

Name	Type	Description	Required
content	string or array	The contents of the system message.	Yes
role	enum	The role of the message's author, in this case `system`. Possible values: system	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No

chatCompletionRequestUserMessage

Name	Type	Description	Required
content	string or array	The contents of the user message.	Yes
role	enum	The role of the message's author, in this case `user`. Possible values: user	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No

chatCompletionRequestAssistantMessage

Name	Type	Description	Required
content	string or array	The contents of the assistant message. Required unless `tool_calls` or `function_call` is specified.	No
refusal	string	The refusal message by the assistant.	No
role	enum	The role of the message's author, in this case `assistant`. Possible values: assistant	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No
tool_calls	chatCompletionMessageToolCalls	The tool calls generated by the model, such as function calls.	No
function_call	object	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No

Properties for function_call

arguments

Name	Type	Description	Default
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name

Name	Type	Description	Default
name	string	The name of the function to call.
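
Since the model does not always generate valid JSON for `arguments`, it is worth validating the string before dispatching the call. The Python sketch below shows one minimal approach using only the standard library; `parse_function_arguments` is a hypothetical helper, and it checks only top-level property names against the schema rather than performing full JSON Schema validation.

```python
import json
from typing import Any, Dict


def parse_function_arguments(raw: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Decode model-generated arguments and compare them to the function schema."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must decode to a JSON object")
    # Reject parameters that the function schema does not define.
    unexpected = sorted(set(args) - set(schema.get("properties", {})))
    if unexpected:
        raise ValueError(f"unexpected parameters: {unexpected}")
    missing = sorted(set(schema.get("required", [])) - set(args))
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    return args


# Hypothetical schema for a function that takes a single "city" string.
weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
print(parse_function_arguments('{"city": "Oslo"}', weather_schema))
```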

chatCompletionRequestToolMessage

Name	Type	Description	Required
role	enum	The role of the message's author, in this case `tool`. Possible values: tool	Yes
content	string or array	The contents of the tool message.	Yes
tool_call_id	string	Tool call that this message is responding to.	Yes

chatCompletionRequestFunctionMessage

Name	Type	Description	Required
role	enum	The role of the message's author, in this case `function`. Possible values: function	Yes
content	string	The contents of the function message.	Yes
name	string	The name of the function to call.	Yes
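
The required fields differ by role across the five message types above. As a rough illustration, the Python sketch below reports which required fields a message is missing; the lookup table is transcribed from the tables above, and the helper name `missing_message_fields` is hypothetical.

```python
from typing import Any, Dict, List

# Required fields per role, besides "role" itself, as listed in the tables above.
REQUIRED_FIELDS = {
    "system": ("content",),
    "user": ("content",),
    "assistant": (),
    "tool": ("content", "tool_call_id"),
    "function": ("content", "name"),
}


def missing_message_fields(message: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from a request message."""
    role = message.get("role")
    if role not in REQUIRED_FIELDS:
        return ["role"]
    missing = [field for field in REQUIRED_FIELDS[role] if field not in message]
    # Assistant content is required unless tool_calls or function_call is given.
    if role == "assistant" and not any(
        key in message for key in ("content", "tool_calls", "function_call")
    ):
        missing.append("content")
    return missing


print(missing_message_fields({"role": "tool", "content": "22 degrees"}))
```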

chatCompletionRequestSystemMessageContentPart

This component can be one of the following:

chatCompletionRequestUserMessageContentPart

This component can be one of the following:

chatCompletionRequestAssistantMessageContentPart

This component can be one of the following:

chatCompletionRequestToolMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

Name	Type	Description	Required	Default
type	enum	The type of the content part. Possible values: text	Yes
text	string	The text content.	Yes

chatCompletionRequestMessageContentPartImage

Name	Type	Description	Required	Default
type	enum	The type of the content part. Possible values: image_url	Yes
image_url	object		Yes

Properties for image_url

url

Name	Type	Description	Default
url	string	Either a URL of the image or the base64 encoded image data.

detail

Name	Type	Description	Default
detail	string	Specifies the detail level of the image. Learn more in the Vision guide.	auto
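
To illustrate the base64 option for `url`, the Python sketch below builds an image content part from raw bytes and places it next to a text part in a user message. Wrapping the encoded data in a `data:` URL with a MIME type is an assumption about the expected encoding, and `image_part_from_bytes` is a hypothetical helper.

```python
import base64
from typing import Any, Dict


def image_part_from_bytes(
    data: bytes, mime_type: str = "image/png", detail: str = "auto"
) -> Dict[str, Any]:
    """Build an image_url content part carrying base64-encoded image data."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        # Assumption: base64 data is passed as a data URL.
        "image_url": {"url": f"data:{mime_type};base64,{encoded}", "detail": detail},
    }


# The bytes here are a placeholder, not a complete image file.
user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        image_part_from_bytes(b"\x89PNG"),
    ],
}
print(user_message["content"][1]["image_url"]["url"])
```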

chatCompletionRequestMessageContentPartRefusal

Name	Type	Description	Required	Default
type	enum	The type of the content part. Possible values: refusal	Yes
refusal	string	The refusal message generated by the model.	Yes

azureChatExtensionConfiguration

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

azureChatExtensionType

Description: A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Type: string

Default:

Enum Name: AzureChatExtensionType

Enum Values:

Value	Description
azure_search	Represents the use of Azure Search as an Azure OpenAI chat extension.
azure_cosmos_db	Represents the use of Azure Cosmos DB as an Azure OpenAI chat extension.
elasticsearch	Represents the use of Elasticsearch® index as an Azure OpenAI chat extension.
mongo_db	Represents the use of Mongo DB as an Azure OpenAI chat extension.
pinecone	Represents the use of Pinecone index as an Azure OpenAI chat extension.

azureSearchChatExtensionConfiguration

A specific representation of configurable options for Azure Search when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes
parameters	azureSearchChatExtensionParameters	Parameters for Azure Search when used as an Azure OpenAI chat extension.	No

azureSearchChatExtensionParameters

Parameters for Azure Search when used as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
authentication	onYourDataApiKeyAuthenticationOptions or onYourDataSystemAssignedManagedIdentityAuthenticationOptions or onYourDataUserAssignedManagedIdentityAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions		Yes
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
max_search_queries	integer	The max number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
endpoint	string	The absolute endpoint path for the Azure Search resource to use.	Yes
index_name	string	The name of the index to use as available in the referenced Azure Search resource.	Yes
fields_mapping	azureSearchIndexFieldMappingOptions	Optional settings to control how fields are processed when using a configured Azure Search resource.	No
query_type	azureSearchQueryType	The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.	No
semantic_configuration	string	The additional semantic configuration for the query.	No
filter	string	Search filter.	No
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataIntegratedVectorizationSource		No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
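
As a hedged sketch, an entry for the `data_sources` array of a chat completions request could be assembled in Python as follows. The endpoint and index name are placeholders, the shape of the `authentication` object is an assumption (it is defined by the authentication option types, not in this table), and `azure_search_data_source` is a hypothetical helper.

```python
from typing import Any, Dict

# Values of azureSearchQueryType as listed in this reference.
QUERY_TYPES = {
    "simple",
    "semantic",
    "vector",
    "vector_simple_hybrid",
    "vector_semantic_hybrid",
}


def azure_search_data_source(
    endpoint: str, index_name: str, authentication: Dict[str, Any], **optional: Any
) -> Dict[str, Any]:
    """Build an azureSearchChatExtensionConfiguration object."""
    query_type = optional.get("query_type")
    if query_type is not None and query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query_type: {query_type}")
    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": authentication,
    }
    parameters.update(optional)
    return {"type": "azure_search", "parameters": parameters}


data_source = azure_search_data_source(
    endpoint="https://example.search.windows.net",  # placeholder endpoint
    index_name="my-index",  # placeholder index name
    authentication={"type": "system_assigned_managed_identity"},  # assumed shape
    query_type="simple",
    top_n_documents=5,
    in_scope=True,
)
print(sorted(data_source["parameters"]))
```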

azureSearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Search resource.

Name	Type	Description	Required
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
filepath_field	string	The name of the index field to use as a filepath.	No
content_fields	array	The names of index fields that should be treated as content.	No
content_fields_separator	string	The separator pattern that content fields should use.	No
vector_fields	array	The names of fields that represent vector data.	No
image_vector_fields	array	The names of fields that represent image vector data.	No

azureSearchQueryType

Description: The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Type: string

Default:

Enum Name: AzureSearchQueryType

Enum Values:

Value	Description
simple	Represents the default, simple query parser.
semantic	Represents the semantic query parser for advanced semantic modeling.
vector	Represents vector search over computed data.
vector_simple_hybrid	Represents a combination of the simple query strategy with vector data.
vector_semantic_hybrid	Represents a combination of semantic search and vector data querying.

azureCosmosDBChatExtensionConfiguration

A specific representation of configurable options for Azure Cosmos DB when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes
parameters	azureCosmosDBChatExtensionParameters	Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.	No

azureCosmosDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.

Name	Type	Description	Required	Default
authentication	onYourDataConnectionStringAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using a connection string.	Yes
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
max_search_queries	integer	The max number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
database_name	string	The MongoDB vCore database name to use with Azure Cosmos DB.	Yes
container_name	string	The name of the Azure Cosmos DB resource container.	Yes
index_name	string	The MongoDB vCore index name to use with Azure Cosmos DB.	Yes
fields_mapping	azureCosmosDBFieldMappingOptions	Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.	Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource		Yes
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No

azureCosmosDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.

Name	Type	Description	Required
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
filepath_field	string	The name of the index field to use as a filepath.	No
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No
vector_fields	array	The names of fields that represent vector data.	Yes
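
Unlike the Azure Search configuration, the Azure Cosmos DB parameters mark `fields_mapping` and `embedding_dependency` as required, and the mapping itself requires `content_fields` and `vector_fields`. The Python sketch below checks a parameters object for those fields. The helper name is hypothetical and all sample values are placeholders.

```python
from typing import Any, Dict, List

# Required entries per the azureCosmosDBChatExtensionParameters table.
REQUIRED_PARAMETERS = (
    "authentication",
    "database_name",
    "container_name",
    "index_name",
    "fields_mapping",
    "embedding_dependency",
)
# Required entries per the azureCosmosDBFieldMappingOptions table.
REQUIRED_MAPPING_FIELDS = ("content_fields", "vector_fields")


def missing_cosmos_fields(parameters: Dict[str, Any]) -> List[str]:
    """List required Azure Cosmos DB parameters that are absent."""
    missing = [name for name in REQUIRED_PARAMETERS if name not in parameters]
    if "fields_mapping" in parameters:
        mapping = parameters["fields_mapping"]
        missing += [
            f"fields_mapping.{name}"
            for name in REQUIRED_MAPPING_FIELDS
            if name not in mapping
        ]
    return missing


# Placeholder values; authentication and embedding_dependency are omitted on purpose.
partial = {
    "database_name": "my-database",
    "container_name": "my-container",
    "index_name": "my-index",
    "fields_mapping": {"content_fields": ["text"]},
}
print(missing_cosmos_fields(partial))
```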

elasticsearchChatExtensionConfiguration

A specific representation of configurable options for Elasticsearch when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes
parameters	elasticsearchChatExtensionParameters	Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.	No

elasticsearchChatExtensionParameters

Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
authentication	onYourDataKeyAndKeyIdAuthenticationOptions or onYourDataEncodedApiKeyAuthenticationOptions		Yes
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
max_search_queries	integer	The max number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
endpoint	string	The endpoint of Elasticsearch®.	Yes
index_name	string	The index name of Elasticsearch®.	Yes
fields_mapping	elasticsearchIndexFieldMappingOptions	Optional settings to control how fields are processed when using a configured Elasticsearch® resource.	No
query_type	elasticsearchQueryType	The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.	No
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataModelIdVectorizationSource		No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
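
As a minimal sketch, the parameters above could be assembled into a request-body fragment like the one below. The `type` value `"elasticsearch"`, the endpoint URL, the index name, and the credential placeholders are assumptions for illustration; confirm the exact value against the azureChatExtensionType enum.

```python
import json

def build_elasticsearch_data_source(endpoint, index_name, key, key_id):
    """Return a chat-extension entry for Elasticsearch as a plain dict."""
    return {
        # Assumed enum value; confirm against azureChatExtensionType.
        "type": "elasticsearch",
        "parameters": {
            # Required parameters per the table above.
            "endpoint": endpoint,
            "index_name": index_name,
            "authentication": {
                "type": "key_and_key_id",
                "key": key,
                "key_id": key_id,
            },
            # Optional parameters.
            "query_type": "simple",
            "top_n_documents": 5,
            "in_scope": True,
            "include_contexts": ["citations", "intent"],
        },
    }

# Placeholder values only; none of these identify a real resource.
data_source = build_elasticsearch_data_source(
    "https://example-es.invalid:9243", "docs-index", "KEY", "KEY_ID"
)
print(json.dumps(data_source, indent=2))
```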

elasticsearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Elasticsearch® resource.

Name	Type	Description	Required
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
filepath_field	string	The name of the index field to use as a filepath.	No
content_fields	array	The names of index fields that should be treated as content.	No
content_fields_separator	string	The separator pattern that content fields should use.	No
vector_fields	array	The names of fields that represent vector data.	No

elasticsearchQueryType

The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Description: The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Type: string

Default:

Enum Name: ElasticsearchQueryType

Enum Values:

Value	Description
simple	Represents the default, simple query parser.
vector	Represents vector search over computed data.

mongoDBChatExtensionConfiguration

A specific representation of configurable options for Mongo DB when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes
parameters	mongoDBChatExtensionParameters	Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.	No

mongoDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.

Name	Type	Description	Required	Default
authentication	onYourDataUsernameAndPasswordAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using a username and a password.	Yes
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
endpoint	string	The name of the Mongo DB cluster endpoint.	Yes
database_name	string	The name of the Mongo DB database.	Yes
collection_name	string	The name of the Mongo DB Collection.	Yes
app_name	string	The name of the Mongo DB Application.	Yes
index_name	string	The name of the Mongo DB index.	Yes
fields_mapping	mongoDBFieldMappingOptions	Optional settings to control how fields are processed when using a configured Mongo DB resource.	Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource		Yes
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
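
Because this component marks many parameters as required, a client-side check before sending the request can be useful. The following sketch lists the required names from the table above and reports any that are missing. The helper name and the sample values are hypothetical.

```python
# Parameters marked "Yes" in the Required column of the table above.
REQUIRED_MONGO_DB_PARAMETERS = (
    "authentication",
    "endpoint",
    "database_name",
    "collection_name",
    "app_name",
    "index_name",
    "fields_mapping",
    "embedding_dependency",
)

def missing_mongo_db_parameters(parameters):
    """Return the required parameter names absent from `parameters`."""
    return [name for name in REQUIRED_MONGO_DB_PARAMETERS if name not in parameters]

# An intentionally incomplete configuration with placeholder values.
parameters = {
    "endpoint": "example-cluster.invalid",
    "database_name": "catalog",
    "collection_name": "products",
    "index_name": "vector_index",
}
missing = missing_mongo_db_parameters(parameters)
print(missing)
```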

mongoDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Mongo DB resource.

Name	Type	Description	Required
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
filepath_field	string	The name of the index field to use as a filepath.	No
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No
vector_fields	array	The names of fields that represent vector data.	Yes

pineconeChatExtensionConfiguration

A specific representation of configurable options for Pinecone when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes
parameters	pineconeChatExtensionParameters	Parameters for configuring Azure OpenAI Pinecone chat extensions.	No

pineconeChatExtensionParameters

Parameters for configuring Azure OpenAI Pinecone chat extensions.

Name	Type	Description	Required	Default
authentication	onYourDataApiKeyAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using an API key.	Yes
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
environment	string	The environment name of Pinecone.	Yes
index_name	string	The name of the Pinecone database index.	Yes
fields_mapping	pineconeFieldMappingOptions	Optional settings to control how fields are processed when using a configured Pinecone resource.	Yes
embedding_dependency	onYourDataDeploymentNameVectorizationSource	The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.	Yes
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No

pineconeFieldMappingOptions

Optional settings to control how fields are processed when using a configured Pinecone resource.

Name	Type	Description	Required
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
filepath_field	string	The name of the index field to use as a filepath.	No
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No

onYourDataAuthenticationOptions

The authentication options for Azure OpenAI On Your Data.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataContextProperty

The context property.

Description: The context property.

Type: string

Default:

Enum Name: OnYourDataContextProperty

Enum Values:

Value	Description
citations	The `citations` property.
intent	The `intent` property.
all_retrieved_documents	The `all_retrieved_documents` property.

onYourDataAuthenticationType

The authentication types supported with Azure OpenAI On Your Data.

Description: The authentication types supported with Azure OpenAI On Your Data.

Type: string

Default:

Enum Name: OnYourDataAuthenticationType

Enum Values:

Value	Description
api_key	Authentication via API key.
connection_string	Authentication via connection string.
key_and_key_id	Authentication via key and key ID pair.
encoded_api_key	Authentication via encoded API key.
access_token	Authentication via access token.
system_assigned_managed_identity	Authentication via system-assigned managed identity.
user_assigned_managed_identity	Authentication via user-assigned managed identity.
username_and_password	Authentication via username and password.

onYourDataApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an API key.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
key	string	The API key to use for authentication.	No

onYourDataConnectionStringAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a connection string.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
connection_string	string	The connection string to use for authentication.	No

onYourDataKeyAndKeyIdAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch key and key ID pair.

Name	Type	Description	Required
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
key	string	The Elasticsearch key to use for authentication.	No
key_id	string	The Elasticsearch key ID to use for authentication.	No

onYourDataEncodedApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch encoded API key.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
encoded_api_key	string	The Elasticsearch encoded API key to use for authentication.	No

onYourDataAccessTokenAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an access token.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
access_token	string	The access token to use for authentication.	No

onYourDataSystemAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a system-assigned managed identity.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataUserAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a user-assigned managed identity.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
managed_identity_resource_id	string	The resource ID of the user-assigned managed identity to use for authentication.	No

onYourDataUsernameAndPasswordAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a username and a password.

Name	Type	Description	Required
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
username	string	The username to use for authentication.	No
password	string	The password to use for authentication.	No
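
The authentication components above differ only in the fields that accompany `type`. One way to keep them straight on the client side is a small lookup keyed by authentication type, as in this sketch. The helper name is hypothetical and the mapping simply mirrors the tables above; it isn't part of the service.

```python
# Extra fields each authentication type carries, per the tables above.
AUTH_FIELDS = {
    "api_key": ("key",),
    "connection_string": ("connection_string",),
    "key_and_key_id": ("key", "key_id"),
    "encoded_api_key": ("encoded_api_key",),
    "access_token": ("access_token",),
    "system_assigned_managed_identity": (),
    "user_assigned_managed_identity": ("managed_identity_resource_id",),
    "username_and_password": ("username", "password"),
}

def make_authentication(auth_type, **fields):
    """Build an authentication options dict, rejecting unknown types or fields."""
    if auth_type not in AUTH_FIELDS:
        raise ValueError(f"unknown authentication type: {auth_type}")
    unexpected = set(fields) - set(AUTH_FIELDS[auth_type])
    if unexpected:
        raise ValueError(f"fields not valid for {auth_type}: {sorted(unexpected)}")
    return {"type": auth_type, **fields}

print(make_authentication("username_and_password", username="USER", password="PASSWORD"))
print(make_authentication("system_assigned_managed_identity"))
```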

onYourDataVectorizationSource

An abstract representation of a vectorization source for Azure OpenAI On Your Data with vector search.

Name	Type	Description	Required	Default
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

onYourDataVectorizationSourceType

Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.

Description: Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.

Type: string

Default:

Enum Name: OnYourDataVectorizationSourceType

Enum Values:

Value	Description
endpoint	Represents vectorization performed by public service calls to an Azure OpenAI embedding model.
deployment_name	Represents an Ada model deployment name to use. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks.
integrated	Represents the integrated vectorizer defined within the search resource.
model_id	Represents a specific embedding model ID as defined in the search service. Currently only supported by Elasticsearch®.

onYourDataEndpointVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an Azure OpenAI endpoint call for embeddings.

Name	Type	Description	Required
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes
endpoint	string	Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format of https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings. The api-version query parameter isn't allowed.	No
authentication	onYourDataApiKeyAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions		No
dimensions	integer	The number of dimensions the embeddings should have. Only supported in `text-embedding-3` and later models.	No

onYourDataDeploymentNameVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.

Name	Type	Description	Required
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes
deployment_name	string	Specifies the name of the model deployment to use for vectorization. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks.	No
dimensions	integer	The number of dimensions the embeddings should have. Only supported in `text-embedding-3` and later models.	No

onYourDataIntegratedVectorizationSource

Represents the integrated vectorizer defined within the search resource.

Name	Type	Description	Required	Default
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

onYourDataModelIdVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a search service model ID. Currently only supported by Elasticsearch®.

Name	Type	Description	Required	Default
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes
model_id	string	Specifies the model ID to use for vectorization. This model ID must be defined in the search service.	No
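
The endpoint-based source has one constraint that is easy to overlook: the `endpoint` URL must not carry an `api-version` query parameter. The sketch below builds an `endpoint` vectorization source and rejects such URLs up front. The function name, the resource and deployment names, and the key are placeholders for illustration.

```python
from urllib.parse import parse_qs, urlsplit

def endpoint_vectorization_source(endpoint, api_key, dimensions=None):
    """Build an onYourDataEndpointVectorizationSource-shaped dict."""
    if "api-version" in parse_qs(urlsplit(endpoint).query):
        raise ValueError("endpoint must not include the api-version query parameter")
    source = {
        "type": "endpoint",
        "endpoint": endpoint,
        "authentication": {"type": "api_key", "key": api_key},
    }
    # Only include dimensions when requested (text-embedding-3 and later models).
    if dimensions is not None:
        source["dimensions"] = dimensions
    return source

source = endpoint_vectorization_source(
    "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings",
    "KEY",
    dimensions=256,
)
print(source["type"], source["dimensions"])
```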

azureChatExtensionsMessageContext

A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.

Name	Type	Description	Required
citations	array	The data source retrieval result, used to generate the assistant message in the response.	No
intent	string	The detected intent from the chat history, used to pass to the next turn to carry over the context.	No
all_retrieved_documents	array	All the retrieved documents.	No

citation

Citation information for a chat completions response message.

Name	Type	Description	Required
content	string	The content of the citation.	Yes
title	string	The title of the citation.	No
url	string	The URL of the citation.	No
filepath	string	The file path of the citation.	No
chunk_id	string	The chunk ID of the citation.	No
rerank_score	number	The rerank score of the retrieved document.	No

retrievedDocument

The retrieved document.

Name	Type	Description	Required
content	string	The content of the citation.	Yes
title	string	The title of the citation.	No
url	string	The URL of the citation.	No
filepath	string	The file path of the citation.	No
chunk_id	string	The chunk ID of the citation.	No
rerank_score	number	The rerank score of the retrieved document.	No
search_queries	array	The search queries used to retrieve the document.	No
data_source_index	integer	The index of the data source.	No
original_search_score	number	The original search score of the retrieved document.	No
filter_reason	filterReason	The filtering reason of the retrieved document.	No

filterReason

The filtering reason of the retrieved document.

Description: The filtering reason of the retrieved document.

Type: string

Default:

Enum Name: FilterReason

Enum Values:

Value	Description
score	The document is filtered by the original search score threshold defined by the `strictness` configuration.
rerank	The document isn't filtered by the original search score threshold, but is filtered by the rerank score and the `top_n_documents` configuration.
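
To illustrate how the context components fit together, the sketch below summarizes a `context` object by listing citation titles and counting retrieved documents per `filter_reason`. The sample payload is invented for illustration, and the sketch assumes that a document without a `filter_reason` wasn't filtered out.

```python
def summarize_context(context):
    """Summarize citations and filtered documents in a message context."""
    citations = context.get("citations") or []
    documents = context.get("all_retrieved_documents") or []
    filtered = {}
    kept = 0
    for document in documents:
        reason = document.get("filter_reason")
        if reason is None:
            kept += 1  # assumption: no filter_reason means the document was kept
        else:
            filtered[reason] = filtered.get(reason, 0) + 1
    return {
        "citation_titles": [citation.get("title") for citation in citations],
        "kept": kept,
        "filtered": filtered,
    }

# Invented sample shaped like azureChatExtensionsMessageContext.
context = {
    "citations": [{"content": "A", "title": "Guide"}],
    "intent": "find setup steps",
    "all_retrieved_documents": [
        {"content": "A", "title": "Guide", "data_source_index": 0},
        {"content": "B", "filter_reason": "score"},
        {"content": "C", "filter_reason": "rerank"},
        {"content": "D", "filter_reason": "score"},
    ],
}
summary = summarize_context(context)
print(summary)
```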

chatCompletionMessageToolCall

Name	Type	Description	Required
id	string	The ID of the tool call.	Yes
type	toolCallType	The type of the tool call, in this case `function`.	Yes
function	object	The function that the model called.	Yes

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function to call.

arguments

Name	Type	Description	Default
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
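
Since the model doesn't always generate valid JSON, the validation step mentioned above can be sketched as follows. The helper name and the `allowed_parameters` argument are hypothetical; a real application might validate against its full JSON Schema instead.

```python
import json

def parse_tool_arguments(tool_call, allowed_parameters):
    """Return the parsed arguments, or None when they can't be trusted."""
    try:
        arguments = json.loads(tool_call["function"]["arguments"])
    except json.JSONDecodeError:
        return None  # the model produced invalid JSON
    if not isinstance(arguments, dict):
        return None  # valid JSON, but not an object of named arguments
    if set(arguments) - set(allowed_parameters):
        return None  # the model invented parameters outside the schema
    return arguments

# Invented tool call shaped like chatCompletionMessageToolCall.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
}
print(parse_tool_arguments(tool_call, {"city", "unit"}))
```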

toolCallType

The type of the tool call, in this case function.

Description: The type of the tool call, in this case function.

Type: string

Default:

Enum Name: ToolCallType

Enum Values:

Value	Description
function	The tool call type is function.

chatCompletionRequestMessageTool

Name	Type	Description	Required	Default
tool_call_id	string	Tool call that this message is responding to.	No
content	string	The contents of the message.	No

chatCompletionRequestMessageFunction

Name	Type	Description	Required
role	enum	The role of the message's author, in this case `function`. Possible values: function	No
name	string	The name of the function to call.	No
content	string	The contents of the message.	No

createChatCompletionResponse

Represents a chat completion response returned by the model, based on the provided input.

Name	Type	Description	Required
id	string	A unique identifier for the chat completion.	Yes
prompt_filter_results	promptFilterResults	Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.	No
choices	array	A list of chat completion choices. Can be more than one if `n` is greater than 1.	Yes
created	integer	The Unix timestamp (in seconds) of when the chat completion was created.	Yes
model	string	The model used for the chat completion.	Yes
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No
object	enum	The object type, which is always `chat.completion`. Possible values: chat.completion	Yes
usage	completionUsage	Usage statistics for the completion request.	No

createChatCompletionStreamResponse

Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.

Name	Type	Description	Required
id	string	A unique identifier for the chat completion. Each chunk has the same ID.	Yes
choices	array	A list of chat completion choices. Can contain more than one element if `n` is greater than 1.	Yes
created	integer	The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.	Yes
model	string	The model used to generate the completion.	Yes
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No
object	enum	The object type, which is always `chat.completion.chunk`. Possible values: chat.completion.chunk	Yes

chatCompletionStreamResponseDelta

A chat completion delta generated by streamed model responses.

Name	Type	Description	Required
content	string	The contents of the chunk message.	No
function_call	object	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No
tool_calls	array		No
role	enum	The role of the author of this message. Possible values: system, user, assistant, tool	No
refusal	string	The refusal message generated by the model.	No

Properties for function_call

arguments

Name	Type	Description	Default
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

name

Name	Type	Description	Default
name	string	The name of the function to call.

chatCompletionMessageToolCallChunk

Name	Type	Description	Required
index	integer		Yes
id	string	The ID of the tool call.	No
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: function	No
function	object		No

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function to call.

arguments

Name	Type	Description	Default
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
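
When streaming, a tool call arrives as a series of chunks that share the same `index`. The sketch below reassembles them on the client. It assumes that `id` and `type` appear on the first chunk for an index and that the `arguments` fragments are meant to be concatenated in arrival order; the sample chunks are invented.

```python
def merge_tool_call_chunks(chunks):
    """Combine chatCompletionMessageToolCallChunk dicts into whole tool calls."""
    calls = {}
    for chunk in chunks:
        call = calls.setdefault(
            chunk["index"],
            {"id": None, "type": None, "function": {"name": "", "arguments": ""}},
        )
        if chunk.get("id"):
            call["id"] = chunk["id"]
        if chunk.get("type"):
            call["type"] = chunk["type"]
        function = chunk.get("function") or {}
        # Assumption: string fragments are appended in the order received.
        call["function"]["name"] += function.get("name") or ""
        call["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]

chunks = [
    {"index": 0, "id": "call_1", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"city": '}},
    {"index": 0, "function": {"arguments": '"Oslo"}'}},
]
merged = merge_tool_call_chunks(chunks)
print(merged[0]["function"]["arguments"])
```

The merged `arguments` string still needs to be parsed and validated before the function is called.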

chatCompletionStreamOptions

Options for streaming response. Only set this when you set stream: true.

Name	Type	Description	Required	Default
include_usage	boolean	If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array. All other chunks will also include a `usage` field, but with a null value.	No
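
With `include_usage` set, every chunk carries a `usage` field, but only the additional final chunk has a non-null value. A client could pick it out as in this sketch. The chunks are assumed to be already parsed into dicts, and the sample values are invented.

```python
def usage_from_stream(chunks):
    """Return the usage statistics from a list of parsed stream chunks, if any."""
    for chunk in chunks:
        if chunk.get("usage") is not None:
            return chunk["usage"]
    return None

# Invented sample: one content chunk, then the usage chunk with empty choices.
stream = [
    {"choices": [{"index": 0, "delta": {"content": "Hi"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 1, "total_tokens": 10}},
]
usage = usage_from_stream(stream)
print(usage)
```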

chatCompletionChoiceLogProbs

Log probability information for the choice.

Name	Type	Description	Required	Default
content	array	A list of message content tokens with log probability information.	Yes
refusal	array	A list of message refusal tokens with log probability information.	No

chatCompletionTokenLogprob

Name	Type	Description	Required
token	string	The token.	Yes
logprob	number	The log probability of this token.	Yes
bytes	array	A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be `null` if there is no bytes representation for the token.	Yes
top_logprobs	array	List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested `top_logprobs` returned.	Yes
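
The `bytes` field matters when a single character is split across tokens: the byte lists have to be joined before decoding. The sketch below shows that, along with converting a `logprob` into a probability. The sample token data is invented for illustration.

```python
import math

def decode_tokens(token_logprobs):
    """Join the UTF-8 bytes of each token and decode them as one string."""
    raw = bytearray()
    for item in token_logprobs:
        if item["bytes"] is not None:  # bytes can be null for some tokens
            raw.extend(item["bytes"])
    return raw.decode("utf-8", errors="replace")

def token_probability(item):
    """Convert a log probability into a probability between 0 and 1."""
    return math.exp(item["logprob"])

# Invented sample: the two bytes of "é" are split across the last two tokens.
tokens = [
    {"token": "caf", "logprob": 0.0, "bytes": [99, 97, 102], "top_logprobs": []},
    {"token": "\\xc3", "logprob": -0.5, "bytes": [195], "top_logprobs": []},
    {"token": "\\xa9", "logprob": -0.1, "bytes": [169], "top_logprobs": []},
]
print(decode_tokens(tokens))
print(token_probability(tokens[0]))
```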

chatCompletionResponseMessage

A chat completion message generated by the model.

Name	Type	Description	Required
role	chatCompletionResponseMessageRole	The role of the author of the response message.	Yes
refusal	string	The refusal message generated by the model.	Yes
content	string	The contents of the message.	Yes
tool_calls	array	The tool calls generated by the model, such as function calls.	No
function_call	chatCompletionFunctionCall	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No
context	azureChatExtensionsMessageContext	A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.	No

chatCompletionResponseMessageRole

The role of the author of the response message.

Description: The role of the author of the response message.

Type: string

Default:

Enum Values:

assistant

chatCompletionToolChoiceOption

Controls which (if any) tool is called by the model. none means the model won't call any tool and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools. Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. none is the default when no tools are present. auto is the default if tools are present.

This component can be one of the following:

chatCompletionNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific function.

Name	Type	Description	Required	Default
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: function	Yes
function	object		Yes

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function to call.

ParallelToolCalls

Whether to enable parallel function calling during tool use.

No properties defined for this component.

chatCompletionMessageToolCalls

The tool calls generated by the model, such as function calls.

No properties defined for this component.

chatCompletionFunctionCall

Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model.

Name	Type	Description	Required	Default
name	string	The name of the function to call.	Yes
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model doesn't always generate valid JSON, and may generate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	Yes

completionUsage

Usage statistics for the completion request.

Name	Type	Description	Required
prompt_tokens	integer	Number of tokens in the prompt.	Yes
completion_tokens	integer	Number of tokens in the generated completion.	Yes
total_tokens	integer	Total number of tokens used in the request (prompt + completion).	Yes
prompt_tokens_details	object	Details of the prompt tokens.	No
completion_tokens_details	object	Breakdown of tokens used in a completion.	No

Properties for prompt_tokens_details

cached_tokens

Name	Type	Description	Default
cached_tokens	integer	The number of cached prompt tokens.

Properties for completion_tokens_details

reasoning_tokens

Name	Type	Description	Default
reasoning_tokens	integer	Tokens generated by the model for reasoning.
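
As a small worked example of the usage fields, the sketch below checks that `total_tokens` equals prompt plus completion tokens and derives the number of prompt tokens that weren't served from cache. The helper name and the sample numbers are invented; both detail objects are optional, so the sketch treats them as possibly absent.

```python
def usage_breakdown(usage):
    """Derive a few convenience figures from a completionUsage-shaped dict."""
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    cached = prompt_details.get("cached_tokens", 0)
    return {
        "uncached_prompt_tokens": usage["prompt_tokens"] - cached,
        "reasoning_tokens": completion_details.get("reasoning_tokens", 0),
        "total_is_consistent": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
    }

sample_usage = {
    "prompt_tokens": 1200,
    "completion_tokens": 300,
    "total_tokens": 1500,
    "prompt_tokens_details": {"cached_tokens": 1024},
    "completion_tokens_details": {"reasoning_tokens": 128},
}
breakdown = usage_breakdown(sample_usage)
print(breakdown)
```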

chatCompletionTool

Name	Type	Description	Required	Default
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: function	Yes
function	FunctionObject		Yes

FunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.

No properties defined for this component.

FunctionObject

Name	Type	Description	Required	Default
description	string	A description of what the function does, used by the model to choose when and how to call the function.	No
name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	Yes
parameters	FunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. Omitting `parameters` defines a function with an empty parameter list.	No
strict	boolean	Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the `parameters` field. Only a subset of JSON Schema is supported when `strict` is `true`.	No	False
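
A tool definition and a matching named tool choice might be built as in the sketch below. The regular expression encodes the naming rule from the table above (letters, digits, underscores, and dashes, at most 64 characters). The helper names and the `get_weather` function are hypothetical.

```python
import re

# Naming rule from the FunctionObject table.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")

def make_function_tool(name, description=None, parameters=None, strict=False):
    """Build a chatCompletionTool-shaped dict after checking the name."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name, "strict": strict}
    if description is not None:
        function["description"] = description
    # Omitting parameters defines a function with an empty parameter list.
    if parameters is not None:
        function["parameters"] = parameters
    return {"type": "function", "function": function}

def force_tool(name):
    """Build a chatCompletionNamedToolChoice-shaped dict for `tool_choice`."""
    return {"type": "function", "function": {"name": name}}

tool = make_function_tool(
    "get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
print(tool["function"]["name"], force_tool("get_weather"))
```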

ResponseFormatText

Name	Type	Description	Required	Default
type	enum	The type of response format being defined: `text` Possible values: text	Yes

ResponseFormatJsonObject

Name	Type	Description	Required	Default
type	enum	The type of response format being defined: `json_object` Possible values: json_object	Yes

ResponseFormatJsonSchemaSchema

The schema for the response format, described as a JSON Schema object.

No properties defined for this component.

ResponseFormatJsonSchema

Name	Type	Description	Required	Default
type	enum	The type of response format being defined: `json_schema` Possible values: json_schema	Yes
json_schema	object		Yes

Properties for json_schema

description

Name	Type	Description	Default
description	string	A description of what the response format is for, used by the model to determine how to respond in the format.

name

Name	Type	Description	Default
name	string	The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

schema

Name	Type	Description	Default
schema	ResponseFormatJsonSchemaSchema	The schema for the response format, described as a JSON Schema object.

strict

Name	Type	Description	Default
strict	boolean	Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is `true`.	False
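
Put together, a `response_format` value using `json_schema` could look like the sketch below. The schema name, description, and properties are invented for illustration, and the JSON Schema keywords shown are only an assumption about what the supported subset accepts when `strict` is `true`.

```python
import json

response_format = {
    "type": "json_schema",
    "json_schema": {
        # Hypothetical name; must follow the same naming rule as function names.
        "name": "weather_report",
        "description": "A short structured weather report.",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temperature_c": {"type": "number"},
            },
            "required": ["city", "temperature_c"],
            "additionalProperties": False,
        },
    },
}
print(json.dumps(response_format, indent=2))
```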

chatCompletionChoiceCommon

Name	Type	Description	Required	Default
index	integer		No
finish_reason	string		No

createTranslationRequest

Translation request.

Name	Type	Description	Required	Default
file	string	The audio file to translate.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0

audioResponse

Translation or transcription response when response_format was json

Name	Type	Description	Required	Default
text	string	Translated or transcribed text.	Yes

audioVerboseResponse

Translation or transcription response when response_format was verbose_json

Name	Type	Description	Required
text	string	Translated or transcribed text.	Yes
task	string	Type of audio task.	No
language	string	Language.	No
duration	number	Duration.	No
segments	array		No
words	array		No

audioResponseFormat

Defines the format of the output.

Description: Defines the format of the output.

Type: string

Default:

Enum Values:

json
text
srt
verbose_json
vtt

createTranscriptionRequest

Transcription request.

Name	Type	Description	Required	Default
file	string	The audio file object to transcribe.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0
language	string	The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.	No
timestamp_granularities[]	array	The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`. Note: There's no additional latency for segment timestamps, but generating word timestamps incurs additional latency.	No	['segment']
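
The constraints in the table above can be checked before a request is sent. The following is a minimal sketch, assuming the non-file fields are sent as form fields alongside the audio file; the helper name `build_transcription_fields` is hypothetical, and sending one `timestamp_granularities[]` entry per value is an assumption about the form encoding.

```python
AUDIO_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
TIMESTAMP_GRANULARITIES = {"word", "segment"}


def build_transcription_fields(response_format="json", temperature=0.0,
                               language=None, prompt=None,
                               timestamp_granularities=None):
    """Return the non-file form fields as a list of (name, value) pairs."""
    if response_format not in AUDIO_RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    fields = [("response_format", response_format),
              ("temperature", str(temperature))]
    if language is not None:
        fields.append(("language", language))  # ISO-639-1 code such as "en"
    if prompt is not None:
        fields.append(("prompt", prompt))
    for granularity in timestamp_granularities or []:
        # Timestamp granularities are only valid with verbose_json output.
        if response_format != "verbose_json":
            raise ValueError("timestamp granularities require verbose_json")
        if granularity not in TIMESTAMP_GRANULARITIES:
            raise ValueError(f"unsupported granularity: {granularity!r}")
        # Assumed encoding: the field name is repeated once per value.
        fields.append(("timestamp_granularities[]", granularity))
    return fields


fields = build_transcription_fields(
    "verbose_json", language="en", timestamp_granularities=["word", "segment"]
)
```

The returned pairs could then be passed to an HTTP client together with the audio file part.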

audioSegment

Transcription or translation segment.

Name	Type	Description	Required
id	integer	Segment identifier.	No
seek	number	Offset of the segment.	No
start	number	Segment start offset.	No
end	number	Segment end offset.	No
text	string	Segment text.	No
tokens	array	Tokens of the text.	No
temperature	number	Temperature.	No
avg_logprob	number	Average log probability.	No
compression_ratio	number	Compression ratio.	No
no_speech_prob	number	Probability of 'no speech'.	No

audioWord

Transcription or translation word.

Name	Type	Description	Required
word	string	Word	No
start	number	Word start offset.	No
end	number	Word end offset.	No
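
Because every field except `text` is optional in a `verbose_json` response, client code generally has to tolerate missing keys. The sketch below reads words and segments defensively; the function names, the `0.5` threshold, and the sample payload are all made up for illustration.

```python
def word_timeline(response):
    """Return (word, start, end) tuples from an audioVerboseResponse-shaped dict."""
    return [(w.get("word", ""), w.get("start"), w.get("end"))
            for w in response.get("words") or []]


def likely_speech_segments(response, max_no_speech_prob=0.5):
    """Return the text of segments whose no_speech_prob is at or below the threshold."""
    kept = []
    for segment in response.get("segments") or []:
        # A missing no_speech_prob is treated as 0.0, so the segment is kept.
        if segment.get("no_speech_prob", 0.0) <= max_no_speech_prob:
            kept.append(segment.get("text", ""))
    return kept


# Hand-written sample, not real service output.
sample_response = {
    "text": "Hello world.",
    "task": "transcribe",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": "Hello world.", "no_speech_prob": 0.01},
        {"id": 1, "start": 1.5, "end": 3.0, "text": "", "no_speech_prob": 0.97},
    ],
    "words": [
        {"word": "Hello", "start": 0.0, "end": 0.6},
        {"word": "world", "start": 0.7, "end": 1.4},
    ],
}
```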

createSpeechRequest

Speech request.

Name	Type	Description	Required	Default
input	string	The text to synthesize audio for. The maximum length is 4096 characters.	Yes
voice	enum	The voice to use for speech synthesis. Possible values: alloy, echo, fable, onyx, nova, shimmer	Yes
response_format	enum	The format to synthesize the audio in. Possible values: mp3, opus, aac, flac, wav, pcm	No
speed	number	The speed of the synthesized audio. Select a value from `0.25` to `4.0`. `1.0` is the default.	No	1.0
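
A speech request body can be validated locally against the limits in the table above. This is a minimal sketch; the helper name `build_speech_request` is hypothetical, and leaving `response_format` out of the body when it is unset is a design choice rather than documented behavior.

```python
SPEECH_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
SPEECH_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice, response_format=None, speed=1.0):
    """Return a createSpeechRequest-shaped dict after checking the documented limits."""
    if len(text) > 4096:
        raise ValueError("input is limited to 4096 characters")
    if voice not in SPEECH_VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    payload = {"input": text, "voice": voice, "speed": speed}
    if response_format is not None:
        if response_format not in SPEECH_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        payload["response_format"] = response_format
    return payload


speech_request = build_speech_request("Welcome aboard.", "nova", response_format="mp3")
```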

imageQuality

The quality of the image that will be generated.

Description: The quality of the image that will be generated.

Type: string

Default: standard

Enum Name: Quality

Enum Values:

Value	Description
standard	Standard quality creates images with standard quality.
hd	HD quality creates images with finer details and greater consistency across the image.

imagesResponseFormat

The format in which the generated images are returned.

Description: The format in which the generated images are returned.

Type: string

Default: url

Enum Name: ImagesResponseFormat

Enum Values:

Value	Description
url	The URL that provides temporary access to download the generated images.
b64_json	The generated images are returned as a base64-encoded string.

imageSize

The size of the generated images.

Description: The size of the generated images.

Type: string

Default: 1024x1024

Enum Name: Size

Enum Values:

Value	Description
256x256	The desired size of the generated image is 256x256 pixels. Only supported for dall-e-2.
512x512	The desired size of the generated image is 512x512 pixels. Only supported for dall-e-2.
1792x1024	The desired size of the generated image is 1792x1024 pixels. Only supported for dall-e-3.
1024x1792	The desired size of the generated image is 1024x1792 pixels. Only supported for dall-e-3.
1024x1024	The desired size of the generated image is 1024x1024 pixels.

imageStyle

The style of the generated images.

Description: The style of the generated images.

Type: string

Default: vivid

Enum Name: Style

Enum Values:

Value	Description
vivid	Vivid creates images that are hyper-realistic and dramatic.
natural	Natural creates images that are more natural and less hyper-realistic.

imageGenerationsRequest

Name	Type	Description	Required	Default
prompt	string	A text description of the desired image(s). The maximum length is 4000 characters.	Yes
n	integer	The number of images to generate.	No	1
size	imageSize	The size of the generated images.	No	1024x1024
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
quality	imageQuality	The quality of the image that will be generated.	No	standard
style	imageStyle	The style of the generated images.	No	vivid
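
The size restrictions listed under imageSize depend on the model, so it can help to check them before sending a request. The sketch below is one way to do that; `build_image_request` and its `deployed_model` parameter are hypothetical, and `deployed_model` is used only for the local check and is not added to the request body.

```python
# Sizes per model, taken from the imageSize table; 1024x1024 carries no restriction.
SIZES_BY_MODEL = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1792x1024", "1024x1792", "1024x1024"},
}


def build_image_request(prompt, deployed_model="dall-e-3", n=1, size="1024x1024",
                        quality="standard", style="vivid", response_format="url"):
    """Return an imageGenerationsRequest-shaped dict after local validation."""
    if len(prompt) > 4000:
        raise ValueError("prompt is limited to 4000 characters")
    if deployed_model not in SIZES_BY_MODEL:
        raise ValueError(f"unknown model: {deployed_model!r}")
    if size not in SIZES_BY_MODEL[deployed_model]:
        raise ValueError(f"size {size} is not supported for {deployed_model}")
    if quality not in {"standard", "hd"}:
        raise ValueError(f"unsupported quality: {quality!r}")
    if style not in {"vivid", "natural"}:
        raise ValueError(f"unsupported style: {style!r}")
    if response_format not in {"url", "b64_json"}:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    return {"prompt": prompt, "n": n, "size": size, "quality": quality,
            "style": style, "response_format": response_format}


image_request = build_image_request("A lighthouse at dusk", size="1792x1024", quality="hd")
```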

generateImagesResponse

Name	Type	Description	Required	Default
created	integer	The unix timestamp when the operation was created.	Yes
data	array	The result data of the operation, if successful.	Yes

imageResult

The image url or encoded image if successful, and an error otherwise.

Name	Type	Description	Required
url	string	The image URL.	No
b64_json	string	The base64-encoded image.	No
content_filter_results	dalleContentFilterResults	Information about the content filtering results.	No
revised_prompt	string	The prompt that was used to generate the image, if there was any revision to the prompt.	No
prompt_filter_results	dalleFilterResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer blocklist, if it has been filtered, and its ID.	No
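
Since `url` and `b64_json` are both optional, a client typically checks which one is present. The following sketch shows one way to normalize an imageResult; the function name `image_payload` and the returned tuple shape are hypothetical.

```python
import base64


def image_payload(result):
    """Return ("bytes", data) or ("url", link) for an imageResult-shaped dict."""
    if result.get("b64_json"):
        # Decode the base64 string into raw image bytes.
        return "bytes", base64.b64decode(result["b64_json"])
    if result.get("url"):
        # The URL gives temporary access, so download promptly if needed.
        return "url", result["url"]
    raise ValueError("imageResult contains neither url nor b64_json")
```

For example, `image_payload({"url": "https://example.invalid/image.png"})` returns the URL unchanged, where the address is a placeholder.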

line

A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.

Name	Type	Description	Required	Default
text	string		Yes
spans	array	An array of spans that represent detected objects and their bounding box information.	Yes

span

A span object that represents a detected object and its bounding box information.

Name	Type	Description	Required
text	string	The text content of the span that represents the detected object.	Yes
offset	integer	The character offset within the text where the span begins. This offset is defined as the position of the first character of the span, counting from the start of the text as Unicode codepoints.	Yes
length	integer	The length of the span in characters, measured in Unicode codepoints.	Yes
polygon	array	An array of objects representing points in the polygon that encloses the detected object.	Yes
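
Because `offset` and `length` are counted in Unicode codepoints, they map directly onto Python string indexing, which is also codepoint based. The sketch below assumes the offsets are relative to the string passed in; the function name and sample text are made up.

```python
def span_text(full_text, span):
    """Slice the text covered by a span using its codepoint offset and length."""
    start = span["offset"]
    return full_text[start:start + span["length"]]


# The emoji is a single codepoint, although it takes two UTF-16 code units.
sample_text = "Total \U0001F600 42"
sample_span = {"text": "42", "offset": 8, "length": 2, "polygon": []}
extracted = span_text(sample_text, sample_span)
```

In languages whose strings are indexed by UTF-16 code units, the same offsets would need converting before slicing.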

runCompletionUsage

Usage statistics related to the run. This value will be null if the run isn't in a terminal state (for example, in_progress or queued).

Name	Type	Description	Required
completion_tokens	integer	Number of completion tokens used over the course of the run.	Yes
prompt_tokens	integer	Number of prompt tokens used over the course of the run.	Yes
total_tokens	integer	Total number of tokens used (prompt + completion).	Yes

runStepCompletionUsage

Usage statistics related to the run step. This value will be null while the run step's status is in_progress.

Name	Type	Description	Required
completion_tokens	integer	Number of completion tokens used over the course of the run step.	Yes
prompt_tokens	integer	Number of prompt tokens used over the course of the run step.	Yes
total_tokens	integer	Total number of tokens used (prompt + completion).	Yes

assistantsApiResponseFormatOption

Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

This component can be one of the following:

assistantsApiResponseFormat

An object describing the expected output of the model. If json_object, only function type tools are allowed to be passed to the run. If text, the model can return text or any value needed.

Name	Type	Description	Required	Default
type	string	Must be one of `text` or `json_object`.	No	text

type Enum: AssistantsApiResponseFormat

Value	Description
text
json_object
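
The JSON mode guidance above has two practical consequences: the instructions should mention JSON explicitly, and the output should still be parsed defensively because it can be cut off. A minimal sketch follows; the helper names and the appended sentence are hypothetical, and checking for the word "json" is only a rough heuristic.

```python
import json


def json_mode_settings(instructions):
    """Return instructions plus response_format for JSON mode."""
    if "json" not in instructions.lower():
        # JSON mode expects the model to be told to produce JSON.
        instructions = instructions.rstrip() + " Respond only with a valid JSON object."
    return {"instructions": instructions,
            "response_format": {"type": "json_object"}}


def parse_json_message(content):
    """Return the parsed object, or None if the content is not complete JSON."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Truncated output (for example, after hitting a token limit) lands here.
        return None


settings = json_mode_settings("You are a helpful assistant.")
```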

assistantObject

Represents an assistant that can call the model and use tools.

Name	Type	Description	Required	Default
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `assistant`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the assistant was created.	Yes
name	string	The name of the assistant. The maximum length is 256 characters.	Yes
description	string	The description of the assistant. The maximum length is 512 characters.	Yes
model	string	ID of the model to use. You can use the List models API to see all of your available models.	Yes
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	Yes
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	Yes	[]
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.

object Enum: AssistantObjectType

Value	Description
assistant	The object type, which is always assistant

createAssistantRequest

Name	Type	Description	Required	Default
model			Yes
name	string	The name of the assistant. The maximum length is 256 characters.	No
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	No	[]
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
vector_stores	array	A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.
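
The limits scattered through the createAssistantRequest tables can be gathered into a single local check. This is a sketch under stated assumptions: the helper names are hypothetical, the deployment name and file ID in the usage line are placeholders, and the `model` value is assumed to be a string.

```python
def check_metadata(metadata):
    """Enforce the documented limits: 16 pairs, 64-char keys, 512-char values."""
    if len(metadata) > 16:
        raise ValueError("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64 or len(str(value)) > 512:
            raise ValueError(f"metadata entry {key!r} exceeds the length limits")


def build_create_assistant_request(model, name=None, description=None,
                                   instructions=None, tools=None,
                                   file_ids=None, vector_store_ids=None,
                                   metadata=None):
    """Return a createAssistantRequest-shaped dict, omitting unset optional fields."""
    if name is not None and len(name) > 256:
        raise ValueError("name is limited to 256 characters")
    if description is not None and len(description) > 512:
        raise ValueError("description is limited to 512 characters")
    if tools is not None and len(tools) > 128:
        raise ValueError("at most 128 tools per assistant")
    if file_ids is not None and len(file_ids) > 20:
        raise ValueError("code_interpreter accepts at most 20 files")
    if vector_store_ids is not None and len(vector_store_ids) > 1:
        raise ValueError("at most 1 vector store per assistant")
    if metadata is not None:
        check_metadata(metadata)

    payload = {"model": model}
    optional = (("name", name), ("description", description),
                ("instructions", instructions), ("tools", tools),
                ("metadata", metadata))
    for key, value in optional:
        if value is not None:
            payload[key] = value

    # tool_resources is keyed by tool type, as in the property tables above.
    tool_resources = {}
    if file_ids is not None:
        tool_resources["code_interpreter"] = {"file_ids": list(file_ids)}
    if vector_store_ids is not None:
        tool_resources["file_search"] = {"vector_store_ids": list(vector_store_ids)}
    if tool_resources:
        payload["tool_resources"] = tool_resources
    return payload


assistant_request = build_create_assistant_request(
    "my-deployment", name="Data helper",
    tools=[{"type": "code_interpreter"}], file_ids=["file-placeholder"],
)
```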

modifyAssistantRequest

Name	Type	Description	Required	Default
model			No
name	string	The name of the assistant. The maximum length is 256 characters.	No
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 32768 characters.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	No	[]
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	Overrides the list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.

deleteAssistantResponse

Name	Type	Required
id	string	Yes
deleted	boolean	Yes
object	string	Yes

object Enum: DeleteAssistantResponseState

Value	Description
assistant.deleted

listAssistantsResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

assistantToolsCode

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `code_interpreter`	Yes

type Enum: assistantToolsCodeType

Value	Description
code_interpreter

assistantToolsFileSearch

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `file_search`	Yes
file_search	object	Overrides for the file search tool.	No

Properties for file_search

max_num_results

Name	Type	Description	Default
max_num_results	integer	The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive. Note that the file search tool may output fewer than `max_num_results` results.

type Enum: assistantToolsFileSearchType

Value	Description
file_search

assistantToolsFileSearchTypeOnly

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `file_search`	Yes

type Enum: assistantToolsFileSearchType

Value	Description
file_search

assistantToolsFunction

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `function`	Yes
function	object	The function definition.	Yes

Properties for function

description

Name	Type	Description	Default
description	string	A description of what the function does, used by the model to choose when and how to call the function.

name

Name	Type	Description	Default
name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

parameters

Name	Type	Description	Default
parameters	chatCompletionFunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

type Enum: assistantToolsFunction

Value	Description
function
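
The three tool shapes described above (assistantToolsCode, assistantToolsFileSearch, and assistantToolsFunction) can be built with small helpers. This is an illustrative sketch; the helper names and the `get_weather` function definition are hypothetical.

```python
import re

# Function name rule from the table above: a-z, A-Z, 0-9, underscores, dashes, max 64.
_FUNCTION_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def code_interpreter_tool():
    return {"type": "code_interpreter"}


def file_search_tool(max_num_results=None):
    """Build a file_search tool, optionally overriding max_num_results (1 to 50)."""
    tool = {"type": "file_search"}
    if max_num_results is not None:
        if not 1 <= max_num_results <= 50:
            raise ValueError("max_num_results must be between 1 and 50 inclusive")
        tool["file_search"] = {"max_num_results": max_num_results}
    return tool


def function_tool(name, description, parameters):
    """Build a function tool; parameters is a JSON Schema object."""
    if not _FUNCTION_NAME.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    return {"type": "function",
            "function": {"name": name, "description": description,
                         "parameters": parameters}}


tools = [
    code_interpreter_tool(),
    file_search_tool(max_num_results=10),
    function_tool(
        "get_weather",
        "Look up the current weather for a city.",
        {"type": "object",
         "properties": {"city": {"type": "string"}},
         "required": ["city"]},
    ),
]
```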

truncationObject

Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

Name	Type	Description	Required	Default
type	string	The truncation strategy to use for the thread. The default is `auto`. If set to `last_messages`, the thread will be truncated to the n most recent messages in the thread. When set to `auto`, messages in the middle of the thread will be dropped to fit the context length of the model, `max_prompt_tokens`.	Yes
last_messages	integer	The number of most recent messages from the thread when constructing the context for the run.	No

type Enum: TruncationType

Value	Description
auto
last_messages
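
To make the `last_messages` strategy concrete, the sketch below builds a truncationObject and previews which messages a given `n` would keep. The truncation itself is performed by the service; `preview_last_messages` is only a local illustration, and both helper names are hypothetical.

```python
def truncation_strategy(kind="auto", last_messages=None):
    """Build a truncationObject-shaped dict."""
    if kind == "auto":
        return {"type": "auto"}
    if kind == "last_messages":
        if last_messages is None or last_messages < 1:
            raise ValueError("last_messages must be a positive integer")
        return {"type": "last_messages", "last_messages": last_messages}
    raise ValueError(f"unsupported truncation type: {kind!r}")


def preview_last_messages(messages, n):
    """Return the n most recent messages, mirroring the described behavior locally."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return messages[-n:]


thread_messages = ["m1", "m2", "m3", "m4", "m5"]
kept = preview_last_messages(thread_messages, 2)
```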

assistantsApiToolChoiceOption

Controls which (if any) tool is called by the model. none means the model won't call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

This component can be one of the following:

assistantsNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific tool.

Name	Type	Description	Required	Default
type	string	The type of the tool. If type is `function`, the function name must be set.	Yes
function	object		No

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function to call.

type Enum: AssistantsNamedToolChoiceType

Value	Description
function
code_interpreter
file_search
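
A named tool choice only needs a `function` object when the type is `function`. The following sketch encodes that rule; the helper name `named_tool_choice` is hypothetical.

```python
NAMED_TOOL_TYPES = {"function", "code_interpreter", "file_search"}


def named_tool_choice(tool_type, function_name=None):
    """Build an assistantsNamedToolChoice-shaped dict."""
    if tool_type not in NAMED_TOOL_TYPES:
        raise ValueError(f"unsupported tool type: {tool_type!r}")
    if tool_type == "function":
        # The function name is mandatory when forcing a function call.
        if not function_name:
            raise ValueError("function name must be set when type is 'function'")
        return {"type": "function", "function": {"name": function_name}}
    return {"type": tool_type}
```

For example, `named_tool_choice("function", "my_function")` produces the same object shown in the description of assistantsApiToolChoiceOption.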

runObject

Represents an execution run on a thread.

Name	Type	Description	Required	Default
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.run`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the run was created.	Yes
thread_id	string	The ID of the thread that was executed on as a part of this run.	Yes
assistant_id	string	The ID of the assistant used for execution of this run.	Yes
status	string	The status of the run, which can be one of `queued`, `in_progress`, `requires_action`, `cancelling`, `cancelled`, `failed`, `completed`, or `expired`.	Yes
required_action	object	Details on the action required to continue the run. Will be `null` if no action is required.	Yes
last_error	object	The last error associated with this run. Will be `null` if there are no errors.	Yes
expires_at	integer	The Unix timestamp (in seconds) for when the run will expire.	Yes
started_at	integer	The Unix timestamp (in seconds) for when the run was started.	Yes
cancelled_at	integer	The Unix timestamp (in seconds) for when the run was cancelled.	Yes
failed_at	integer	The Unix timestamp (in seconds) for when the run failed.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the run was completed.	Yes
incomplete_details	object	Details on why the run is incomplete. Will be `null` if the run isn't incomplete.	Yes
model	string	The model that the assistant used for this run.	Yes
instructions	string	The instructions that the assistant used for this run.	Yes
tools	array	The list of tools that the assistant used for this run.	Yes	[]
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
usage	runCompletionUsage	Usage statistics related to the run. This value will be `null` if the run isn't in a terminal state (for example, `in_progress` or `queued`).	Yes
temperature	number	The sampling temperature used for this run. If not set, defaults to 1.	No
top_p	number	The nucleus sampling value used for this run. If not set, defaults to 1.	No
max_prompt_tokens	integer	The maximum number of prompt tokens specified to have been used over the course of the run.	Yes
max_completion_tokens	integer	The maximum number of completion tokens specified to have been used over the course of the run.	Yes
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	Yes
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	Yes
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	Yes

Properties for required_action

type

Name	Type	Description	Default
type	string	For now, this is always `submit_tool_outputs`.

submit_tool_outputs

Name	Type	Description	Default
tool_calls	array	A list of the relevant tool calls.

Properties for last_error

code

Name	Type	Description	Default
code	string	One of `server_error` or `rate_limit_exceeded`.

message

Name	Type	Description	Default
message	string	A human-readable description of the error.

Properties for incomplete_details

reason

Name	Type	Description	Default
reason	string	The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run.

object Enum: runObjectType

Value	Description
thread.run	The run object type which is always thread.run

status Enum: RunObjectStatus

Value	Description
queued	The queued state
in_progress	The in_progress state
requires_action	The requires_action state
cancelling	The cancelling state
cancelled	The cancelled state
failed	The failed state
completed	The completed state
expired	The expired state
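
Client code that polls a run usually needs to decide, from `status`, whether to keep waiting, submit tool outputs, or stop. The sketch below shows one possible mapping. Which statuses count as terminal is an assumption, since the table does not label them, and `incomplete` is included because it is mentioned as an end status in the createRunRequest descriptions; the function names are hypothetical.

```python
# Assumed grouping; the status table above does not say which states are terminal.
TERMINAL_STATUSES = {"cancelled", "failed", "completed", "expired", "incomplete"}


def next_step(run):
    """Map a runObject-shaped dict to 'submit_tool_outputs', 'done', or 'wait'."""
    status = run["status"]
    if status == "requires_action":
        return "submit_tool_outputs"
    if status in TERMINAL_STATUSES:
        return "done"
    return "wait"  # queued, in_progress, cancelling


def pending_tool_calls(run):
    """Return the tool calls a run is waiting on, or an empty list."""
    action = run.get("required_action") or {}
    return (action.get("submit_tool_outputs") or {}).get("tool_calls", [])
```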

createRunRequest

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
model	string	The deployment ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
additional_instructions	string	Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.	No
additional_messages	array	Adds additional messages to the thread before creating the run.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
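
Only `assistant_id` is required in a createRunRequest, so a request builder can accept the remaining fields as optional overrides. The following is a minimal sketch; the helper name and the `asst_example` ID are placeholders.

```python
RUN_OVERRIDES = {
    "model", "instructions", "additional_instructions", "additional_messages",
    "tools", "metadata", "temperature", "top_p", "stream", "max_prompt_tokens",
    "max_completion_tokens", "truncation_strategy", "tool_choice",
    "parallel_tool_calls", "response_format",
}


def build_create_run_request(assistant_id, **overrides):
    """Return a createRunRequest-shaped dict containing only the fields that were set."""
    unknown = set(overrides) - RUN_OVERRIDES
    if unknown:
        raise ValueError(f"unknown createRunRequest fields: {sorted(unknown)}")
    temperature = overrides.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    payload = {"assistant_id": assistant_id}
    # Leaving a field out lets the assistant's own setting apply.
    payload.update({k: v for k, v in overrides.items() if v is not None})
    return payload


run_request = build_create_run_request(
    "asst_example",
    additional_instructions="Answer in one sentence.",
    max_completion_tokens=200,
    truncation_strategy={"type": "last_messages", "last_messages": 10},
)
```

Using `additional_instructions` rather than `instructions` keeps the assistant's system message intact for this run.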

listRunsResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

modifyRunRequest

Name	Type	Description	Required	Default
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

submitToolOutputsRunRequest

Name	Type	Description	Required	Default
tool_outputs	array	A list of tools for which the outputs are being submitted.	Yes
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No

runToolCallObject

Tool call objects

Name	Type	Description	Required
id	string	The ID of the tool call. This ID must be referenced when you submit the tool outputs using the Submit tool outputs to run endpoint.	Yes
type	string	The type of tool call the output is required for. For now, this is always `function`.	Yes
function	object	The function definition.	Yes

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function.

arguments

Name	Type	Description	Default
arguments	string	The arguments that the model expects you to pass to the function.

type Enum: RunToolCallObjectType

Value	Description
function
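
A client typically turns each tool call object into one entry of `tool_outputs`. The sketch below is a rough Python illustration: `get_weather` and `LOCAL_FUNCTIONS` are hypothetical, the `arguments` string is assumed to hold a JSON object, and the `tool_call_id` and `output` field names are assumed from the Submit tool outputs request rather than taken from this table.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local function the assistant is allowed to call."""
    return f"Sunny in {city}"


LOCAL_FUNCTIONS = {"get_weather": get_weather}


def build_tool_outputs(tool_calls: list) -> list:
    """Run each requested function and pair its result with the tool call ID."""
    outputs = []
    for call in tool_calls:
        if call["type"] != "function":
            continue  # The schema says the type is always `function` for now.
        func = LOCAL_FUNCTIONS[call["function"]["name"]]
        # `arguments` is a string; this sketch assumes it contains a JSON object.
        kwargs = json.loads(call["function"]["arguments"])
        outputs.append({"tool_call_id": call["id"], "output": str(func(**kwargs))})
    return outputs


tool_calls = [{
    "id": "call_abc",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Oslo\"}"},
}]
request_body = {"tool_outputs": build_tool_outputs(tool_calls)}
```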

createThreadAndRunRequest

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
thread	createThreadRequest		No
model	string	The deployment name of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
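
As an illustration of how these fields fit together, the Python sketch below assembles a `createThreadAndRunRequest` body and enforces the limits stated in the tables (temperature between 0 and 2, at most 20 file IDs, at most 1 vector store). `build_thread_and_run_request` is a hypothetical helper and the IDs are placeholders.

```python
def build_thread_and_run_request(assistant_id, user_message, file_ids=(),
                                 vector_store_ids=(), temperature=1.0):
    """Assemble a createThreadAndRunRequest body, enforcing the documented limits."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if len(file_ids) > 20:
        raise ValueError("code_interpreter accepts at most 20 file IDs")
    if len(vector_store_ids) > 1:
        raise ValueError("file_search accepts at most 1 vector store ID")

    body = {
        "assistant_id": assistant_id,
        # The thread is created from a single user message in this sketch.
        "thread": {"messages": [{"role": "user", "content": user_message}]},
        "temperature": temperature,
    }
    tool_resources = {}
    if file_ids:
        tool_resources["code_interpreter"] = {"file_ids": list(file_ids)}
    if vector_store_ids:
        tool_resources["file_search"] = {"vector_store_ids": list(vector_store_ids)}
    if tool_resources:
        body["tool_resources"] = tool_resources
    return body


body = build_thread_and_run_request("asst_123", "Summarize the report.",
                                    vector_store_ids=["vs_1"], temperature=0.2)
```

Optional fields are left out of the body when unset, so the service applies its own defaults.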

threadObject

Represents a thread that contains messages.

Name	Type	Description	Required
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the thread was created.	Yes
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.

object Enum: ThreadObjectType

Value	Description
thread	The type of thread object which is always `thread`

createThreadRequest

Name	Type	Description	Required
messages	array	A list of messages to start the thread with.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.
vector_stores	array	A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread.

modifyThreadRequest

Name	Type	Description	Required	Default
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Properties for tool_resources

code_interpreter

Name	Type	Description	Default
file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	[]

file_search

Name	Type	Description	Default
vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.

deleteThreadResponse

Name	Type	Required
id	string	Yes
deleted	boolean	Yes
object	string	Yes

object Enum: DeleteThreadResponseObjectState

Value	Description
thread.deleted	The delete thread response object state which is `thread.deleted`

listThreadsResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

messageObject

Represents a message within a thread.

Name	Type	Description	Required
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.message`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the message was created.	Yes
thread_id	string	The ID of the thread that this message belongs to.	Yes
status	string	The status of the message, which can be either `in_progress`, `incomplete`, or `completed`.	Yes
incomplete_details	object	On an incomplete message, details about why the message is incomplete.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the message was completed.	Yes
incomplete_at	integer	The Unix timestamp (in seconds) for when the message was marked as incomplete.	Yes
role	string	The entity that produced the message. One of `user` or `assistant`.	Yes
content	array	The content of the message, as an array of text and/or image parts.	Yes
assistant_id	string	If applicable, the ID of the assistant that authored this message.	Yes
run_id	string	If applicable, the ID of the run associated with the authoring of this message.	Yes
attachments	array	A list of files attached to the message, and the tools they were added to.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes

Properties for incomplete_details

reason

Name	Type	Description	Default
reason	string	The reason the message is incomplete.

object Enum: MessageObjectType

Value	Description
thread.message	The message object type which is `thread.message`

status Enum: MessageObjectStatus

Value	Description
in_progress
incomplete
completed

role Enum: MessageObjectRole

Value	Description
user
assistant

messageDeltaObject

Represents a message delta, that is, any fields on a message that changed during streaming.

Name	Type	Description	Required
id	string	The identifier of the message, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.message.delta`.	Yes
delta	object	The delta containing the fields that have changed on the Message.	Yes

Properties for delta

role

Name	Type	Description	Default
role	string	The entity that produced the message. One of `user` or `assistant`.

content

Name	Type	Description	Default
content	array	The content of the message, as an array of text and/or image parts.

object Enum: MessageDeltaObjectType

Value	Description
thread.message.delta

createMessageRequest

Name	Type	Description	Required
role	string	The role of the entity that is creating the message. Allowed values: `user`, which indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages; and `assistant`, which indicates the message is generated by the assistant and is used to insert messages from the assistant into the conversation.	Yes
content	string	The content of the message.	Yes
attachments	array	A list of files attached to the message, and the tools they should be added to.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

role Enum: CreateMessageRequestRole

Value	Description
user
assistant

modifyMessageRequest

Name	Type	Description	Required	Default
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

deleteMessageResponse

Name	Type	Required
id	string	Yes
deleted	boolean	Yes
object	string	Yes

object Enum: DeleteMessageResponseObject

Value	Description
thread.message.deleted	The delete message response object state

listMessagesResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

messageContentImageFileObject

References an image File in the content of a message.

Name	Type	Description	Required	Default
type	string	Always `image_file`.	Yes
image_file	object		Yes

Properties for image_file

file_id

Name	Type	Description	Default
file_id	string	The File ID of the image in the message content.

type Enum: MessageContentImageFileObjectType

Value	Description
image_file	The message content image file type

messageContentTextObject

The text content that is part of a message.

Name	Type	Description	Required	Default
type	string	Always `text`.	Yes
text	object		Yes

Properties for text

value

Name	Type	Description	Default
value	string	The data that makes up the text.

annotations

Name	Type	Description	Default
annotations	array

type Enum: messageContentTextObjectType

Value	Description
text	The message content text object type
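
A message's `content` array can mix `text` and `image_file` parts, so reading the text means filtering by `type`. The following is a minimal Python sketch; `message_text` is a hypothetical helper and the sample message is made up.

```python
def message_text(message: dict) -> str:
    """Join the `text.value` of every text part in a messageObject's content array."""
    parts = []
    for part in message["content"]:
        if part["type"] == "text":
            parts.append(part["text"]["value"])
        # `image_file` parts carry a file ID rather than text, so they are skipped.
    return "\n".join(parts)


sample_message = {
    "id": "msg_1",
    "object": "thread.message",
    "role": "assistant",
    "content": [
        {"type": "text", "text": {"value": "Here is the chart.", "annotations": []}},
        {"type": "image_file", "image_file": {"file_id": "file_1"}},
        {"type": "text", "text": {"value": "Sales rose in Q3.", "annotations": []}},
    ],
}
text = message_text(sample_message)
```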

messageContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific file associated with the assistant or the message. Generated when the assistant uses the `file_search` tool to search files.

Name	Type	Description	Required
type	string	Always `file_citation`.	Yes
text	string	The text in the message content that needs to be replaced.	Yes
file_citation	object		Yes
start_index	integer		Yes
end_index	integer		Yes

Properties for file_citation

file_id

Name	Type	Description	Default
file_id	string	The ID of the specific File the citation is from.

type Enum: FileCitationObjectType

Value	Description
file_citation	The file citation object type

messageContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name	Type	Description	Required
type	string	Always `file_path`.	Yes
text	string	The text in the message content that needs to be replaced.	Yes
file_path	object		Yes
start_index	integer		Yes
end_index	integer		Yes

Properties for file_path

file_id

Name	Type	Description	Default
file_id	string	The ID of the file that was generated.

type Enum: FilePathObjectType

Value	Description
file_path	The file path object type
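
Both annotation types mark a span of the message text "that needs to be replaced". One way to do that is sketched below in Python, assuming `start_index` and `end_index` are character offsets into `text.value` with an exclusive end; the numbered-footnote rendering is just one possible presentation.

```python
def apply_annotations(text: dict) -> str:
    """Replace each annotated span with a numbered marker and append a footnote list."""
    value = text["value"]
    ordered = sorted(text["annotations"], key=lambda a: a["start_index"])
    notes = []
    # Number the annotations in reading order.
    for number, ann in enumerate(ordered, start=1):
        details = ann[ann["type"]]  # The `file_citation` or `file_path` object.
        notes.append(f"[{number}] {ann['type']}: {details['file_id']}")
    # Replace from the end of the string so earlier offsets stay valid.
    for number, ann in reversed(list(enumerate(ordered, start=1))):
        value = value[:ann["start_index"]] + f"[{number}]" + value[ann["end_index"]:]
    return "\n".join([value] + notes)


# Made-up text part; offsets are computed so the sample stays self-consistent.
value = "Revenue grew 12% <cite-0>. Download the chart at sandbox:/chart.png"
cite, path = "<cite-0>", "sandbox:/chart.png"
text_part = {
    "value": value,
    "annotations": [
        {"type": "file_citation", "text": cite, "file_citation": {"file_id": "file_1"},
         "start_index": value.index(cite), "end_index": value.index(cite) + len(cite)},
        {"type": "file_path", "text": path, "file_path": {"file_id": "file_2"},
         "start_index": value.index(path), "end_index": value.index(path) + len(path)},
    ],
}
rendered = apply_annotations(text_part)
```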

messageDeltaContentImageFileObject

References an image File in the content of a message.

Name	Type	Description	Required
index	integer	The index of the content part in the message.	Yes
type	string	Always `image_file`.	Yes
image_file	object		No

Properties for image_file

file_id

Name	Type	Description	Default
file_id	string	The File ID of the image in the message content.

type Enum: MessageDeltaContentImageFileObjectType

Value	Description
image_file

messageDeltaContentTextObject

The text content that is part of a message.

Name	Type	Description	Required
index	integer	The index of the content part in the message.	Yes
type	string	Always `text`.	Yes
text	object		No

Properties for text

value

Name	Type	Description	Default
value	string	The data that makes up the text.

annotations

Name	Type	Description	Default
annotations	array

type Enum: MessageDeltaContentTextObjectType

Value	Description
text
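
During streaming, a client rebuilds the message by merging deltas. The Python sketch below keeps one string per content `index`; it assumes each delta's `text.value` is a fragment to append to what has already arrived, which this schema does not state explicitly.

```python
def apply_message_delta(parts: dict, delta_event: dict) -> None:
    """Merge one messageDeltaObject into `parts`, a dict mapping content index to text."""
    for part in delta_event["delta"].get("content", []):
        if part["type"] != "text":
            continue
        # `text` is optional on a delta, so fall back to an empty fragment.
        fragment = part.get("text", {}).get("value", "")
        parts[part["index"]] = parts.get(part["index"], "") + fragment


# Made-up stream of deltas for a single message.
deltas = [
    {"id": "msg_1", "object": "thread.message.delta", "delta": {"role": "assistant"}},
    {"id": "msg_1", "object": "thread.message.delta",
     "delta": {"content": [{"index": 0, "type": "text", "text": {"value": "Hel"}}]}},
    {"id": "msg_1", "object": "thread.message.delta",
     "delta": {"content": [{"index": 0, "type": "text", "text": {"value": "lo"}}]}},
]
parts = {}
for event in deltas:
    apply_message_delta(parts, event)
```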

messageDeltaContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name	Type	Description	Required
index	integer	The index of the annotation in the text content part.	Yes
type	string	Always `file_citation`.	Yes
text	string	The text in the message content that needs to be replaced.	No
file_citation	object		No
start_index	integer		No
end_index	integer		No

Properties for file_citation

file_id

Name	Type	Description	Default
file_id	string	The ID of the specific File the citation is from.

quote

Name	Type	Description	Default
quote	string	The specific quote in the file.

type Enum: MessageDeltaContentTextAnnotationsFileCitationObjectType

Value	Description
file_citation

messageDeltaContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name	Type	Description	Required
index	integer	The index of the annotation in the text content part.	Yes
type	string	Always `file_path`.	Yes
text	string	The text in the message content that needs to be replaced.	No
file_path	object		No
start_index	integer		No
end_index	integer		No

Properties for file_path

file_id

Name	Type	Description	Default
file_id	string	The ID of the file that was generated.

type Enum: MessageDeltaContentTextAnnotationsFilePathObjectType

Value	Description
file_path

runStepObject

Represents a step in execution of a run.

Name	Type	Description	Required
id	string	The identifier of the run step, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `assistant.run.step`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the run step was created.	Yes
assistant_id	string	The ID of the assistant associated with the run step.	Yes
thread_id	string	The ID of the thread that was run.	Yes
run_id	string	The ID of the run that this run step is a part of.	Yes
type	string	The type of run step, which can be either `message_creation` or `tool_calls`.	Yes
status	string	The status of the run step, which can be either `in_progress`, `cancelled`, `failed`, `completed`, or `expired`.	Yes
step_details	runStepDetailsMessageCreationObject or runStepDetailsToolCallsObject	The details of the run step.	Yes
last_error	object	The last error associated with this run step. Will be `null` if there are no errors.	Yes
expired_at	integer	The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired.	Yes
cancelled_at	integer	The Unix timestamp (in seconds) for when the run step was cancelled.	Yes
failed_at	integer	The Unix timestamp (in seconds) for when the run step failed.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the run step completed.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes

Properties for last_error

code

Name	Type	Description	Default
code	string	One of `server_error` or `rate_limit_exceeded`.

message

Name	Type	Description	Default
message	string	A human-readable description of the error.

object Enum: RunStepObjectType

Value	Description
assistant.run.step	The object type, which is always `assistant.run.step`

type Enum: RunStepObjectType

Value	Description
message_creation	The message_creation run step
tool_calls	The tool_calls run step

status Enum: RunStepObjectStatus

Value	Description
in_progress	The in_progress run status
cancelled	The cancelled run status
failed	The failed run status
completed	The completed run status
expired	The expired run status
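
Each terminal status has a matching timestamp field on the run step. The Python sketch below pairs them up to produce a one-line summary; `summarize_run_step` is a hypothetical helper and the sample step is made up.

```python
# Terminal statuses and the timestamp field that records when each was reached.
TERMINAL_TIMESTAMP_FIELD = {
    "cancelled": "cancelled_at",
    "failed": "failed_at",
    "completed": "completed_at",
    "expired": "expired_at",
}


def summarize_run_step(step: dict) -> str:
    """Describe a runStepObject using its status and the matching timestamp."""
    status = step["status"]
    field = TERMINAL_TIMESTAMP_FIELD.get(status)
    if field is None:
        # `in_progress` has no terminal timestamp, so report the creation time.
        return f"{step['id']}: {status} (created {step['created_at']})"
    summary = f"{step['id']}: {status} at {step[field]}"
    if status == "failed" and step.get("last_error"):
        error = step["last_error"]
        summary += f" ({error['code']}: {error['message']})"
    return summary


failed_step = {
    "id": "step_1", "status": "failed", "created_at": 1700000000, "failed_at": 1700000030,
    "last_error": {"code": "rate_limit_exceeded", "message": "Too many requests"},
}
summary = summarize_run_step(failed_step)
```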

runStepDeltaObject

Represents a run step delta, that is, any fields on a run step that changed during streaming.

Name	Type	Description	Required
id	string	The identifier of the run step, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.run.step.delta`.	Yes
delta	object	The delta containing the fields that have changed on the run step.	Yes

Properties for delta

step_details

Name	Type	Description	Default
step_details	runStepDeltaStepDetailsMessageCreationObject or runStepDeltaStepDetailsToolCallsObject	The details of the run step.

object Enum: RunStepDeltaObjectType

Value	Description
thread.run.step.delta

listRunStepsResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

runStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name	Type	Description	Required	Default
type	string	Always `message_creation`.	Yes
message_creation	object		Yes

Properties for message_creation

message_id

Name	Type	Description	Default
message_id	string	The ID of the message that was created by this run step.

type Enum: RunStepDetailsMessageCreationObjectType

Value	Description
message_creation

runStepDeltaStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name	Type	Description	Required	Default
type	string	Always `message_creation`.	Yes
message_creation	object		No

Properties for message_creation

message_id

Name	Type	Description	Default
message_id	string	The ID of the message that was created by this run step.

type Enum: RunStepDeltaStepDetailsMessageCreationObjectType

Value	Description
message_creation

runStepDetailsToolCallsObject

Details of the tool call.

Name	Type	Description	Required	Default
type	string	Always `tool_calls`.	Yes
tool_calls	array	An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `file_search`, or `function`.	Yes

type Enum: RunStepDetailsToolCallsObjectType

Value	Description
tool_calls

runStepDeltaStepDetailsToolCallsObject

Details of the tool call.

Name	Type	Description	Required	Default
type	string	Always `tool_calls`.	Yes
tool_calls	array	An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `file_search` or `function`.	No

type Enum: RunStepDeltaStepDetailsToolCallsObjectType

Value	Description
tool_calls

runStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name	Type	Description	Required
id	string	The ID of the tool call.	Yes
type	string	The type of tool call. This is always going to be `code_interpreter` for this type of tool call.	Yes
code_interpreter	object	The Code Interpreter tool call definition.	Yes

Properties for code_interpreter

input

Name	Type	Description	Default
input	string	The input to the Code Interpreter tool call.

outputs

Name	Type	Description	Default
outputs	array	The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type.

type Enum: RunStepDetailsToolCallsCodeObjectType

Value	Description
code_interpreter

runStepDeltaStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name	Type	Description	Required
index	integer	The index of the tool call in the tool calls array.	Yes
id	string	The ID of the tool call.	No
type	string	The type of tool call. This is always going to be `code_interpreter` for this type of tool call.	Yes
code_interpreter	object	The Code Interpreter tool call definition.	No

Properties for code_interpreter

input

Name	Type	Description	Default
input	string	The input to the Code Interpreter tool call.

outputs

Name	Type	Description	Default
outputs	array	The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type.

type Enum: RunStepDeltaStepDetailsToolCallsCodeObjectType

Value	Description
code_interpreter

runStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name	Type	Description	Required	Default
type	string	Always `logs`.	Yes
logs	string	The text output from the Code Interpreter tool call.	Yes

type Enum: RunStepDetailsToolCallsCodeOutputLogsObjectType

Value	Description
logs

runStepDeltaStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name	Type	Description	Required
index	integer	The index of the output in the outputs array.	Yes
type	string	Always `logs`.	Yes
logs	string	The text output from the Code Interpreter tool call.	No

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputLogsObjectType

Value	Description
logs

runStepDetailsToolCallsCodeOutputImageObject

Name	Type	Description	Required	Default
type	string	Always `image`.	Yes
image	object		Yes

Properties for image

file_id

Name	Type	Description	Default
file_id	string	The File ID of the image.

type Enum: RunStepDetailsToolCallsCodeOutputImageObjectType

Value	Description
image
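
Because a Code Interpreter tool call can return both `logs` and `image` outputs, a client usually separates them by `type`. The following is a minimal Python sketch with a made-up tool call; `split_code_outputs` is a hypothetical helper.

```python
def split_code_outputs(tool_call: dict):
    """Return (logs, image_file_ids) from a Code Interpreter tool call object."""
    logs, image_file_ids = [], []
    for output in tool_call["code_interpreter"]["outputs"]:
        if output["type"] == "logs":
            logs.append(output["logs"])
        elif output["type"] == "image":
            image_file_ids.append(output["image"]["file_id"])
    return logs, image_file_ids


sample_call = {
    "id": "call_1",
    "type": "code_interpreter",
    "code_interpreter": {
        "input": "print(1 + 1)",
        "outputs": [
            {"type": "logs", "logs": "2"},
            {"type": "image", "image": {"file_id": "file_9"}},
        ],
    },
}
logs, image_file_ids = split_code_outputs(sample_call)
```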

runStepDeltaStepDetailsToolCallsCodeOutputImageObject

Name	Type	Description	Required
index	integer	The index of the output in the outputs array.	Yes
type	string	Always `image`.	Yes
image	object		No

Properties for image

file_id

Name	Type	Description	Default
file_id	string	The file ID of the image.

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputImageObject

Value	Description
image

runStepDetailsToolCallsFileSearchObject

Name	Type	Description	Required
id	string	The ID of the tool call object.	Yes
type	string	The type of tool call. This is always going to be `file_search` for this type of tool call.	Yes
file_search	object	For now, this is always going to be an empty object.	Yes

type Enum: RunStepDetailsToolCallsFileSearchObjectType

Value	Description
file_search

runStepDeltaStepDetailsToolCallsFileSearchObject

Name	Type	Description	Required
index	integer	The index of the tool call in the tool calls array.	Yes
id	string	The ID of the tool call object.	No
type	string	The type of tool call. This is always going to be `file_search` for this type of tool call.	Yes
file_search	object	For now, this is always going to be an empty object.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsFileSearchObjectType

Value	Description
file_search

runStepDetailsToolCallsFunctionObject

Name	Type	Description	Required
id	string	The ID of the tool call object.	Yes
type	string	The type of tool call. This is always going to be `function` for this type of tool call.	Yes
function	object	The definition of the function that was called.	Yes

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function.

arguments

Name	Type	Description	Default
arguments	string	The arguments passed to the function.

output

Name	Type	Description	Default
output	string	The output of the function. This will be `null` if the outputs have not been submitted yet.

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value	Description
function

runStepDeltaStepDetailsToolCallsFunctionObject

Name	Type	Description	Required
index	integer	The index of the tool call in the tool calls array.	Yes
id	string	The ID of the tool call object.	No
type	string	The type of tool call. This is always going to be `function` for this type of tool call.	Yes
function	object	The definition of the function that was called.	No

Properties for function

name

Name	Type	Description	Default
name	string	The name of the function.

arguments

Name	Type	Description	Default
arguments	string	The arguments passed to the function.

output

Name	Type	Description	Default
output	string	The output of the function. This will be `null` if the outputs have not been submitted yet.

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value	Description
function

vectorStoreExpirationAfter

The expiration policy for a vector store.

Name	Type	Description	Required	Default
anchor	string	Anchor timestamp after which the expiration policy applies. Supported anchors: `last_active_at`.	Yes
days	integer	The number of days after the anchor time that the vector store will expire.	Yes

anchor Enum: VectorStoreExpirationAfterAnchor

Value	Description
last_active_at	The anchor timestamp after which the expiration policy applies.
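
To make the policy concrete, the Python sketch below estimates when a vector store would expire. It assumes the expiry is simply the `last_active_at` timestamp plus `days` whole days; the authoritative value is whatever the service returns in `expires_at`.

```python
from datetime import datetime, timedelta, timezone


def estimated_expiry(last_active_at: int, expires_after: dict) -> datetime:
    """Estimate the expiry time from a Unix timestamp and a vectorStoreExpirationAfter policy."""
    if expires_after["anchor"] != "last_active_at":
        raise ValueError(f"unsupported anchor: {expires_after['anchor']}")
    anchor_time = datetime.fromtimestamp(last_active_at, tz=timezone.utc)
    return anchor_time + timedelta(days=expires_after["days"])


policy = {"anchor": "last_active_at", "days": 7}
expiry = estimated_expiry(1700000000, policy)
```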

vectorStoreObject

A vector store is a collection of processed files that can be used by the `file_search` tool.

Name	Type	Description	Required
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	enum	The object type, which is always `vector_store`. Possible values: vector_store	Yes
created_at	integer	The Unix timestamp (in seconds) for when the vector store was created.	Yes
name	string	The name of the vector store.	Yes
usage_bytes	integer	The total number of bytes used by the files in the vector store.	Yes
file_counts	object		Yes
status	string	The status of the vector store, which can be either `expired`, `in_progress`, or `completed`. A status of `completed` indicates that the vector store is ready for use.	Yes
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
expires_at	integer	The Unix timestamp (in seconds) for when the vector store will expire.	No
last_active_at	integer	The Unix timestamp (in seconds) for when the vector store was last active.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes

Properties for file_counts

in_progress

Name	Type	Description	Default
in_progress	integer	The number of files that are currently being processed.

completed

Name	Type	Description	Default
completed	integer	The number of files that have been successfully processed.

failed

Name	Type	Description	Default
failed	integer	The number of files that have failed to process.

cancelled

Name	Type	Description	Default
cancelled	integer	The number of files that were cancelled.

total

Name	Type	Description	Default
total	integer	The total number of files.

status Enum: VectorStoreObjectStatus

Value	Description
expired
in_progress
completed
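
The `status` and `file_counts` fields together show whether a vector store is ready and how far processing has progressed. The Python sketch below turns them into a short progress string; `vector_store_progress` is a hypothetical helper and the sample object is made up.

```python
def vector_store_progress(store: dict) -> str:
    """Summarize readiness of a vectorStoreObject from its status and file_counts."""
    if store["status"] == "completed":
        return "ready"
    if store["status"] == "expired":
        return "expired"
    counts = store["file_counts"]
    # Files that are no longer being processed, whatever the outcome.
    finished = counts["completed"] + counts["failed"] + counts["cancelled"]
    return (f"processing: {finished}/{counts['total']} files finished, "
            f"{counts['in_progress']} in progress")


sample_store = {
    "id": "vs_1",
    "object": "vector_store",
    "status": "in_progress",
    "file_counts": {"in_progress": 2, "completed": 3, "failed": 1, "cancelled": 0, "total": 6},
}
progress = vector_store_progress(sample_store)
```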

createVectorStoreRequest

Name	Type	Description	Required
file_ids	array	A list of file IDs that the vector store should use. Useful for tools like `file_search` that can access files.	No
name	string	The name of the vector store.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is nonempty.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

updateVectorStoreRequest

Name	Type	Description	Required
name	string	The name of the vector store.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

listVectorStoresResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes
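The `has_more` and `last_id` fields lend themselves to cursor-style paging. The Python sketch below shows one way to walk all pages. `fetch_page` is a hypothetical caller-supplied function that performs the HTTP request and returns one parsed list response. Passing `last_id` back as the cursor for the next request is an assumption about the list operation, which is described elsewhere in this reference. The canned pages are illustrative data only.

```python
from typing import Callable, Iterator, Optional


def iter_all_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across the pages of a list response.

    fetch_page(after) is a hypothetical function returning one parsed
    list response; `after` is None for the first page.
    """
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page["has_more"]:
            break
        after = page["last_id"]  # assumed cursor for the next request


# Stand-in for real HTTP calls: two canned pages keyed by cursor.
_PAGES = {
    None: {"object": "list", "data": [{"id": "vs_1"}, {"id": "vs_2"}],
           "first_id": "vs_1", "last_id": "vs_2", "has_more": True},
    "vs_2": {"object": "list", "data": [{"id": "vs_3"}],
             "first_id": "vs_3", "last_id": "vs_3", "has_more": False},
}

ids = [item["id"] for item in iter_all_items(lambda after: _PAGES[after])]
print(ids)
```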

deleteVectorStoreResponse

Name	Type	Required
id	string	Yes
deleted	boolean	Yes
object	string	Yes

object Enum: DeleteVectorStoreResponseObject

Value	Description
vector_store.deleted	The delete vector store response object state

vectorStoreFileObject

Represents a file attached to a vector store.

Name	Type	Description	Required
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `vector_store.file`.	Yes
usage_bytes	integer	The total vector store usage in bytes. Note that this may be different from the original file size.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the vector store file was created.	Yes
vector_store_id	string	The ID of the vector store that the file is attached to.	Yes
status	string	The status of the vector store file, which can be one of `in_progress`, `completed`, `cancelled`, or `failed`. The status `completed` indicates that the vector store file is ready for use.	Yes
last_error	object	The last error associated with this vector store file. Will be `null` if there are no errors.	Yes
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is nonempty.	No

Properties for last_error

code

Name	Type	Description	Default
code	string	One of `server_error` or `rate_limit_exceeded`.

message

Name	Type	Description	Default
message	string	A human-readable description of the error.

object Enum: VectorStoreFileObjectType

Value	Description
vector_store.file

status Enum: VectorStoreFileObjectStatus

Value	Description
in_progress
completed
cancelled
failed
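A client typically inspects `status` and, on failure, `last_error`. The Python sketch below is one possible way to turn a vectorStoreFileObject into a short readiness summary. The function name `file_readiness` and the sample object are hypothetical.

```python
def file_readiness(file_obj: dict) -> str:
    """Summarize whether a vector store file can be used (hypothetical helper)."""
    status = file_obj["status"]
    if status == "completed":
        return "ready"
    if status == "failed":
        # last_error is null when there are no errors, so guard against None.
        err = file_obj.get("last_error") or {}
        return f"failed: {err.get('code', 'unknown')}: {err.get('message', '')}"
    return status  # in_progress or cancelled


sample_file = {
    "id": "file-abc123",  # placeholder ID
    "object": "vector_store.file",
    "status": "failed",
    "last_error": {"code": "server_error", "message": "Processing failed."},
}
print(file_readiness(sample_file))
```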

otherChunkingStrategyResponseParam

This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the chunking_strategy concept was introduced in the API.

Name	Type	Description	Required	Default
type	string	Always `other`.	Yes

type Enum: OtherChunkingStrategyResponseParamType

Value	Description
other

staticChunkingStrategyResponseParam

Name	Type	Description	Required	Default
type	string	Always `static`.	Yes
static	staticChunkingStrategy		Yes

type Enum: StaticChunkingStrategyResponseParamType

Value	Description
static

staticChunkingStrategy

Name	Type	Description	Required	Default
max_chunk_size_tokens	integer	The maximum number of tokens in each chunk. The default value is `800`. The minimum value is `100` and the maximum value is `4096`.	Yes
chunk_overlap_tokens	integer	The number of tokens that overlap between chunks. The default value is `400`. Note that the overlap must not exceed half of `max_chunk_size_tokens`.	Yes
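The limits above can be checked before a request is sent. The Python sketch below builds a staticChunkingStrategyRequestParam (a `type` of `static` plus the `static` object) and applies the documented bounds. The function name `make_static_chunking_strategy` is hypothetical, and rejecting negative overlaps is an assumption that the table does not state.

```python
def make_static_chunking_strategy(
    max_chunk_size_tokens: int = 800,
    chunk_overlap_tokens: int = 400,
) -> dict:
    """Build a static chunking strategy after checking the documented limits."""
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens < 0:  # assumption: overlap cannot be negative
        raise ValueError("chunk_overlap_tokens must not be negative")
    if chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError(
            "chunk_overlap_tokens must not exceed half of max_chunk_size_tokens"
        )
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }


strategy = make_static_chunking_strategy(1000, 500)
print(strategy)
```

Note that the default overlap of 400 is exactly half of the default chunk size of 800, so lowering `max_chunk_size_tokens` alone requires lowering `chunk_overlap_tokens` as well.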

autoChunkingStrategyRequestParam

The default strategy. This strategy currently uses a max_chunk_size_tokens of 800 and chunk_overlap_tokens of 400.

Name	Type	Description	Required	Default
type	enum	Always `auto`. Possible values: auto	Yes

staticChunkingStrategyRequestParam

Name	Type	Description	Required	Default
type	enum	Always `static`. Possible values: static	Yes
static	staticChunkingStrategy		Yes

chunkingStrategyRequestParam

The chunking strategy used to chunk the file(s). If not set, will use the auto strategy.

This component can be one of the following:

autoChunkingStrategyRequestParam
staticChunkingStrategyRequestParam

createVectorStoreFileRequest

Name	Type	Description	Required	Default
file_id	string	A File ID that the vector store should use. Useful for tools like `file_search` that can access files.	Yes
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No

listVectorStoreFilesResponse

Name	Type	Required
object	string	Yes
data	array	Yes
first_id	string	Yes
last_id	string	Yes
has_more	boolean	Yes

deleteVectorStoreFileResponse

Name	Type	Required
id	string	Yes
deleted	boolean	Yes
object	string	Yes

object Enum: DeleteVectorStoreFileResponseObject

Value	Description
vector_store.file.deleted

vectorStoreFileBatchObject

A batch of files attached to a vector store.

Name	Type	Description	Required
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `vector_store.files_batch`.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the vector store files batch was created.	Yes
vector_store_id	string	The ID of the vector store that the files are attached to.	Yes
status	string	The status of the vector store files batch, which can be one of `in_progress`, `completed`, `cancelled`, or `failed`.	Yes
file_counts	object		Yes

Properties for file_counts

in_progress

Name	Type	Description	Default
in_progress	integer	The number of files that are currently being processed.

completed

Name	Type	Description	Default
completed	integer	The number of files that have been processed.

failed

Name	Type	Description	Default
failed	integer	The number of files that have failed to process.

cancelled

Name	Type	Description	Default
cancelled	integer	The number of files that were cancelled.

total

Name	Type	Description	Default
total	integer	The total number of files.

object Enum: VectorStoreFileBatchObjectType

Value	Description
vector_store.files_batch

status Enum: VectorStoreFileBatchObjectStatus

Value	Description
in_progress
completed
cancelled
failed
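The `status` and `file_counts` fields together give a simple progress view of a batch. The Python sketch below shows one way to summarize them. The helper name `summarize_file_batch` and the sample object are hypothetical, and treating every status other than `in_progress` as final is an assumption.

```python
# Assumption: a batch in any of these states will not change further.
FINAL_BATCH_STATUSES = {"completed", "cancelled", "failed"}


def summarize_file_batch(batch: dict) -> dict:
    """Summarize a vectorStoreFileBatchObject (hypothetical helper)."""
    counts = batch["file_counts"]
    total = counts["total"]
    finished = counts["completed"] + counts["failed"] + counts["cancelled"]
    return {
        "done": batch["status"] in FINAL_BATCH_STATUSES,
        "fraction_finished": finished / total if total else 0.0,
        "failed": counts["failed"],
    }


sample_batch = {
    "id": "vsfb_123",  # placeholder ID
    "status": "in_progress",
    "file_counts": {
        "in_progress": 1, "completed": 2, "failed": 1, "cancelled": 0, "total": 4,
    },
}
summary = summarize_file_batch(sample_batch)
print(summary)
```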

createVectorStoreFileBatchRequest

Name	Type	Description	Required	Default
file_ids	array	A list of File IDs that the vector store should use. Useful for tools like `file_search` that can access files.	Yes
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No

assistantStreamEvent

Represents an event emitted when streaming a Run.

Each event in a server-sent events stream has an event and data property:

event: thread.created
data: {"id": "thread_123", "object": "thread", ...}

We emit events whenever a new object is created, transitions to a new state, or is being streamed in parts (deltas). For example, we emit thread.run.created when a new run is created, thread.run.completed when a run completes, and so on. When an Assistant chooses to create a message during a run, we emit a thread.message.created event, a thread.message.in_progress event, many thread.message.delta events, and finally a thread.message.completed event.

We may add additional events over time, so we recommend handling unknown events gracefully in your code.
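The following Python sketch shows one way to read such a stream and to ignore event types the client does not recognize. It is a simplified parser that handles only `event:` and `data:` lines and skips other server-sent events fields. The names `parse_sse`, `HANDLERS`, and `consume`, and the event name `some.future.event`, are hypothetical. The final `done` event with the `[DONE]` payload follows the doneEvent definition in this reference.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event, data) pairs from the text lines of an event stream."""
    event, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and event is not None:
            # A blank line terminates one event.
            payload = "\n".join(data_parts)
            # The done event carries the literal string [DONE], not JSON.
            data = payload if payload == "[DONE]" else json.loads(payload)
            yield event, data
            event, data_parts = None, []


# Handlers for the events this client cares about (illustrative only).
HANDLERS = {
    "thread.created": lambda data: f"thread {data['id']} created",
}


def consume(lines: Iterable[str]) -> list:
    """Dispatch known events, skip unknown ones, and stop at done."""
    log = []
    for event, data in parse_sse(lines):
        if event == "done":
            break
        handler = HANDLERS.get(event)
        if handler is None:
            log.append(f"ignored {event}")  # unknown events do not raise
            continue
        log.append(handler(data))
    return log


sample = [
    "event: thread.created",
    'data: {"id": "thread_123", "object": "thread"}',
    "",
    "event: some.future.event",
    "data: {}",
    "",
    "event: done",
    "data: [DONE]",
    "",
]
result = consume(sample)
print(result)
```

Looking handlers up in a dictionary with a fallback means new event types added by the service pass through without breaking the client.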

This component can be one of the following:

threadStreamEvent
runStreamEvent
runStepStreamEvent
messageStreamEvent
errorEvent
doneEvent

threadStreamEvent

This component can be one of the following:

thread.created

Occurs when a new thread is created.

Name	Type	Description	Required	Default
event	string		Yes
data	threadObject	Represents a thread that contains messages.	Yes

Data: threadObject

Event Enum: ThreadStreamEventEnum

Value	Description
thread.created	The thread created event

runStreamEvent

This component can be one of the following:

thread.run.created

Occurs when a new run is created.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventCreated

Value	Description
thread.run.created

thread.run.queued

Occurs when a run moves to a queued status.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventQueued

Value	Description
thread.run.queued

thread.run.in_progress

Occurs when a run moves to an in_progress status.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventInProgress

Value	Description
thread.run.in_progress

thread.run.requires_action

Occurs when a run moves to a requires_action status.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventRequiresAction

Value	Description
thread.run.requires_action

thread.run.completed

Occurs when a run is completed.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventCompleted

Value	Description
thread.run.completed

thread.run.failed

Occurs when a run fails.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventFailed

Value	Description
thread.run.failed

thread.run.cancelling

Occurs when a run moves to a cancelling status.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventCancelling

Value	Description
thread.run.cancelling

thread.run.cancelled

Occurs when a run is cancelled.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventCancelled

Value	Description
thread.run.cancelled

thread.run.expired

Occurs when a run expires.

Name	Type	Description	Required	Default
event	string		Yes
data	runObject	Represents an execution run on a thread.	Yes

Data: runObject

Event Enum: RunStreamEventExpired

Value	Description
thread.run.expired
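A client usually needs to know whether a run event means the run is still progressing, is waiting on the client, or has finished. The Python sketch below groups the run events listed above for that purpose. The grouping and the function name `run_event_kind` are this sketch's interpretation of the event names, not something the reference defines.

```python
# Assumption: these events mark a run that will not progress further.
FINISHED_RUN_EVENTS = {
    "thread.run.completed",
    "thread.run.failed",
    "thread.run.cancelled",
    "thread.run.expired",
}


def run_event_kind(event: str) -> str:
    """Classify a run stream event name (hypothetical helper)."""
    if event in FINISHED_RUN_EVENTS:
        return "finished"
    if event == "thread.run.requires_action":
        return "needs_client_action"
    return "in_flight"  # created, queued, in_progress, cancelling


kinds = {
    name: run_event_kind(name)
    for name in (
        "thread.run.queued",
        "thread.run.requires_action",
        "thread.run.cancelling",
        "thread.run.cancelled",
    )
}
print(kinds)
```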

runStepStreamEvent

This component can be one of the following:

thread.run.step.created

Occurs when a run step is created.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventCreated

Value	Description
thread.run.step.created

thread.run.step.in_progress

Occurs when a run step moves to an in_progress state.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventInProgress

Value	Description
thread.run.step.in_progress

thread.run.step.delta

Occurs when parts of a run step are being streamed.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepDeltaObject	Represents a run step delta i.e. any changed fields on a run step during streaming.	Yes

Data: runStepDeltaObject

Event Enum: RunStepStreamEventDelta

Value	Description
thread.run.step.delta

thread.run.step.completed

Occurs when a run step is completed.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventCompleted

Value	Description
thread.run.step.completed

thread.run.step.failed

Occurs when a run step fails.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventFailed

Value	Description
thread.run.step.failed

thread.run.step.cancelled

Occurs when a run step is cancelled.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventCancelled

Value	Description
thread.run.step.cancelled

thread.run.step.expired

Occurs when a run step expires.

Name	Type	Description	Required	Default
event	string		Yes
data	runStepObject	Represents a step in execution of a run.	Yes

Data: runStepObject

Event Enum: RunStepStreamEventExpired

Value	Description
thread.run.step.expired

messageStreamEvent

This component can be one of the following:

thread.message.created

Occurs when a message is created.

Name	Type	Description	Required	Default
event	string		Yes
data	messageObject	Represents a message within a thread.	Yes

Data: messageObject

Event Enum: MessageStreamEventCreated

Value	Description
thread.message.created

thread.message.in_progress

Occurs when a message moves to an in_progress state.

Name	Type	Description	Required	Default
event	string		Yes
data	messageObject	Represents a message within a thread.	Yes

Data: messageObject

Event Enum: MessageStreamEventInProgress

Value	Description
thread.message.in_progress

thread.message.delta

Occurs when parts of a message are being streamed.

Name	Type	Description	Required	Default
event	string		Yes
data	messageDeltaObject	Represents a message delta i.e. any changed fields on a message during streaming.	Yes

Data: messageDeltaObject

Event Enum: MessageStreamEventDelta

Value	Description
thread.message.delta
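Because a message arrives as many `thread.message.delta` events, the client has to stitch the pieces together. The Python sketch below accumulates text fragments from a sequence of delta payloads. It assumes each messageDeltaObject carries a `delta.content` array whose text parts hold their fragment in `text.value`, as defined elsewhere in this reference. The function name `append_text_delta` and the sample payloads are hypothetical.

```python
def append_text_delta(buffer: list, delta_data: dict) -> None:
    """Append text fragments from one thread.message.delta payload to buffer."""
    for part in delta_data.get("delta", {}).get("content", []):
        # Assumed shape: {"type": "text", "text": {"value": "..."}}
        if part.get("type") == "text":
            buffer.append(part.get("text", {}).get("value", ""))


# Illustrative payloads for two consecutive delta events.
deltas = [
    {"id": "msg_123", "delta": {"content": [
        {"index": 0, "type": "text", "text": {"value": "Hel"}}]}},
    {"id": "msg_123", "delta": {"content": [
        {"index": 0, "type": "text", "text": {"value": "lo"}}]}},
]

fragments = []
for payload in deltas:
    append_text_delta(fragments, payload)
message_text = "".join(fragments)
print(message_text)
```

Collecting fragments in a list and joining once at the end avoids repeated string concatenation on long messages.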

thread.message.completed

Occurs when a message is completed.

Name	Type	Description	Required	Default
event	string		Yes
data	messageObject	Represents a message within a thread.	Yes

Data: messageObject

Event Enum: MessageStreamEventCompleted

Value	Description
thread.message.completed

thread.message.incomplete

Occurs when a message ends before it's completed.

Name	Type	Description	Required	Default
event	string		Yes
data	messageObject	Represents a message within a thread.	Yes

Data: messageObject

Event Enum: MessageStreamEventIncomplete

Value	Description
thread.message.incomplete

errorEvent

Occurs when the stream encounters an error, for example an internal server error or a timeout.

Name	Type	Description	Required	Default
event	string		Yes
data	error		Yes

event Enum: ErrorEventEnum

Value	Description
error

doneEvent

Occurs when a stream ends.

Name	Type	Description	Required	Default
event	string		Yes
data	string		Yes

event Enum: DoneEventEnum

Value	Description
done

data Enum: DoneEventDataEnum

Value	Description
[DONE]

Next steps

Learn about models and fine-tuning with the REST API. Learn more about the underlying models that power Azure OpenAI.
