Azure OpenAI in Azure AI Foundry Models REST API preview reference

2025-07-02

This article provides details on the inference REST API endpoints for Azure OpenAI.

API specs

Managing and interacting with Azure OpenAI models and resources is divided across three primary API surfaces:

Control plane
Data plane - authoring
Data plane - inference

Each API surface/specification encapsulates a different set of Azure OpenAI capabilities. Each API has its own unique set of preview and stable/generally available (GA) API releases. Preview releases currently tend to follow a monthly cadence.

Important

There is now a new preview inference API. Learn more in our API lifecycle guide.

API	Latest preview release	Latest GA release	Specifications	Description
Control plane	`2024-06-01-preview`	`2024-10-01`	Spec files	Azure OpenAI shares a common control plane with all other services. The control plane API is used for things like creating Azure OpenAI resources, model deployment, and other higher level resource management tasks. The control plane also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and Azure CLI.
Data plane - authoring	`2025-04-01-preview`	`2024-10-21`	Spec files	The data plane authoring API controls fine-tuning, file upload, ingestion jobs, batch, and certain model-level queries.
Data plane - inference	`2025-04-01-preview`	`2024-10-21`	Spec files	The data plane inference API provides the inference capabilities/endpoints for features like completions, chat completions, embeddings, audio, On Your Data, DALL-E, assistants, etc.

Authentication

Azure OpenAI provides two methods for authentication. You can use either API Keys or Microsoft Entra ID.

API Key authentication: For this type of authentication, all API requests must include the API Key in the api-key HTTP header. The Quickstart provides guidance for how to make calls with this type of authentication.
Microsoft Entra ID authentication: You can authenticate an API call using a Microsoft Entra token. Authentication tokens are included in a request as the Authorization header. The token provided must be preceded by Bearer, for example Bearer YOUR_AUTH_TOKEN. You can read our how-to guide on authenticating with Microsoft Entra ID.
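The two authentication methods above differ only in which HTTP header carries the credential. A minimal sketch, assuming a hypothetical helper named `auth_headers` (it is not part of any SDK) and placeholder credential values:

```python
def auth_headers(api_key=None, entra_token=None):
    """Return request headers for exactly one authentication method.

    `auth_headers` is a hypothetical helper written for this example.
    """
    # Exactly one credential must be supplied.
    if (api_key is None) == (entra_token is None):
        raise ValueError("provide exactly one of api_key or entra_token")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # API key authentication: the key goes in the api-key header.
        headers["api-key"] = api_key
    else:
        # Microsoft Entra ID authentication: the token is preceded by "Bearer".
        headers["Authorization"] = f"Bearer {entra_token}"
    return headers


key_headers = auth_headers(api_key="YOUR_API_KEY")
token_headers = auth_headers(entra_token="YOUR_AUTH_TOKEN")
```

Either dictionary can be passed as the headers of any request shown later in this article.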

REST API versioning

The service APIs are versioned using the api-version query parameter. All versions follow the YYYY-MM-DD date structure; preview releases append a `-preview` suffix, as in `2025-04-01-preview`. For example:

POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2024-06-01
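As an illustration, the request URL above can be assembled from its three variable parts. This is a sketch only; `chat_completions_url` is a hypothetical function name, and the arguments are the same placeholders used in the example URL.

```python
from urllib.parse import quote, urlencode


def chat_completions_url(resource_name, deployment_name, api_version):
    """Build a chat completions URL with the api-version query parameter."""
    base = f"https://{resource_name}.openai.azure.com"
    # Percent-encode the deployment name so it is safe as a path segment.
    path = f"/openai/deployments/{quote(deployment_name, safe='')}/chat/completions"
    query = urlencode({"api-version": api_version})
    return f"{base}{path}?{query}"


url = chat_completions_url("YOUR_RESOURCE_NAME", "YOUR_DEPLOYMENT_NAME", "2024-06-01")
```

Switching between a GA and a preview release then only requires changing the `api_version` argument.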

Data plane inference

The rest of the article covers the 2025-04-01-preview preview release of the Azure OpenAI data plane inference specification.

If you're looking for documentation on the latest GA API release, refer to the latest GA data plane inference API.

Completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-04-01-preview

Creates a completion for the provided prompt, parameters and chosen model.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
best_of	integer	Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed. When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return; `best_of` must be greater than `n`. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
echo	boolean	Echo back the prompt in addition to the completion	No	False
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the <\|endoftext\|> token from being generated.	No	None
logprobs	integer	Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response. The maximum value for `logprobs` is 5.	No	None
max_tokens	integer	The maximum number of tokens that can be generated in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length.	No	16
n	integer	How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
prompt	string or array	The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <\|endoftext\|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.	Yes
seed	integer	If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.	No
stream	boolean	Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message. Example Python code.	No	False
suffix	string	The suffix that comes after a completion of inserted text. This parameter is only supported for `gpt-3.5-turbo-instruct`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
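Several rows in the table above state numeric limits that can be checked before a request is sent. The following sketch collects them in a hypothetical `check_completion_params` function; it is a client-side convenience under stated assumptions, not the service's own validation. The table says `best_of` must be greater than `n`, yet both default to 1, so this sketch only rejects a `best_of` that is smaller than `n`, and it treats streaming as incompatible only when `best_of` is above 1. Both choices are assumptions.

```python
def check_completion_params(params):
    """Return a list of constraint violations for a Completions request body."""
    problems = []
    if "prompt" not in params:
        problems.append("prompt is required")
    # Assumption: only flag best_of < n (see the lead-in paragraph).
    if "n" in params and "best_of" in params and params["best_of"] < params["n"]:
        problems.append("best_of must not be smaller than n")
    # Assumption: streaming is only a problem when best_of exceeds its default.
    if params.get("stream") and params.get("best_of", 1) > 1:
        problems.append("best_of results can't be streamed")
    logprobs = params.get("logprobs")
    if logprobs is not None and not 0 <= logprobs <= 5:
        problems.append("logprobs must be between 0 and 5")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to 4 sequences")
    if not 0 <= params.get("temperature", 1) <= 2:
        problems.append("temperature must be between 0 and 2")
    for name in ("frequency_penalty", "presence_penalty"):
        if not -2.0 <= params.get(name, 0) <= 2.0:
            problems.append(f"{name} must be between -2.0 and 2.0")
    for token_id, bias in (params.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias[{token_id}] must be between -100 and 100")
    return problems


ok = check_completion_params({"prompt": "tell me a joke about mango", "max_tokens": 32})
bad = check_completion_params({"prompt": "x", "logprobs": 6, "stop": ["a", "b", "c", "d", "e"]})
```

An empty list means none of the documented limits were violated; the service can still reject a request for other reasons, such as exceeding the model's context length.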

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	createCompletionResponse

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/completions?api-version=2025-04-01-preview

{
 "prompt": [
  "tell me a joke about mango"
 ],
 "max_tokens": 32,
 "temperature": 1.0,
 "n": 1
}

Responses: Status Code: 200

{
  "body": {
    "id": "cmpl-7QmVI15qgYVllxK0FtxVGG6ywfzaq",
    "created": 1686617332,
    "choices": [
      {
        "text": "es\n\nWhat do you call a mango who's in charge?\n\nThe head mango.",
        "index": 0,
        "finish_reason": "stop",
        "logprobs": null
      }
    ],
    "usage": {
      "completion_tokens": 20,
      "prompt_tokens": 6,
      "total_tokens": 26
    }
  }
}
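The request in this example could be prepared with the Python standard library roughly as follows. This is a sketch under assumptions: `build_completions_request` is a hypothetical helper, `my-deployment` and `YOUR_API_KEY` are placeholders, and the request object is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_completions_request(endpoint, deployment_id, api_key, body,
                              api_version="2025-04-01-preview"):
    """Prepare (but do not send) a POST request to the completions endpoint."""
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment_id}"
           f"/completions?api-version={api_version}")
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # JSON request body as bytes
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


request = build_completions_request(
    "https://aoairesource.openai.azure.com",
    "my-deployment",   # placeholder deployment name
    "YOUR_API_KEY",    # placeholder key
    {"prompt": ["tell me a joke about mango"], "max_tokens": 32, "temperature": 1.0, "n": 1},
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Separating request construction from sending keeps the URL, headers, and body easy to inspect or log before any quota is consumed.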

Embeddings - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-04-01-preview

Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string	The deployment id of the model which was deployed.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
dimensions	integer	The number of dimensions the resulting output embeddings should have. Only supported in `text-embedding-3` and later models.	No
encoding_format	string	The format to return the embeddings in. Can be either `float` or `base64`. Defaults to `float`.	No
input	string or array	Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for `text-embedding-ada-002`), can't be an empty string, and any array must be 2048 dimensions or less.	Yes
input_type	string	Input type of embedding search to use.	No
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
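The `input` constraints in the table above can be checked client-side before calling the endpoint. A minimal sketch, assuming a hypothetical `embeddings_body` helper and reading "2048 dimensions or less" as "at most 2048 array elements"; the token limit isn't checked because that would require a tokenizer.

```python
MAX_ARRAY_ITEMS = 2048  # assumption: "2048 dimensions or less" means array length


def embeddings_body(input_value, dimensions=None, encoding_format="float"):
    """Build an Embeddings request body after basic input checks."""
    if isinstance(input_value, str):
        if input_value == "":
            raise ValueError("input can't be an empty string")
    elif isinstance(input_value, list):
        if len(input_value) > MAX_ARRAY_ITEMS:
            raise ValueError(f"input arrays are limited to {MAX_ARRAY_ITEMS} items")
        if any(item == "" for item in input_value):
            raise ValueError("input can't contain an empty string")
    else:
        raise TypeError("input must be a string or an array")
    if encoding_format not in ("float", "base64"):
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"input": input_value, "encoding_format": encoding_format}
    if dimensions is not None:
        # Only supported in text-embedding-3 and later models.
        body["dimensions"] = dimensions
    return body


body = embeddings_body(["this is a test"])
```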

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	object

Examples

Example

Return the embeddings for a given prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/embeddings?api-version=2025-04-01-preview

{
 "input": [
  "this is a test"
 ]
}

Responses: Status Code: 200

{
  "body": {
    "data": [
      {
        "index": 0,
        "embedding": [
          -0.012838088,
          -0.007421397,
          -0.017617522,
          -0.028278312,
          -0.018666342,
          0.01737855,
          -0.01821495,
          -0.006950092,
          -0.009937238,
          -0.038580645,
          0.010674067,
          0.02412286,
          -0.013647936,
          0.013189907,
          0.0021125758,
          0.012406612,
          0.020790534,
          0.00074595667,
          0.008397198,
          -0.00535031,
          0.008968075,
          0.014351576,
          -0.014086051,
          0.015055214,
          -0.022211088,
          -0.025198232,
          0.0065186154,
          -0.036350243,
          0.009180495,
          -0.009698266,
          0.009446018,
          -0.008463579,
          -0.0020113448
        ]
      }
    ],
    "usage": {
      "prompt_tokens": 4,
      "total_tokens": 4
    }
  }
}

Chat completions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

Creates a completion for the chat message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
audio	object	Parameters for audio output. Required when audio output is requested with `modalities: ["audio"]`.	No
└─ format	enum	Specifies the output audio format. Must be one of `wav`, `mp3`, `flac`, `opus`, or `pcm16`. Possible values: `wav`, `mp3`, `flac`, `opus`, `pcm16`	No
└─ voice	enum	Specifies the voice type. Supported voices are `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer`. Possible values: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`	No
data_sources	array	The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI.	No
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
function_call	string or chatCompletionFunctionCallOption	Deprecated in favor of `tool_choice`. Controls which (if any) function is called by the model. `none` means the model won't call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present.	No
functions	array	Deprecated in favor of `tools`. A list of functions the model may generate JSON inputs for.	No
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No	None
logprobs	boolean	Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.	No	False
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release.	No
max_tokens	integer	The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	No
messages	array	A list of messages comprising the conversation so far. Example Python code.	Yes
metadata	object	Developer-defined tags and values used for filtering completions in the stored completions dashboard.	No
modalities	ChatCompletionModalities	Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default: `["text"]` The `gpt-4o-audio-preview` model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: `["text", "audio"]`	No
n	integer	How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.	No	1
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
prediction	PredictionContent	Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
reasoning_effort	enum	o1 models only. Constrains effort on reasoning for reasoning models. Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Possible values: `low`, `medium`, `high`	No
response_format	ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema	An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which guarantee the model will match your supplied JSON schema. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
seed	integer	This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to 4 sequences where the API will stop generating further tokens.	No
store	boolean	Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products.	No
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message. Example Python code.	No	False
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
tool_choice	chatCompletionToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.	No
tools	array	A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.	No
top_logprobs	integer	An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
user_security_context	userSecurityContext	User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.	No
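Some parameters in the table above depend on one another: `top_logprobs` requires `logprobs`, `stream_options` only applies when `stream` is true, and `audio` is required when audio output is requested. The sketch below gathers these rules in a hypothetical `check_chat_params` function; it is an illustrative pre-flight check, not the service's validation logic.

```python
def check_chat_params(params):
    """Return a list of cross-parameter problems in a Chat Completions body."""
    problems = []
    if not params.get("messages"):
        problems.append("messages is required")
    if "top_logprobs" in params:
        if not params.get("logprobs"):
            problems.append("top_logprobs requires logprobs to be true")
        if not 0 <= params["top_logprobs"] <= 20:
            problems.append("top_logprobs must be between 0 and 20")
    if params.get("stream_options") is not None and not params.get("stream"):
        problems.append("stream_options should only be set when stream is true")
    if "audio" in params.get("modalities", []) and "audio" not in params:
        problems.append("audio is required when modalities includes audio")
    if len(params.get("tools", [])) > 128:
        problems.append("a max of 128 functions are supported in tools")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to 4 sequences")
    return problems


messages = [{"role": "user", "content": "can you tell me how to care for a parrot?"}]
no_problems = check_chat_params({"messages": messages})
missing_logprobs = check_chat_params({"messages": messages, "top_logprobs": 3})
```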

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	object

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Examples

Example

Creates a completion for the provided prompt, parameters and chosen model.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "system",
   "content": "you are a helpful assistant that talks like a pirate"
  },
  {
   "role": "user",
   "content": "can you tell me how to care for a parrot?"
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Ahoy matey! So ye be wantin' to care for a fine squawkin' parrot, eh? Well, shiver me timbers, let ol' Cap'n Assistant share some wisdom with ye! Here be the steps to keepin' yer parrot happy 'n healthy:\n\n1. Secure a sturdy cage: Yer parrot be needin' a comfortable place to lay anchor! Be sure ye get a sturdy cage, at least double the size of the bird's wingspan, with enough space to spread their wings, yarrrr!\n\n2. Perches 'n toys: Aye, parrots need perches of different sizes, shapes, 'n textures to keep their feet healthy. Also, a few toys be helpin' to keep them entertained 'n their minds stimulated, arrrh!\n\n3. Proper grub: Feed yer feathered friend a balanced diet of high-quality pellets, fruits, 'n veggies to keep 'em strong 'n healthy. Give 'em fresh water every day, or ye'll have a scurvy bird on yer hands!\n\n4. Cleanliness: Swab their cage deck! Clean their cage on a regular basis: fresh water 'n food daily, the floor every couple of days, 'n a thorough scrubbing ev'ry few weeks, so the bird be livin' in a tidy haven, arrhh!\n\n5. Socialize 'n train: Parrots be a sociable lot, arrr! Exercise 'n interact with 'em daily to create a bond 'n maintain their mental 'n physical health. Train 'em with positive reinforcement, treat 'em kindly, yarrr!\n\n6. Proper rest: Yer parrot be needin' 'bout 10-12 hours o' sleep each night. Cover their cage 'n let them slumber in a dim, quiet quarter for a proper night's rest, ye scallywag!\n\n7. Keep a weather eye open for illness: Birds be hidin' their ailments, arrr! Be watchful for signs of sickness, such as lethargy, loss of appetite, puffin' up, or change in droppings, and make haste to a vet if need be.\n\n8. Provide fresh air 'n avoid toxins: Parrots be sensitive to draft and pollutants. Keep yer quarters well ventilated, but no drafts, arrr! Be mindful of toxins like Teflon fumes, candles, or air fresheners.\n\nSo there ye have it, me hearty! With proper care 'n commitment, yer parrot will be squawkin' \"Yo-ho-ho\" for many years to come! Good luck, sailor, and may the wind be at yer back!"
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
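When `stream` is set instead, the response arrives as data-only server-sent events terminated by `data: [DONE]`, as described in the request body table. The following sketch shows one way to reassemble the text from such a stream. The function name `iter_stream_text` is hypothetical, and the chunk shape (`choices[].delta.content`) and the sample lines are assumptions made for illustration; this reference only states that partial message deltas are sent.

```python
import json


def iter_stream_text(lines):
    """Yield content fragments from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that isn't data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # the stream is terminated by this sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed chunk shape: each choice carries a partial "delta".
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hypothetical sample of streamed lines, for illustration only.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Ahoy"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " matey!"}}]}',
    "",
    "data: [DONE]",
]
full_text = "".join(iter_stream_text(sample_lines))
```

Because the function accepts any iterable of strings, it can be fed lines decoded from an HTTP response as they arrive.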

Example

Creates a completion based on Azure Search data and system-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "authentication": {
     "type": "system_assigned_managed_identity"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
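The citations in a response like the one above can be pulled out of `message.context`. A minimal sketch, assuming a hypothetical `extract_citations` helper. Two assumptions are made: the `body` wrapper is part of how this article presents examples, so the function accepts the object with or without it, and a marker such as `[doc1]` is taken to refer to the first entry in `citations`.

```python
def extract_citations(payload):
    """Collect citation titles and URLs from a chat completion with data sources."""
    # Accept either the wrapped example shape or the bare completion object.
    completion = payload.get("body", payload)
    results = []
    for choice in completion.get("choices", []):
        context = choice.get("message", {}).get("context") or {}
        for number, citation in enumerate(context.get("citations", []), start=1):
            results.append({
                "marker": f"[doc{number}]",  # assumed numbering convention
                "title": citation.get("title"),
                "url": citation.get("url"),
            })
    return results


example = {
    "body": {
        "choices": [{
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Content of the completion [doc1].",
                "context": {
                    "citations": [{
                        "content": "Citation content.",
                        "title": "Citation Title",
                        "filepath": "contoso.txt",
                        "url": "https://contoso.blob.windows.net/container/contoso.txt",
                        "chunk_id": "0",
                    }],
                    "intent": "dog care",
                },
            },
        }]
    }
}
citations = extract_citations(example)
```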

Example

Creates a completion based on Azure Search image vector data.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "index_name": "{index name}",
    "query_type": "vector",
    "fields_mapping": {
     "image_vector_fields": [
      "image_vector"
     ]
    },
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion."
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion based on Azure Search vector data, previous assistant message and user-assigned managed identity.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a cat?"
  },
  {
   "role": "assistant",
   "content": "Content of the completion [doc1].",
   "context": {
    "intent": "cat care"
   }
  },
  {
   "role": "user",
   "content": "how about dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_search",
   "parameters": {
    "endpoint": "https://your-search-endpoint.search.windows.net/",
    "authentication": {
     "type": "user_assigned_managed_identity",
     "managed_identity_resource_id": "/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{resource-name}"
    },
    "index_name": "{index name}",
    "query_type": "vector",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "in_scope": true,
    "top_n_documents": 5,
    "strictness": 3,
    "role_information": "You are an AI assistant that helps people find information.",
    "fields_mapping": {
     "content_fields_separator": "\\n",
     "content_fields": [
      "content"
     ],
     "filepath_field": "filepath",
     "title_field": "title",
     "url_field": "url",
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content 2.",
                "title": "Citation Title 2",
                "filepath": "contoso2.txt",
                "url": "https://contoso.blob.windows.net/container/contoso2.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
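The Azure Search examples so far use three authentication shapes inside `data_sources`: system-assigned managed identity, API key, and user-assigned managed identity. The sketch below builds those entries with small hypothetical helpers (`azure_search_source`, `system_identity`, `api_key_auth`, `user_identity`); the field names are copied from the examples above, while the helper names are made up for illustration.

```python
def system_identity():
    return {"type": "system_assigned_managed_identity"}


def api_key_auth(key):
    return {"type": "api_key", "key": key}


def user_identity(resource_id):
    return {
        "type": "user_assigned_managed_identity",
        "managed_identity_resource_id": resource_id,
    }


def azure_search_source(endpoint, index_name, authentication, **extra):
    """Build one data_sources entry of type azure_search.

    Optional parameters from the examples (query_type, fields_mapping,
    embedding_dependency, and so on) can be passed through as keyword arguments.
    """
    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": authentication,
    }
    parameters.update(extra)
    return {"type": "azure_search", "parameters": parameters}


source = azure_search_source(
    "https://your-search-endpoint.search.windows.net/",
    "{index name}",
    system_identity(),
)
vector_source = azure_search_source(
    "https://your-search-endpoint.search.windows.net/",
    "{index name}",
    api_key_auth("{api key}"),
    query_type="vector",
    fields_mapping={"image_vector_fields": ["image_vector"]},
)
```

The resulting dictionaries go in the `data_sources` array of the request body alongside `messages`.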

Example

Creates a completion for the provided Azure Cosmos DB.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "azure_cosmos_db",
   "parameters": {
    "authentication": {
     "type": "connection_string",
     "connection_string": "mongodb+srv://rawantest:{password}$@{cluster-name}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000"
    },
    "database_name": "vectordb",
    "container_name": "azuredocs",
    "index_name": "azuredocindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
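
The citations returned in `message.context` can be read out with a few lines of Python. This is a minimal sketch, assuming the parsed response has the shape of the object shown under `body` in the example above; the helper name `extract_citations` and the trimmed sample payload are illustrative, not part of the API.

```python
import json

# Trimmed sample shaped like the object under "body" in the example response.
SAMPLE_RESPONSE = json.loads("""
{
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Content of the completion [doc1].",
        "context": {
          "citations": [
            {
              "content": "Citation content.",
              "title": "Citation Title",
              "filepath": "contoso.txt",
              "url": "https://contoso.blob.windows.net/container/contoso.txt",
              "chunk_id": "0"
            }
          ],
          "intent": "dog care"
        }
      }
    }
  ]
}
""")


def extract_citations(response: dict) -> list:
    """Return (title, url) pairs for every citation in every choice."""
    pairs = []
    for choice in response.get("choices", []):
        context = choice.get("message", {}).get("context", {})
        for citation in context.get("citations", []):
            pairs.append((citation.get("title", ""), citation.get("url", "")))
    return pairs


if __name__ == "__main__":
    for title, url in extract_citations(SAMPLE_RESPONSE):
        print(f"{title}: {url}")
```

Missing keys fall back to empty values, so the helper also tolerates responses that carry no `context` at all.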

Example

Creates a completion for the provided MongoDB data source.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "mongo_db",
   "parameters": {
    "authentication": {
     "type": "username_and_password",
     "username": "<username>",
     "password": "<password>"
    },
    "endpoint": "<endpoint_name>",
    "app_name": "<application name>",
    "database_name": "sampledb",
    "collection_name": "samplecollection",
    "index_name": "sampleindex",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "content_fields": [
      "content"
     ],
     "vector_fields": [
      "contentvector"
     ]
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Elasticsearch.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "elasticsearch",
   "parameters": {
    "endpoint": "https://your-elasticsearch-endpoint.eastus.azurecontainer.io",
    "index_name": "{index name}",
    "authentication": {
     "type": "key_and_key_id",
     "key": "{key}",
     "key_id": "{key id}"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}

Example

Creates a completion for the provided Pinecone resource.

POST https://{endpoint}/openai/deployments/{deployment-id}/chat/completions?api-version=2025-04-01-preview

{
 "messages": [
  {
   "role": "user",
   "content": "can you tell me how to care for a dog?"
  }
 ],
 "data_sources": [
  {
   "type": "pinecone",
   "parameters": {
    "authentication": {
     "type": "api_key",
     "key": "{api key}"
    },
    "environment": "{environment name}",
    "index_name": "{index name}",
    "embedding_dependency": {
     "type": "deployment_name",
     "deployment_name": "{embedding deployment name}"
    },
    "fields_mapping": {
     "title_field": "title",
     "url_field": "url",
     "filepath_field": "filepath",
     "content_fields": [
      "content"
     ],
     "content_fields_separator": "\n"
    }
   }
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "chatcmpl-7R1nGnsXO8n4oi9UPz2f3UHdgAYMn",
    "created": 1686676106,
    "choices": [
      {
        "index": 0,
        "finish_reason": "stop",
        "message": {
          "role": "assistant",
          "content": "Content of the completion [doc1].",
          "context": {
            "citations": [
              {
                "content": "Citation content.",
                "title": "Citation Title",
                "filepath": "contoso.txt",
                "url": "https://contoso.blob.windows.net/container/contoso.txt",
                "chunk_id": "0"
              }
            ],
            "intent": "dog care"
          }
        }
      }
    ],
    "usage": {
      "completion_tokens": 557,
      "prompt_tokens": 33,
      "total_tokens": 590
    }
  }
}
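
The data source examples above differ only in the object placed in `data_sources`. The following sketch shows one way to assemble such a request with the Python standard library. It builds the request without sending it; the function name `build_chat_request`, the sample resource name, and the choice of API key authentication are assumptions made for illustration.

```python
import json
import urllib.request

API_VERSION = "2025-04-01-preview"


def build_chat_request(endpoint, deployment_id, api_key, question, data_source):
    """Build (but do not send) a chat completions request with one data source.

    `endpoint` is assumed to include the protocol, for example
    "https://aoairesource.openai.azure.com".
    """
    url = (
        f"{endpoint}/openai/deployments/{deployment_id}"
        f"/chat/completions?api-version={API_VERSION}"
    )
    body = {
        "messages": [{"role": "user", "content": question}],
        "data_sources": [data_source],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values copied from the Pinecone example; replace before use.
    pinecone_source = {
        "type": "pinecone",
        "parameters": {
            "authentication": {"type": "api_key", "key": "{api key}"},
            "environment": "{environment name}",
            "index_name": "{index name}",
            "embedding_dependency": {
                "type": "deployment_name",
                "deployment_name": "{embedding deployment name}",
            },
            "fields_mapping": {"content_fields": ["content"]},
        },
    }
    request = build_chat_request(
        "https://aoairesource.openai.azure.com",
        "my-deployment",
        "{api key}",
        "can you tell me how to care for a dog?",
        pinecone_source,
    )
    print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Swapping in a different `data_source` dictionary is the only change needed to target another store.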

Transcriptions - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-04-01-preview

Transcribes audio into the input language.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
file	string	The audio file object to transcribe.	Yes
language	string	The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.	No
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0
timestamp_granularities[]	array	The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.	No	['segment']

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	object
text/plain	string	Transcribed text in the output format (when response_format was one of `text`, `vtt` or `srt`).

Examples

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/transcriptions?api-version=2025-04-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}
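
Because the request body is `multipart/form-data`, the audio file and the text fields from the request body table travel as separate parts. The sketch below encodes such a body by hand with the standard library so the wire format is visible. The helper name `encode_multipart` is hypothetical, and the sample bytes stand in for real audio data.

```python
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns a (content_type_header_value, body_bytes) tuple.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # One part per plain text field, for example language or response_format.
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The file part carries a filename and a generic binary content type.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    # The closing delimiter ends the body.
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)


if __name__ == "__main__":
    content_type, body = encode_multipart(
        {"language": "en", "response_format": "verbose_json"},
        "file",
        "file.wav",
        b"RIFF..audio.data.omitted",
    )
    print(content_type)
    print(len(body), "bytes")
```

The returned content type value goes in the `Content-Type` request header, alongside either the `api-key` or the `Authorization` header.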

Translations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-04-01-preview

Transcribes and translates input audio into English text.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
file	string	The audio file to translate.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	object
text/plain	string	Transcribed text in the output format (when response_format was one of text, vtt or srt).

Examples

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-04-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "body": {
    "text": "A structured object when requesting json or verbose_json"
  }
}

Example

Gets English language transcribed text and associated metadata from provided spoken audio data.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/translations?api-version=2025-04-01-preview

"---multipart-boundary\nContent-Disposition: form-data; name=\"file\"; filename=\"file.wav\"\nContent-Type: application/octet-stream\n\nRIFF..audio.data.omitted\n---multipart-boundary--"

Responses: Status Code: 200

{
  "type": "string",
  "example": "plain text when requesting text, srt, or vtt"
}
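
The two response content types listed above call for different handling on the client. As a rough sketch, a caller could branch on the `Content-Type` response header as follows; the function name `parse_audio_response` is hypothetical, and UTF-8 encoding of the response body is an assumption.

```python
import json


def parse_audio_response(content_type: str, raw: bytes):
    """Return a dict for application/json responses and a str for text/plain."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return json.loads(raw.decode("utf-8"))
    if media_type == "text/plain":
        return raw.decode("utf-8")
    raise ValueError(f"Unexpected content type: {content_type!r}")


if __name__ == "__main__":
    print(parse_audio_response("application/json", b'{"text": "Hello."}'))
    print(parse_audio_response("text/plain", b"Hello."))
```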

Speech - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-04-01-preview

Generates audio from the input text.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
input	string	The text to synthesize audio for. The maximum length is 4096 characters.	Yes
response_format	enum	The format to synthesize the audio in. Possible values: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`	No
speed	number	The speed of the synthesized audio. Select a value from `0.25` to `4.0`. `1.0` is the default.	No	1.0
voice	enum	The voice to use for speech synthesis. Possible values: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`	Yes

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/octet-stream	string

Examples

Example

Synthesizes audio from the provided text.

POST https://{endpoint}/openai/deployments/{deployment-id}/audio/speech?api-version=2025-04-01-preview

{
 "input": "Hi! What are you going to make?",
 "voice": "fable",
 "response_format": "mp3"
}

Responses: Status Code: 200

{
  "body": "101010101"
}
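
The limits in the request body table can be checked on the client before a request is sent. The following is a minimal sketch of that idea, not an official client; the function name `build_speech_body` is hypothetical, and the checks only mirror the values listed in the table above.

```python
import json

# Allowed values copied from the request body table.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_body(text, voice, response_format=None, speed=1.0):
    """Validate the speech parameters and return the JSON request body."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    if response_format is not None and response_format not in FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    body = {"input": text, "voice": voice, "speed": speed}
    if response_format is not None:
        body["response_format"] = response_format
    return json.dumps(body)


if __name__ == "__main__":
    print(build_speech_body("Hi! What are you going to make?", "fable", "mp3"))
```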

Image generations - Create

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-04-01-preview

Generates a batch of images from a text caption on a given DALL-E or gpt-image-1 model deployment.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
background	imageBackground	Sets the transparency for the background of the generated image(s). This parameter is only supported for gpt-image-1.	No	auto
n	integer	The number of images to generate. For dall-e-3, only n=1 is supported.	No	1
output_compression	integer	The compression level (0-100%) for the generated images. This parameter is only supported for gpt-image-1 with the jpeg output format.	No	100
output_format	imagesOutputFormat	The file format in which the generated images are returned. Only supported for gpt-image-1.	No	png
prompt	string	A text description of the desired image(s). The maximum length is 32000 characters for gpt-image-1 and 4000 characters for dall-e-3.	Yes
quality	imageQuality	The quality of the image that will be generated.	No	auto
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
size	imageSize	The size of the generated images.	No	auto
style	imageStyle	The style of the generated images. Only supported for dall-e-3.	No	vivid
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	generateImagesResponse

Status Code: default

Description: An error occurred.

Content-Type	Type	Description
application/json	dalleErrorResponse

Examples

Example

Creates images given a prompt.

POST https://{endpoint}/openai/deployments/{deployment-id}/images/generations?api-version=2025-04-01-preview

{
 "prompt": "In the style of WordArt, Microsoft Clippy wearing a cowboy hat.",
 "n": 1,
 "style": "natural",
 "quality": "standard"
}

Responses: Status Code: 200

{
  "body": {
    "created": 1698342300,
    "data": [
      {
        "revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
        "prompt_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          },
          "profanity": {
            "detected": false,
            "filtered": false
          },
          "custom_blocklists": {
            "filtered": false,
            "details": []
          }
        },
        "url": "https://dalletipusw2.blob.core.windows.net/private/images/e5451cc6-b1ad-4747-bd46-b89a3a3b8bc3/generated_00.png?se=2023-10-27T17%3A45%3A09Z&...",
        "content_filter_results": {
          "sexual": {
            "severity": "safe",
            "filtered": false
          },
          "violence": {
            "severity": "safe",
            "filtered": false
          },
          "hate": {
            "severity": "safe",
            "filtered": false
          },
          "self_harm": {
            "severity": "safe",
            "filtered": false
          }
        }
      }
    ]
  }
}
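
Each entry in `data` carries the image URL together with its content filter results. The sketch below summarizes such a response. It assumes the parsed payload has the shape of the object under `body` in the example above; the helper name `summarize_images` and the shortened sample are illustrative.

```python
def summarize_images(response: dict) -> list:
    """Return one summary dict per generated image.

    Each summary holds the image URL, the revised prompt, and the sorted
    names of any content filter categories whose "filtered" flag is true.
    """
    summaries = []
    for item in response.get("data", []):
        filters = item.get("content_filter_results", {})
        flagged = sorted(
            name for name, result in filters.items() if result.get("filtered")
        )
        summaries.append(
            {
                "url": item.get("url"),
                "revised_prompt": item.get("revised_prompt"),
                "flagged": flagged,
            }
        )
    return summaries


if __name__ == "__main__":
    sample = {
        "created": 1698342300,
        "data": [
            {
                "revised_prompt": "A vivid, natural representation of Microsoft Clippy wearing a cowboy hat.",
                "url": "https://example.invalid/generated_00.png",  # placeholder URL
                "content_filter_results": {
                    "sexual": {"severity": "safe", "filtered": False},
                    "violence": {"severity": "safe", "filtered": False},
                },
            }
        ],
    }
    for summary in summarize_images(sample):
        print(summary["url"], summary["flagged"])
```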

Image generations - Edit

POST https://{endpoint}/openai/deployments/{deployment-id}/images/edits?api-version=2025-04-01-preview

Edits an image from a text caption on a given gpt-image-1 model deployment.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
deployment-id	path	Yes	string
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: multipart/form-data

Name	Type	Description	Required	Default
image	string or array	The image(s) to edit. Must be a supported image file or an array of images. Each image should be a PNG or JPG file smaller than 25 MB.	Yes
mask	string	An additional image whose fully transparent areas (e.g., where alpha is zero) indicate where the image should be edited. If there are multiple images provided, the mask will be applied to the first image. Must be a valid PNG file, less than 4MB, and have the same dimensions as the image.	No
n	integer	The number of images to generate.	No	1
prompt	string	A text description of the desired image(s). The maximum length is 32000 characters.	Yes
quality	imageQuality	The quality of the image that will be generated.	No	auto
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
size	imageSize	The size of the generated images.	No	auto
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	generateImagesResponse

Status Code: default

Description: An error occurred.

Content-Type	Type	Description
application/json	dalleErrorResponse
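
Some of the constraints on `image`, `mask`, and `prompt` can be checked locally before uploading. The following sketch covers only the size, PNG signature, and prompt length rules from the request body table. It does not compare image and mask dimensions, and it assumes 1 MB means 1024 * 1024 bytes, which the table does not specify. The function name `check_edit_inputs` is hypothetical.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file
MAX_IMAGE_BYTES = 25 * 1024 * 1024    # assumption: 1 MB = 1024 * 1024 bytes
MAX_MASK_BYTES = 4 * 1024 * 1024
MAX_PROMPT_CHARS = 32000


def check_edit_inputs(prompt, images, mask=None) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if not images:
        problems.append("at least one image is required")
    for index, image in enumerate(images):
        if len(image) >= MAX_IMAGE_BYTES:
            problems.append(f"image {index} is not smaller than 25 MB")
    if mask is not None:
        if not mask.startswith(PNG_SIGNATURE):
            problems.append("mask is not a PNG file")
        if len(mask) >= MAX_MASK_BYTES:
            problems.append("mask is not smaller than 4 MB")
    return problems


if __name__ == "__main__":
    fake_png = PNG_SIGNATURE + b"\x00" * 16  # stand-in bytes, not a real image
    print(check_edit_inputs("Add a cowboy hat.", [fake_png], mask=fake_png))
```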

List - Assistants

GET https://{endpoint}/openai/assistants?api-version=2025-04-01-preview

Returns a list of assistants.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listAssistantsResponse

Examples

Example

Returns a list of assistants.

GET https://{endpoint}/openai/assistants?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "asst_abc123",
        "object": "assistant",
        "created_at": 1707257477,
        "name": "Stock Analyst",
        "description": null,
        "model": "gpt-4-1106-preview",
        "instructions": "You are a financial analyst that analyzes stock market prices and other financial data present on user uploaded files or by calling external APIs.",
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc456",
        "object": "assistant",
        "created_at": 1698982718,
        "name": "My Assistant",
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": "You are a helpful assistant designed to make me better at coding!",
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      },
      {
        "id": "asst_abc789",
        "object": "assistant",
        "created_at": 1698982643,
        "name": null,
        "description": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "tools": [],
        "tool_resources": {},
        "metadata": {},
        "top_p": 1.0,
        "temperature": 1.0,
        "response_format": "auto"
      }
    ],
    "first_id": "asst_abc123",
    "last_id": "asst_abc789",
    "has_more": false
  }
}
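
The `after` cursor and the `has_more` and `last_id` fields in the response work together for pagination. The sketch below walks every page. It assumes a caller-supplied `fetch_page` function (hypothetical) that performs the GET request with the given query parameters and returns the parsed response object, shaped like the object under `body` above.

```python
def iter_assistants(fetch_page, limit=20):
    """Yield every assistant, following the `after` cursor until has_more is false.

    `fetch_page` is a hypothetical callable: it takes a dict of query
    parameters and returns the parsed list response as a dict.
    """
    after = None
    while True:
        params = {"limit": limit, "order": "desc"}
        if after is not None:
            params["after"] = after
        page = fetch_page(params)
        yield from page["data"]
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not page["data"]:
            break
        after = page["last_id"]


if __name__ == "__main__":
    # In-memory stand-in for the service, keyed by the `after` cursor.
    pages = {
        None: {"data": [{"id": "asst_abc123"}], "last_id": "asst_abc123", "has_more": True},
        "asst_abc123": {"data": [{"id": "asst_abc456"}], "last_id": "asst_abc456", "has_more": False},
    }
    for assistant in iter_assistants(lambda params: pages[params.get("after")], limit=1):
        print(assistant["id"])
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP client and of the authentication method in use.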

Create - Assistant

POST https://{endpoint}/openai/assistants?api-version=2025-04-01-preview

Create an assistant with a model and instructions.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string		Yes
name	string	The name of the assistant. The maximum length is 256 characters.	No
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
└─ vector_stores	array	A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `retrieval`, or `function`.	No	[]
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject

Examples

Example

Create an assistant with a model and instructions.

POST https://{endpoint}/openai/assistants?api-version=2025-04-01-preview

{
 "name": "Math Tutor",
 "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
 "tools": [
  {
   "type": "code_interpreter"
  }
 ],
 "model": "gpt-4-1106-preview"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant",
    "created_at": 1707295707,
    "name": "Math Tutor",
    "description": null,
    "model": "gpt-4-1106-preview",
    "instructions": "When a customer asks about a specific math problem, use Python to evaluate their query.",
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
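
The length and count limits from the request body table can be enforced before the request is sent. This is a minimal sketch under that assumption; the function name `build_assistant_payload` is hypothetical, and the limits are the ones listed for this create operation.

```python
def build_assistant_payload(model, name=None, description=None,
                            instructions=None, tools=None, metadata=None):
    """Validate the documented limits and return the request body as a dict."""
    tools = tools or []
    metadata = metadata or {}
    if name is not None and len(name) > 256:
        raise ValueError("name exceeds 256 characters")
    if description is not None and len(description) > 512:
        raise ValueError("description exceeds 512 characters")
    if instructions is not None and len(instructions) > 256000:
        raise ValueError("instructions exceed 256,000 characters")
    if len(tools) > 128:
        raise ValueError("more than 128 tools")
    if len(metadata) > 16:
        raise ValueError("more than 16 metadata pairs")
    for key, value in metadata.items():
        if len(key) > 64 or len(str(value)) > 512:
            raise ValueError(f"metadata entry {key!r} is too long")
    payload = {"model": model, "tools": tools, "metadata": metadata}
    # Optional fields are only included when the caller supplied them.
    optional = (("name", name), ("description", description),
                ("instructions", instructions))
    for key, value in optional:
        if value is not None:
            payload[key] = value
    return payload


if __name__ == "__main__":
    print(build_assistant_payload(
        "gpt-4-1106-preview",
        name="Math Tutor",
        instructions="When a customer asks about a specific math problem, "
                     "use Python to evaluate their query.",
        tools=[{"type": "code_interpreter"}],
    ))
```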

Get - Assistant

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

Retrieves an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string	The ID of the assistant to retrieve.
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject

Examples

Example

Retrieves an assistant.

GET https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_abc123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
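
For token-based authentication, the same request carries an `Authorization: Bearer` header instead of `api-key`. The sketch below builds such a request with the standard library without sending it. The function name `build_get_assistant_request` is hypothetical, and `endpoint` is assumed to include the protocol.

```python
import urllib.parse
import urllib.request

API_VERSION = "2025-04-01-preview"


def build_get_assistant_request(endpoint, assistant_id, token):
    """Build (but do not send) a GET request for one assistant."""
    query = urllib.parse.urlencode({"api-version": API_VERSION})
    # Percent-encode the ID so it cannot alter the path structure.
    path_id = urllib.parse.quote(assistant_id, safe="")
    url = f"{endpoint}/openai/assistants/{path_id}?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_get_assistant_request(
        "https://aoairesource.openai.azure.com",
        "asst_abc123",
        "{Azure_OpenAI_Auth_Token}",
    )
    print(request.get_method(), request.full_url)
```

The token placeholder corresponds to the value returned by the Azure CLI command shown in the request header table.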

Modify - Assistant

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

Modifies an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string	The ID of the assistant to modify.
api-version	query	Yes	string

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here
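
The two header options above can be sketched as a small helper. This is an illustration only: `build_auth_headers` is a hypothetical name, and rejecting calls that supply both credentials (or neither) is a design choice of the sketch, not a documented service rule.

```python
from typing import Dict, Optional


def build_auth_headers(api_key: Optional[str] = None,
                       bearer_token: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for exactly one of the two authentication methods."""
    if (api_key is None) == (bearer_token is None):
        raise ValueError("Provide exactly one of api_key or bearer_token")
    headers = {"Content-Type": "application/json"}
    if bearer_token is not None:
        # Microsoft Entra ID: the token is sent as "Authorization: Bearer <token>".
        headers["Authorization"] = f"Bearer {bearer_token}"
    else:
        # API key authentication: the key is sent in the api-key header.
        headers["api-key"] = api_key
    return headers


key_headers = build_auth_headers(api_key="YOUR_API_KEY")
token_headers = build_auth_headers(bearer_token="YOUR_AUTH_TOKEN")
```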

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 32768 characters.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string		No
name	string	The name of the assistant. The maximum length is 256 characters.	No
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	Overrides the list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	No	[]
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
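
The `metadata` limits in the table above (16 pairs, 64-character keys, 512-character values) can be checked on the client before sending a request. The sketch below is a hedged illustration: `metadata_problems` is a hypothetical helper, and it assumes that string values and Python's `len()` count characters the same way the service does.

```python
from typing import Dict, List

MAX_PAIRS = 16
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 512


def metadata_problems(metadata: Dict[str, str]) -> List[str]:
    """Return human-readable limit violations; an empty list means none were found."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"{len(metadata)} pairs exceeds the limit of {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"a key is longer than {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"a value is longer than {MAX_VALUE_LENGTH} characters")
    return problems


ok = metadata_problems({"modified": "true", "user": "abc123"})
too_many = metadata_problems({f"key{i}": "x" for i in range(17)})
```

Here `ok` is an empty list and `too_many` contains one entry describing the pair-count violation.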

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	assistantObject

Examples

Example

Modifies an assistant.

POST https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

{
 "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
 "tools": [
  {
   "type": "file_search"
  }
 ],
 "model": "gpt-4-turbo"
}

Responses: Status Code: 200

{
  "body": {
    "id": "asst_123",
    "object": "assistant",
    "created_at": 1699009709,
    "name": "HR Helper",
    "description": null,
    "model": "gpt-4-turbo",
    "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": []
      }
    },
    "metadata": {},
    "top_p": 1.0,
    "temperature": 1.0,
    "response_format": "auto"
  }
}
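
As a rough sketch, the request in this example could be assembled with Python's standard library as follows. The function name `build_modify_assistant_request`, the `aoairesource` host, the assistant ID, and the API key value are placeholders or assumptions for illustration; the request is only built here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict


def build_modify_assistant_request(endpoint: str, assistant_id: str, api_key: str,
                                   changes: Dict[str, Any],
                                   api_version: str = "2025-04-01-preview") -> urllib.request.Request:
    """Build (but do not send) the POST request that modifies an assistant."""
    url = f"{endpoint.rstrip('/')}/openai/assistants/{assistant_id}?api-version={api_version}"
    return urllib.request.Request(
        url,
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


request = build_modify_assistant_request(
    endpoint="https://aoairesource.openai.azure.com",
    assistant_id="asst_abc123",
    api_key="YOUR_API_KEY",
    changes={"tools": [{"type": "file_search"}], "model": "gpt-4-turbo"},
)
# Sending would be: urllib.request.urlopen(request)  (requires a real resource and key)
```

Only the fields present in `changes` are included in the body, which matches the request body table where every field is optional.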

Delete - Assistant

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

Delete an assistant.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
assistant_id	path	Yes	string	The ID of the assistant to delete.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteAssistantResponse

Examples

Example

Deletes an assistant.

DELETE https://{endpoint}/openai/assistants/{assistant_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "asst_4nsG2qgNzimRPE7MazXTXbU7",
    "object": "assistant.deleted",
    "deleted": true
  }
}

Create - Thread

POST https://{endpoint}/openai/threads?api-version=2025-04-01-preview

Create a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
messages	array	A list of messages to start the thread with.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.	No
└─ vector_stores	array	A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread.	No
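
The `tool_resources` limits listed above (at most 20 file IDs for `code_interpreter`, at most 1 vector store for `file_search`) lend themselves to a client-side check. This is a minimal sketch with a hypothetical helper name; it checks `vector_store_ids` and `vector_stores` separately, because the table does not say whether the two fields share one limit.

```python
from typing import Any, Dict, List

MAX_CODE_INTERPRETER_FILES = 20
MAX_VECTOR_STORES = 1


def tool_resources_problems(tool_resources: Dict[str, Any]) -> List[str]:
    """Return limit violations found in a thread's tool_resources object."""
    problems = []
    code_interpreter = tool_resources.get("code_interpreter") or {}
    file_ids = code_interpreter.get("file_ids") or []
    if len(file_ids) > MAX_CODE_INTERPRETER_FILES:
        problems.append(
            f"code_interpreter.file_ids has {len(file_ids)} entries; "
            f"the maximum is {MAX_CODE_INTERPRETER_FILES}"
        )
    file_search = tool_resources.get("file_search") or {}
    for field in ("vector_store_ids", "vector_stores"):
        entries = file_search.get(field) or []
        if len(entries) > MAX_VECTOR_STORES:
            problems.append(
                f"file_search.{field} has {len(entries)} entries; "
                f"the maximum is {MAX_VECTOR_STORES}"
            )
    return problems


valid = tool_resources_problems({"file_search": {"vector_store_ids": ["vs_example"]}})
invalid = tool_resources_problems({"code_interpreter": {"file_ids": ["file_x"] * 21}})
```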

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject

Examples

Example

Creates a thread.

POST https://{endpoint}/openai/threads?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {}
  }
}

Get - Thread

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

Retrieves a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to retrieve.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject

Examples

Example

Retrieves a thread.

GET https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {},
    "tool_resources": {
      "code_interpreter": {
        "file_ids": []
      }
    }
  }
}

Modify - Thread

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

Modifies a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to modify. Only the `metadata` can be modified.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	threadObject

Examples

Example

Modifies a thread.

POST https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread",
    "created_at": 1707297136,
    "metadata": {
      "modified": "true",
      "user": "abc123"
    },
    "tool_resources": {}
  }
}

Delete - Thread

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

Delete a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to delete.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteThreadResponse

Examples

Example

Deletes a thread.

DELETE https://{endpoint}/openai/threads/{thread_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "object": "thread.deleted",
    "deleted": true
  }
}

List - Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-04-01-preview

Returns a list of messages for a given thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread the messages belong to.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
run_id	query	No	string	Filter messages by the run ID that generated them.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listMessagesResponse

Examples

Example

List Messages

GET https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "msg_abc123",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "How does AI work? Explain it in simple terms.",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      },
      {
        "id": "msg_abc456",
        "object": "thread.message",
        "created_at": 1699016383,
        "assistant_id": null,
        "thread_id": "thread_abc123",
        "run_id": null,
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": {
              "value": "Hello, what is AI?",
              "annotations": []
            }
          }
        ],
        "attachments": [],
        "metadata": {}
      }
    ],
    "first_id": "msg_abc123",
    "last_id": "msg_abc456",
    "has_more": false
  }
}
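
The `after` cursor, `last_id`, and `has_more` fields work together for pagination. The sketch below shows one way to walk all pages; `iter_messages` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever code performs the GET request and returns the list object (already unwrapped from any `body` key). A fake fetcher is used so the example runs without a network call.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_messages(fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
                  limit: int = 20) -> Iterator[Dict[str, Any]]:
    """Yield every message by following the `after` cursor until has_more is false."""
    after: Optional[str] = None
    while True:
        params: Dict[str, Any] = {"limit": limit}
        if after is not None:
            params["after"] = after
        page = fetch_page(params)
        yield from page["data"]
        # Stop when the service reports no further pages (or returns an empty page).
        if not page.get("has_more") or not page["data"]:
            return
        after = page["last_id"]


def fake_fetch(params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for the real HTTP call: two pages of one message each."""
    pages = {
        None: {"data": [{"id": "msg_abc123"}], "last_id": "msg_abc123", "has_more": True},
        "msg_abc123": {"data": [{"id": "msg_abc456"}], "last_id": "msg_abc456", "has_more": False},
    }
    return pages[params.get("after")]


message_ids = [message["id"] for message in iter_messages(fake_fetch, limit=1)]
```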

Create - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-04-01-preview

Create a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to create a message for.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
attachments	array	A list of files attached to the message, and the tools they should be added to.	No
content	string	The content of the message.	Yes
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
role	string	The role of the entity that is creating the message. Allowed values include: - `user`: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. - `assistant`: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.	Yes

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject

Examples

Example

Create a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages?api-version=2025-04-01-preview

{
 "role": "user",
 "content": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces."
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "assistant_id": null,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "attachments": [],
    "metadata": {}
  }
}
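
A request body like the one in this example can be assembled with a small helper that enforces the two required fields and the documented `role` values. This is a sketch only: `build_message_body` is a hypothetical name, and `attachments` is left out because its item structure is not described in this section.

```python
from typing import Any, Dict, Optional

ALLOWED_ROLES = ("user", "assistant")


def build_message_body(role: str, content: str,
                       metadata: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Build the JSON body for Create - Message from the required and optional fields."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"role must be one of {ALLOWED_ROLES}, got {role!r}")
    body: Dict[str, Any] = {"role": role, "content": content}
    if metadata is not None:
        # metadata is optional, so it is only included when supplied.
        body["metadata"] = metadata
    return body


message_body = build_message_body("user", "Hello, what is AI?")
```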

Get - Message

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-04-01-preview

Retrieve a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to which this message belongs.
message_id	path	Yes	string	The ID of the message to retrieve.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject

Examples

Example

Retrieve a message.

GET https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "msg_as3XIk1tpVP3hdHjWBGg3uG4",
    "object": "thread.message",
    "created_at": 1707298421,
    "thread_id": "thread_v7V4csrNOxtNmgcwGg496Smx",
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "What is the cube root of the sum of 12, 14, 1234, 4321, 90000, 123213541223, 443123123124, 5423324234, 234324324234, 653434534545, 200000000, 98237432984, 99999999, 99999999999, 220000000000, 3309587702? Give me the answer rounded to the nearest integer without commas or spaces.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "assistant_id": null,
    "run_id": null,
    "metadata": {}
  }
}

Modify - Message

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-04-01-preview

Modifies a message.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to which this message belongs.
message_id	path	Yes	string	The ID of the message to modify.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	messageObject

Examples

Example

Modify a message.

POST https://{endpoint}/openai/threads/{thread_id}/messages/{message_id}?api-version=2025-04-01-preview

{
 "metadata": {
  "modified": "true",
  "user": "abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "msg_abc123",
    "object": "thread.message",
    "created_at": 1699017614,
    "assistant_id": null,
    "thread_id": "thread_abc123",
    "run_id": null,
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": {
          "value": "How does AI work? Explain it in simple terms.",
          "annotations": []
        }
      }
    ],
    "file_ids": [],
    "metadata": {
      "modified": "true",
      "user": "abc123"
    }
  }
}

Create - Thread And Run

POST https://{endpoint}/openai/threads/runs?api-version=2025-04-01-preview

Create a thread and run it in one request.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended because it is more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string	The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
thread	createThreadRequest		No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

Create a thread and run it in one request.

POST https://{endpoint}/openai/threads/runs?api-version=2025-04-01-preview

{
 "assistant_id": "asst_abc123",
 "thread": {
  "messages": [
   {
    "role": "user",
    "content": "Explain deep learning to a 5 year old."
   }
  ]
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076792,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": null,
    "expires_at": 1699077392,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "required_action": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You are a helpful assistant.",
    "tools": [],
    "tool_resources": {},
    "metadata": {},
    "temperature": 1.0,
    "top_p": 1.0,
    "max_completion_tokens": null,
    "max_prompt_tokens": null,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "incomplete_details": null,
    "usage": null,
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
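
As a rough illustration, the request above could be assembled in Python with only the standard library. The helper name `build_thread_run_request` and the placeholder key are assumptions made for this sketch. The code builds the request object but doesn't send it.

```python
import json
import urllib.request

API_VERSION = "2025-04-01-preview"


def build_thread_run_request(endpoint: str, api_key: str,
                             assistant_id: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a create-thread-and-run request."""
    url = f"{endpoint.rstrip('/')}/openai/threads/runs?api-version={API_VERSION}"
    body = {
        "assistant_id": assistant_id,
        "thread": {"messages": [{"role": "user", "content": user_text}]},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_thread_run_request(
    "https://aoairesource.openai.azure.com",  # replace with your resource endpoint
    "YOUR_API_KEY",                           # placeholder, not a real key
    "asst_abc123",
    "Explain deep learning to a 5 year old.",
)
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```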

List - Runs

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-04-01-preview

Returns a list of runs belonging to a thread.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread the run belongs to.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version	query	Yes	string
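
The `after` cursor can be followed in a loop to walk every page. The sketch below assumes a caller-supplied `fetch_page` function (hypothetical) that performs the GET request with the given `after` value and returns the list object with its `data`, `last_id`, and `has_more` fields. Here it's replaced by canned pages so the example runs offline.

```python
from typing import Callable, Iterator, Optional


def iter_runs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every run, following the `after` cursor until `has_more` is false."""
    after = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page.get("has_more") or not page["data"]:
            return
        after = page["last_id"]  # resume after the last object on this page


# Stand-in for the HTTP call: two fake pages keyed by the `after` cursor.
FAKE_PAGES = {
    None: {"data": [{"id": "run_1"}, {"id": "run_2"}], "last_id": "run_2", "has_more": True},
    "run_2": {"data": [{"id": "run_3"}], "last_id": "run_3", "has_more": False},
}

print([run["id"] for run in iter_runs(lambda after: FAKE_PAGES[after])])
```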

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here
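
A small helper can choose between the two header styles. This is a sketch under the assumption that only one of the two headers needs to be sent per request. The function name `auth_headers` is hypothetical.

```python
from typing import Optional


def auth_headers(api_key: Optional[str] = None, token: Optional[str] = None) -> dict:
    """Return the header for one authentication method, preferring the token."""
    if token:
        # Microsoft Entra ID token, for example from:
        # az account get-access-token --resource https://cognitiveservices.azure.com
        return {"Authorization": f"Bearer {token}"}
    if api_key:
        return {"api-key": api_key}
    raise ValueError("Provide either a Microsoft Entra token or an API key")


print(auth_headers(token="YOUR_AUTH_TOKEN"))
print(auth_headers(api_key="YOUR_API_KEY"))
```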

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listRunsResponse

Examples

Example

Returns a list of runs belonging to a thread.

GET https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "run_abc123",
        "object": "thread.run",
        "created_at": 1699075072,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699075072,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699075073,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      },
      {
        "id": "run_abc456",
        "object": "thread.run",
        "created_at": 1699063290,
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "status": "completed",
        "started_at": 1699063290,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1699063291,
        "last_error": null,
        "model": "gpt-4-turbo",
        "instructions": null,
        "incomplete_details": null,
        "tools": [
          {
            "type": "code_interpreter"
          }
        ],
        "tool_resources": {
          "code_interpreter": {
            "file_ids": [
              "file-abc123",
              "file-abc456"
            ]
          }
        },
        "metadata": {},
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        },
        "temperature": 1.0,
        "top_p": 1.0,
        "max_prompt_tokens": 1000,
        "max_completion_tokens": 1000,
        "truncation_strategy": {
          "type": "auto",
          "last_messages": null
        },
        "response_format": "auto",
        "tool_choice": "auto"
      }
    ],
    "first_id": "run_abc123",
    "last_id": "run_abc456",
    "has_more": false
  }
}

Create - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-04-01-preview

Create a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to run.
include[]	query	No	array	A list of additional fields to include in the response. Currently the only supported value is `step_details.tool_calls[].file_search.results[].content` to fetch the file search result content.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
additional_instructions	string	Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.	No
additional_messages	array	Adds additional messages to the thread before creating the run.	No
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
metadata	object	A set of up to 16 key-value pairs that can be attached to an object. This is useful for storing additional information about the object in a structured format. Keys can be at most 64 characters long and values at most 512 characters long.	No
model	string	The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model won't call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
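
The JSON mode note for `response_format` is easy to overlook, so here is a minimal sketch of a run body that enables JSON mode and also tells the model to produce JSON. Using `additional_instructions` to carry that instruction is an assumption of this sketch. The instruction could equally be placed in the assistant's instructions or in a user message.

```python
from typing import Optional


def json_mode_run_body(assistant_id: str,
                       max_completion_tokens: Optional[int] = None) -> dict:
    """Build a Create Run body that enables JSON mode."""
    body = {
        "assistant_id": assistant_id,
        "response_format": {"type": "json_object"},
        # JSON mode requires an explicit instruction to produce JSON.
        "additional_instructions": "Respond only with a valid JSON object.",
    }
    if max_completion_tokens is not None:
        # Bounds completion tokens across the run; exceeding it ends the run as `incomplete`.
        body["max_completion_tokens"] = max_completion_tokens
    return body


print(json_mode_run_body("asst_abc123", max_completion_tokens=500))
```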

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

Create a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs?api-version=2025-04-01-preview

{
 "assistant_id": "asst_abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699063290,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "queued",
    "started_at": 1699063290,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699063291,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Get - Run

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-04-01-preview

Retrieves a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread that was run.
run_id	path	Yes	string	The ID of the run to retrieve.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

Gets a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_HsO8tYM4K5AAMAHgK0J3om8Q",
    "object": "thread.run",
    "created_at": 1707303196,
    "assistant_id": "asst_JtTwHk28cIocgFXZPCBxhOzl",
    "thread_id": "thread_eRNwflE3ncDYak1np6MdMHJh",
    "status": "completed",
    "started_at": 1707303197,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1707303201,
    "last_error": null,
    "model": "gpt-4-1106-preview",
    "instructions": "You are an AI model that empowers every person and every organization on the planet to achieve more.",
    "tools": [],
    "file_ids": [],
    "metadata": {}
  }
}
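
Because a run is processed asynchronously, clients typically call this endpoint repeatedly until the run stops changing. The following sketch polls through a caller-supplied `get_run` function (hypothetical) that returns the run object. The set of terminal statuses is an assumption of this sketch, since this section only shows values such as `queued`, `completed`, `cancelling`, `requires_action`, and `incomplete`.

```python
import time
from typing import Callable

# Assumed terminal statuses; adjust to match the statuses your API version returns.
TERMINAL = {"completed", "failed", "cancelled", "expired", "incomplete"}


def wait_for_run(get_run: Callable[[], dict],
                 interval: float = 1.0, max_polls: int = 60) -> dict:
    """Poll until the run is terminal or needs tool outputs from the caller."""
    for _ in range(max_polls):
        run = get_run()
        if run["status"] in TERMINAL or run["status"] == "requires_action":
            return run
        time.sleep(interval)
    raise TimeoutError("run did not reach a terminal state in time")


# Offline stand-in for repeated GET calls.
statuses = iter(["queued", "in_progress", "completed"])
final = wait_for_run(lambda: {"status": next(statuses)}, interval=0.0)
print(final["status"])
```

Returning on `requires_action` as well lets the caller submit tool outputs and then resume polling.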

Modify - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-04-01-preview

Modifies a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread that was run.
run_id	path	Yes	string	The ID of the run to modify.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
metadata	object	A set of up to 16 key-value pairs that can be attached to an object. This is useful for storing additional information about the object in a structured format. Keys can be at most 64 characters long and values at most 512 characters long.	No
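
The metadata limits can be checked on the client before sending the request. This sketch reads the limit as at most 16 pairs and assumes string keys. The helper name `metadata_problems` is hypothetical, and the service remains the authority on what it accepts.

```python
def metadata_problems(metadata: dict) -> list:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(metadata) > 16:
        problems.append(f"{len(metadata)} pairs exceeds the limit of 16")
    for key, value in metadata.items():
        if len(key) > 64:
            problems.append(f"key starting {key[:20]!r} is longer than 64 characters")
        if len(str(value)) > 512:
            problems.append(f"value for key starting {key[:20]!r} is longer than 512 characters")
    return problems


print(metadata_problems({"user_id": "user_abc123"}))
print(metadata_problems({f"k{i}": "v" for i in range(17)}))
```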

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

Modifies a run.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}?api-version=2025-04-01-preview

{
 "metadata": {
  "user_id": "user_abc123"
 }
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699075072,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "completed",
    "started_at": 1699075072,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1699075073,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "incomplete_details": null,
    "tools": [
      {
        "type": "code_interpreter"
      }
    ],
    "tool_resources": {
      "code_interpreter": {
        "file_ids": [
          "file-abc123",
          "file-abc456"
        ]
      }
    },
    "metadata": {
      "user_id": "user_abc123"
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}

Submit - Tool Outputs To Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-04-01-preview

When a run has the status `requires_action` and `required_action.type` is `submit_tool_outputs`, use this endpoint to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to which this run belongs.
run_id	path	Yes	string	The ID of the run that requires the tool output submission.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
tool_outputs	array	A list of tools for which the outputs are being submitted.	Yes
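
When `stream` is `true`, the response arrives as server-sent events that end with `data: [DONE]`. The sketch below parses such a stream from an iterable of text lines. It assumes each `data:` field holds one JSON object on a single line, and the event names in the sample are illustrative rather than taken from this reference.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line until `data: [DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # the run reached a terminal state
        yield json.loads(payload)


# Canned stream used in place of a live HTTP response.
sample = [
    "event: thread.run.created",
    'data: {"id": "run_abc123", "status": "queued"}',
    "",
    "event: thread.run.completed",
    'data: {"id": "run_abc123", "status": "completed"}',
    "",
    "event: done",
    "data: [DONE]",
    'data: {"ignored": true}',
]
events = list(iter_sse_data(sample))
print([event["status"] for event in events])
```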

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/submit_tool_outputs?api-version=2025-04-01-preview

{
 "tool_outputs": [
  {
   "tool_call_id": "call_001",
   "output": "70 degrees and sunny."
  }
 ]
}

Responses: Status Code: 200

{
  "body": {
    "id": "run_123",
    "object": "thread.run",
    "created_at": 1699075592,
    "assistant_id": "asst_123",
    "thread_id": "thread_123",
    "status": "queued",
    "started_at": 1699075592,
    "expires_at": 1699076192,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": null,
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              },
              "unit": {
                "type": "string",
                "enum": [
                  "celsius",
                  "fahrenheit"
                ]
              }
            },
            "required": [
              "location"
            ]
          }
        }
      }
    ],
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "max_prompt_tokens": 1000,
    "max_completion_tokens": 1000,
    "truncation_strategy": {
      "type": "auto",
      "last_messages": null
    },
    "response_format": "auto",
    "tool_choice": "auto"
  }
}
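
The request body for this endpoint is usually derived from the run's pending tool calls. The sketch below assumes the run object exposes them under `required_action.submit_tool_outputs.tool_calls`, each with an `id` and a `function` holding `name` and JSON-encoded `arguments`. That shape isn't spelled out in this section, so treat it as an assumption. The `get_current_weather` handler returns a canned string.

```python
import json
from typing import Callable, Dict


def build_tool_outputs(run: dict, handlers: Dict[str, Callable[..., str]]) -> dict:
    """Run every pending tool call locally and collect all outputs in one body."""
    calls = run["required_action"]["submit_tool_outputs"]["tool_calls"]
    outputs = []
    for call in calls:
        fn = call["function"]
        args = json.loads(fn["arguments"] or "{}")
        outputs.append({"tool_call_id": call["id"], "output": handlers[fn["name"]](**args)})
    return {"tool_outputs": outputs}


def get_current_weather(location: str, unit: str = "fahrenheit") -> str:
    return "70 degrees and sunny."  # canned answer for the sketch


run = {
    "status": "requires_action",
    "required_action": {
        "type": "submit_tool_outputs",
        "submit_tool_outputs": {
            "tool_calls": [
                {
                    "id": "call_001",
                    "type": "function",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": '{"location": "San Francisco, CA"}',
                    },
                }
            ]
        },
    },
}
body = build_tool_outputs(run, {"get_current_weather": get_current_weather})
print(json.dumps(body))
```

Collecting every output before building the body matches the requirement that all outputs be submitted in a single request.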

Cancel - Run

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-04-01-preview

Cancels a run that is in_progress.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to which this run belongs.
run_id	path	Yes	string	The ID of the run to cancel.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runObject

Examples

Example

Cancels a run that is in_progress.

POST https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/cancel?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "run_abc123",
    "object": "thread.run",
    "created_at": 1699076126,
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "status": "cancelling",
    "started_at": 1699076126,
    "expires_at": 1699076726,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": null,
    "last_error": null,
    "model": "gpt-4-turbo",
    "instructions": "You summarize books.",
    "tools": [
      {
        "type": "file_search"
      }
    ],
    "tool_resources": {
      "file_search": {
        "vector_store_ids": [
          "vs_123"
        ]
      }
    },
    "metadata": {},
    "usage": null,
    "temperature": 1.0,
    "top_p": 1.0,
    "response_format": "auto"
  }
}

List - Run Steps

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-04-01-preview

Returns a list of run steps belonging to a run.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread the run and run steps belong to.
run_id	path	Yes	string	The ID of the run the run steps belong to.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version	query	Yes	string
include[]	query	No	array	A list of additional fields to include in the response. Currently the only supported value is `step_details.tool_calls[].file_search.results[].content` to fetch the file search result content.
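
The `include[]` parameter and its bracketed value need URL encoding. As a sketch, the standard library's `urlencode` can build the query string. The function name `run_steps_url` is hypothetical, and the range checks simply mirror the limits in the table above.

```python
from urllib.parse import urlencode

INCLUDE_CONTENT = "step_details.tool_calls[].file_search.results[].content"


def run_steps_url(endpoint, thread_id, run_id, limit=None, order=None,
                  after=None, include_content=False):
    """Build the List Run Steps URL with optional paging and include[] parameters."""
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if order is not None and order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    params = [("api-version", "2025-04-01-preview")]
    for name, value in (("limit", limit), ("order", order), ("after", after)):
        if value is not None:
            params.append((name, value))
    if include_content:
        params.append(("include[]", INCLUDE_CONTENT))
    base = f"{endpoint.rstrip('/')}/openai/threads/{thread_id}/runs/{run_id}/steps"
    return f"{base}?{urlencode(params)}"  # urlencode percent-encodes the brackets


url = run_steps_url("https://aoairesource.openai.azure.com", "thread_abc123",
                    "run_abc123", limit=50, order="asc", include_content=True)
print(url)
```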

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listRunStepsResponse

Examples

Example

Returns a list of run steps belonging to a run.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "step_abc123",
        "object": "thread.run.step",
        "created_at": 1699063291,
        "run_id": "run_abc123",
        "assistant_id": "asst_abc123",
        "thread_id": "thread_abc123",
        "type": "message_creation",
        "status": "completed",
        "cancelled_at": null,
        "completed_at": 1699063291,
        "expired_at": null,
        "failed_at": null,
        "last_error": null,
        "step_details": {
          "type": "message_creation",
          "message_creation": {
            "message_id": "msg_abc123"
          }
        },
        "usage": {
          "prompt_tokens": 123,
          "completion_tokens": 456,
          "total_tokens": 579
        }
      }
    ],
    "first_id": "step_abc123",
    "last_id": "step_abc123",
    "has_more": false
  }
}
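
Each run step carries its own `usage` object, so per-run totals can be computed by summing over the `data` array. The sketch below assumes that `usage` may be `null` for steps that haven't finished. The second step in the sample is invented for illustration.

```python
def total_usage(steps: list) -> dict:
    """Sum token usage across run steps, treating missing usage as zero."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for step in steps:
        usage = step.get("usage") or {}
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


steps = [
    {"id": "step_abc123",
     "usage": {"prompt_tokens": 123, "completion_tokens": 456, "total_tokens": 579}},
    {"id": "step_in_progress", "usage": None},  # hypothetical unfinished step
]
print(total_usage(steps))
```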

Get - Run Step

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-04-01-preview

Retrieves a run step.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
thread_id	path	Yes	string	The ID of the thread to which the run and run step belong.
run_id	path	Yes	string	The ID of the run to which the run step belongs.
step_id	path	Yes	string	The ID of the run step to retrieve.
include[]	query	No	array	A list of additional fields to include in the response. Currently the only supported value is `step_details.tool_calls[].file_search.results[].content` to fetch the file search result content.
api-version	query	Yes	string

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	runStepObject

Examples

Example

Retrieves a run step.

GET https://{endpoint}/openai/threads/{thread_id}/runs/{run_id}/steps/{step_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "step_abc123",
    "object": "thread.run.step",
    "created_at": 1699063291,
    "run_id": "run_abc123",
    "assistant_id": "asst_abc123",
    "thread_id": "thread_abc123",
    "type": "message_creation",
    "status": "completed",
    "cancelled_at": null,
    "completed_at": 1699063291,
    "expired_at": null,
    "failed_at": null,
    "last_error": null,
    "step_details": {
      "type": "message_creation",
      "message_creation": {
        "message_id": "msg_abc123"
      }
    },
    "usage": {
      "prompt_tokens": 123,
      "completion_tokens": 456,
      "total_tokens": 579
    }
  }
}

List - Vector Stores

GET https://{endpoint}/openai/vector_stores?api-version=2025-04-01-preview

Returns a list of vector stores.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
api-version	query	Yes	string	api version

Request Header

Use either token-based authentication or an API key. Token-based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoresResponse

Examples

Example

Returns a list of vector stores.

GET https://{endpoint}/openai/vector_stores?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "vs_abc123",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      },
      {
        "id": "vs_abc456",
        "object": "vector_store",
        "created_at": 1699061776,
        "name": "Support FAQ v2",
        "bytes": 139920,
        "file_counts": {
          "in_progress": 0,
          "completed": 3,
          "failed": 0,
          "cancelled": 0,
          "total": 3
        }
      }
    ],
    "first_id": "vs_abc123",
    "last_id": "vs_abc456",
    "has_more": false
  }
}
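
The `after` cursor together with the `last_id` and `has_more` fields can be combined into a simple paging loop. The sketch below is illustrative only: `iter_all` and `fake_fetch` are hypothetical names, the HTTP call is replaced by an in-memory stand-in, and the helper assumes it receives the list object itself (the content shown under "body" in the example above).

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every object across pages by following the `after` cursor."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        if not page.get("has_more"):
            break
        # The last object ID of this page becomes the cursor for the next one.
        after = page["last_id"]


# Hypothetical stand-in for the HTTP request: two pages with one store each.
_PAGES = {
    None: {"data": [{"id": "vs_abc123"}], "last_id": "vs_abc123", "has_more": True},
    "vs_abc123": {"data": [{"id": "vs_abc456"}], "last_id": "vs_abc456", "has_more": False},
}


def fake_fetch(after: Optional[str]) -> Page:
    return _PAGES[after]


ids = [store["id"] for store in iter_all(fake_fetch)]
print(ids)
```

In real use, `fetch_page` would issue the GET request with `after` added to the query string when it is not `None`.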

Create - Vector Store

POST https://{endpoint}/openai/vector_stores?api-version=2025-04-01-preview

Create a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is non-empty.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
file_ids	array	A list of file IDs that the vector store should use. Useful for tools like `file_search` that can access files.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
name	string	The name of the vector store.	No
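
The metadata limits in the table can be checked on the client before the request is sent. The following is a sketch under stated assumptions: `build_create_body` is a hypothetical helper, and it treats 16 as an upper bound on the number of pairs; the service remains the authority on validation.

```python
from typing import Any, Dict, List, Optional

MAX_METADATA_PAIRS = 16
MAX_KEY_LENGTH = 64
MAX_VALUE_LENGTH = 512


def build_create_body(name: Optional[str] = None,
                      file_ids: Optional[List[str]] = None,
                      metadata: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
    """Assemble a JSON-serializable body for the create vector store request."""
    body: Dict[str, Any] = {}
    if name is not None:
        body["name"] = name
    if file_ids:
        body["file_ids"] = list(file_ids)
    if metadata is not None:
        if len(metadata) > MAX_METADATA_PAIRS:
            raise ValueError("metadata allows at most 16 key-value pairs")
        for key, value in metadata.items():
            if len(key) > MAX_KEY_LENGTH:
                raise ValueError(f"metadata key too long: {key[:20]}...")
            if len(value) > MAX_VALUE_LENGTH:
                raise ValueError(f"metadata value too long for key {key}")
        body["metadata"] = dict(metadata)
    return body


print(build_create_body(name="Support FAQ", file_ids=["file-abc123"]))
```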

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject

Examples

Example

Creates a vector store.

POST https://{endpoint}/openai/vector_stores?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}

Get - Vector Store

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

Retrieves a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store to retrieve.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject

Examples

Example

Retrieves a vector store.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776
  }
}

Modify - Vector Store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

Modifies a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store to modify.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
metadata	object	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
name	string	The name of the vector store.	No

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreObject

Examples

Example

Modifies a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

{
 "name": "Support FAQ"
}

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store",
    "created_at": 1699061776,
    "name": "Support FAQ",
    "bytes": 139920,
    "file_counts": {
      "in_progress": 0,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 3
    }
  }
}
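
For readers who want to see the request above as code, here is a minimal sketch that builds (but does not send) the modify request with Python's standard library. The helper name `build_modify_request` and the placeholder values are assumptions; API key authentication is used only to keep the example short.

```python
import json
import urllib.request


def build_modify_request(endpoint: str, vector_store_id: str, api_key: str,
                         name: str,
                         api_version: str = "2025-04-01-preview") -> urllib.request.Request:
    """Build the POST request that renames a vector store."""
    url = (f"{endpoint.rstrip('/')}/openai/vector_stores/"
           f"{vector_store_id}?api-version={api_version}")
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )


req = build_modify_request("https://aoairesource.openai.azure.com",
                           "vs_abc123", "YOUR_API_KEY", "Support FAQ")
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example: urllib.request.urlopen(req)
```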

Delete - Vector Store

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

Delete a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store to delete.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteVectorStoreResponse

Examples

Example

Deletes a vector store.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vs_abc123",
    "object": "vector_store.deleted",
    "deleted": true
  }
}

List - Vector Store Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-04-01-preview

Returns a list of vector store files.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the files belong to.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
filter	query	No	string Possible values: `in_progress`, `completed`, `failed`, `cancelled`	Filter by file status. One of `in_progress`, `completed`, `failed`, `cancelled`.
api-version	query	Yes	string	api version
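
The query parameters in the table can be assembled and range-checked before the call is made. This is a sketch, not an official client: `list_files_url` is a hypothetical helper, and the `status_filter` argument name is chosen only to avoid shadowing Python's built-in `filter`.

```python
from typing import Optional
from urllib.parse import quote, urlencode

ALLOWED_FILTERS = {"in_progress", "completed", "failed", "cancelled"}


def list_files_url(endpoint: str, vector_store_id: str,
                   api_version: str = "2025-04-01-preview",
                   limit: Optional[int] = None, order: Optional[str] = None,
                   after: Optional[str] = None, before: Optional[str] = None,
                   status_filter: Optional[str] = None) -> str:
    """Build the list vector store files URL, validating documented ranges."""
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if order is not None and order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    if status_filter is not None and status_filter not in ALLOWED_FILTERS:
        raise ValueError("unsupported filter value")
    params = {"api-version": api_version}
    optional = {"limit": limit, "order": order, "after": after,
                "before": before, "filter": status_filter}
    # Only parameters that were actually supplied go into the query string.
    params.update({k: v for k, v in optional.items() if v is not None})
    path = f"/openai/vector_stores/{quote(vector_store_id, safe='')}/files"
    return f"{endpoint.rstrip('/')}{path}?{urlencode(params)}"


url = list_files_url("https://aoairesource.openai.azure.com", "vs_abc123",
                     limit=50, status_filter="failed")
print(url)
```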

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}

Create - Vector Store File

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-04-01-preview

Create a vector store file by attaching a File to a vector store.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store for which to create a File.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No
file_id	string	A File ID that the vector store should use. Useful for tools like `file_search` that can access files.	Yes

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileObject

Examples

Example

Create a vector store file by attaching a File to a vector store.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files?api-version=2025-04-01-preview

{
 "file_id": "file-abc123"
}

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "usage_bytes": 1234,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Get - Vector Store File

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-04-01-preview

Retrieves a vector store file.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the file belongs to.
file_id	path	Yes	string	The ID of the file being retrieved.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileObject

Examples

Example

Retrieves a vector store file.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file",
    "created_at": 1699061776,
    "vector_store_id": "vs_abcd",
    "status": "completed",
    "last_error": null
  }
}

Delete - Vector Store File

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-04-01-preview

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the file belongs to.
file_id	path	Yes	string	The ID of the file to delete.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	deleteVectorStoreFileResponse

Examples

Example

Delete a vector store file. This will remove the file from the vector store but the file itself won't be deleted. To delete the file, use the delete file endpoint.

DELETE https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "file-abc123",
    "object": "vector_store.file.deleted",
    "deleted": true
  }
}

Update - Vector Store File Attributes

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}?api-version=2025-04-01-preview

Update attributes on a vector store file.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store the file belongs to.
file_id	path	Yes	string	The ID of the file whose attributes are updated.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
attributes	VectorStoreFileAttributes	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.	Yes
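
Unlike vector store metadata, attribute values may be strings, booleans, or numbers. The sketch below checks those rules before building the request body; `validate_attributes` is a hypothetical helper, and it assumes 16 is an upper bound on the number of pairs.

```python
from typing import Dict, Union

AttributeValue = Union[str, bool, int, float]


def validate_attributes(attributes: Dict[str, AttributeValue]) -> Dict[str, dict]:
    """Check the documented limits and wrap the result as a request body."""
    if len(attributes) > 16:
        raise ValueError("at most 16 attributes are allowed")
    for key, value in attributes.items():
        if not isinstance(key, str) or len(key) > 64:
            raise ValueError("keys must be strings of at most 64 characters")
        if isinstance(value, str):
            if len(value) > 512:
                raise ValueError(f"string value too long for key {key}")
        elif not isinstance(value, (bool, int, float)):
            raise ValueError(f"unsupported value type for key {key}")
    return {"attributes": dict(attributes)}


print(validate_attributes({"category": "faq", "reviewed": True, "version": 2}))
```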

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileObject

Retrieve vector store file content

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/files/{file_id}/content?api-version=2025-04-01-preview

Retrieve the parsed contents of a vector store file.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store.
file_id	path	Yes	string	The ID of the file within the vector store.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	VectorStoreFileContentResponse

Search vector store

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/search?api-version=2025-04-01-preview

Search a vector store for relevant chunks based on a query and file attributes filter.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store to search.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
filters	ComparisonFilter or CompoundFilter	A filter to apply based on file attributes.	No
max_num_results	integer	The maximum number of results to return. This number should be between 1 and 50 inclusive.	No	10
query	string or array	A query string for a search	Yes
ranking_options	object	Ranking options for search.	No
└─ ranker	enum	Possible values: `auto`, `default-2024-11-15`	No
└─ score_threshold	number		No	0
rewrite_query	boolean	Whether to rewrite the natural language query for vector search.	No	False
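
A search request body can be put together from the fields in the table as sketched below. `build_search_body` is a hypothetical helper; the `filters` value is passed through unchanged because the filter object shapes are defined elsewhere in the specification.

```python
from typing import Any, Dict, List, Optional, Union

ALLOWED_RANKERS = ("auto", "default-2024-11-15")


def build_search_body(query: Union[str, List[str]],
                      max_num_results: int = 10,
                      rewrite_query: bool = False,
                      ranker: Optional[str] = None,
                      score_threshold: Optional[float] = None,
                      filters: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble the JSON body for the search vector store request."""
    if not 1 <= max_num_results <= 50:
        raise ValueError("max_num_results must be between 1 and 50 inclusive")
    if ranker is not None and ranker not in ALLOWED_RANKERS:
        raise ValueError("unsupported ranker")
    body: Dict[str, Any] = {
        "query": query,
        "max_num_results": max_num_results,
        "rewrite_query": rewrite_query,
    }
    ranking: Dict[str, Any] = {}
    if ranker is not None:
        ranking["ranker"] = ranker
    if score_threshold is not None:
        ranking["score_threshold"] = score_threshold
    if ranking:
        # ranking_options is only sent when at least one option is set.
        body["ranking_options"] = ranking
    if filters is not None:
        body["filters"] = filters
    return body


body = build_search_body("refund policy", max_num_results=5, score_threshold=0.5)
print(body)
```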

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	VectorStoreSearchResultsPage

Create - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-04-01-preview

Create a vector store file batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store for which to create a File Batch.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No
file_ids	array	A list of File IDs that the vector store should use. Useful for tools like `file_search` that can access files.	Yes

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject

Examples

Example

Create a vector store file batch.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches?api-version=2025-04-01-preview

{
 "file_ids": [
  "file-abc123",
  "file-abc456"
 ]
}

Responses: Status Code: 200

{
  "id": "vsfb_abc123",
  "object": "vector_store.file_batch",
  "created_at": 1699061776,
  "vector_store_id": "vs_abc123",
  "status": "in_progress",
  "file_counts": {
    "in_progress": 1,
    "completed": 1,
    "failed": 0,
    "cancelled": 0,
    "total": 2
  }
}

Get - Vector Store File Batch

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-04-01-preview

Retrieves a vector store file batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the file batch belongs to.
batch_id	path	Yes	string	The ID of the file batch being retrieved.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject

Examples

Example

Retrieves a vector store file batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "in_progress",
    "file_counts": {
      "in_progress": 1,
      "completed": 1,
      "failed": 0,
      "cancelled": 0,
      "total": 2
    }
  }
}
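
Because a file batch is processed asynchronously, a client typically polls this endpoint until the batch leaves the `in_progress` state. The sketch below shows one way to do that. `wait_for_batch` is a hypothetical helper, the HTTP call is injected as a callable, and treating `in_progress` and `cancelling` as the only non-final states is an assumption based on the status values that appear in this reference's examples.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these are the states in which processing is still ongoing.
PENDING_STATES = {"in_progress", "cancelling"}


def wait_for_batch(fetch_batch: Callable[[], Dict[str, Any]],
                   poll_seconds: float = 1.0,
                   max_polls: int = 60,
                   sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the batch status is no longer pending, then return the batch."""
    for _ in range(max_polls):
        batch = fetch_batch()
        if batch["status"] not in PENDING_STATES:
            return batch
        sleep(poll_seconds)
    raise TimeoutError("file batch did not finish within the polling budget")


# Hypothetical stand-in for three successive GET responses.
_responses = iter([
    {"status": "in_progress"},
    {"status": "in_progress"},
    {"status": "completed"},
])
final = wait_for_batch(lambda: next(_responses), sleep=lambda _seconds: None)
print(final["status"])
```

Injecting `sleep` keeps the helper easy to test; production code would leave the default in place and likely add backoff.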

Cancel - Vector Store File Batch

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-04-01-preview

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the file batch belongs to.
batch_id	path	Yes	string	The ID of the file batch to cancel.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	vectorStoreFileBatchObject

Examples

Example

Cancel a vector store file batch. This attempts to cancel the processing of files in this batch as soon as possible.

POST https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/cancel?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "id": "vsfb_abc123",
    "object": "vector_store.file_batch",
    "created_at": 1699061776,
    "vector_store_id": "vs_abc123",
    "status": "cancelling",
    "file_counts": {
      "in_progress": 12,
      "completed": 3,
      "failed": 0,
      "cancelled": 0,
      "total": 15
    }
  }
}

List - Vector Store File Batch Files

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-04-01-preview

Returns a list of vector store files in a batch.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
vector_store_id	path	Yes	string	The ID of the vector store that the files belong to.
batch_id	path	Yes	string	The ID of the file batch that the files belong to.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order.
after	query	No	string	A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
before	query	No	string	A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
filter	query	No	string Possible values: `in_progress`, `completed`, `failed`, `cancelled`	Filter by file status. One of `in_progress`, `completed`, `failed`, `cancelled`.
api-version	query	Yes	string	api version

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	listVectorStoreFilesResponse

Examples

Example

Returns a list of vector store files in a batch.

GET https://{endpoint}/openai/vector_stores/{vector_store_id}/file_batches/{batch_id}/files?api-version=2025-04-01-preview

Responses: Status Code: 200

{
  "body": {
    "object": "list",
    "data": [
      {
        "id": "file-abc123",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      },
      {
        "id": "file-abc456",
        "object": "vector_store.file",
        "created_at": 1699061776,
        "vector_store_id": "vs_abc123"
      }
    ],
    "first_id": "file-abc123",
    "last_id": "file-abc456",
    "has_more": false
  }
}

Create - Realtime Session

POST https://{endpoint}/openai/realtimeapi/sessions?api-version=2025-04-01-preview

Create an ephemeral API token for use in client-side applications with the Realtime API. Can be configured with the same session parameters as the session.update client event. It responds with a session object, plus a client_secret key which contains a usable ephemeral API token that can be used to authenticate browser clients for the Realtime API.

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
input_audio_format	enum	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, input audio must be 16-bit PCM at a 24kHz sample rate, single channel (mono), and little-endian byte order. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
input_audio_noise_reduction	object	Configuration for input audio noise reduction. This can be set to `null` to turn off. Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model. Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.	No
└─ type	enum	Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones. Possible values: `near_field`, `far_field`	No
input_audio_transcription	object	Configuration for input audio transcription. Transcription is off by default and can be set to `null` to turn it off once enabled. Input audio transcription isn't native to the model, since the model consumes audio directly. Transcription runs asynchronously through the Transcriptions endpoint and should be treated as guidance on the input audio content rather than precisely what the model heard. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.	No
└─ language	string	The language of the input audio. Supplying the input language in ISO-639-1 (e.g. `en`) format will improve accuracy and latency.	No
└─ model	string	The model to use for transcription, current options are `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `whisper-1`.	No
└─ prompt	string	An optional text to guide the model's style or continue a previous audio segment. For `whisper-1`, the prompt is a list of keywords. For `gpt-4o-transcribe` models, the prompt is a free text string, for example "expect words related to technology".	No
instructions	string	The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format, (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The instructions are not guaranteed to be followed by the model, but they provide guidance to the model on the desired behavior. Note that the server sets default instructions which will be used if this field isn't set and are visible in the `session.created` event at the start of the session.	No
max_response_output_tokens	integer or string	Maximum number of output tokens for a single assistant response, inclusive of tool calls. Provide an integer between 1 and 4096 to limit output tokens, or `inf` for the maximum available tokens for a given model. Defaults to `inf`.	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
model	string	The name of the deployment used for this session.	No
output_audio_format	enum	The format of output audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, output audio is sampled at a rate of 24kHz. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
temperature	number	Sampling temperature for the model, limited to [0.6, 1.2]. For audio models a temperature of 0.8 is highly recommended for best performance.	No	0.8
tool_choice	string	How the model chooses tools. Options are `auto`, `none`, `required`, or specify a function.	No	auto
tools	array	Tools (functions) available to the model.	No
turn_detection	object	Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech. Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with `uhhm`, the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.	No
└─ create_response	boolean	Whether or not to automatically generate a response when a VAD stop event occurs.	No	True
└─ eagerness	enum	Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. Possible values: `low`, `medium`, `high`, `auto`	No
└─ interrupt_response	boolean	Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. `conversation` of `auto`) when a VAD start event occurs.	No	True
└─ prefix_padding_ms	integer	Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	enum	Type of turn detection. Possible values: `server_vad`, `semantic_vad`	No
voice	VoiceIdsShared		No
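
Several fields in the table above carry documented constraints that can be checked before the request is sent. The sketch below is one possible client-side check, not part of the API: the function name `build_realtime_session_body` and the deployment name are hypothetical, and only a handful of the fields from the table are covered.

```python
from typing import Any, Union

AUDIO_FORMATS = {"pcm16", "g711_ulaw", "g711_alaw"}


def build_realtime_session_body(
    deployment: str,
    temperature: float = 0.8,
    max_response_output_tokens: Union[int, str] = "inf",
    input_audio_format: str = "pcm16",
    output_audio_format: str = "pcm16",
) -> dict[str, Any]:
    """Validate a few documented constraints and return a JSON-ready body."""
    if not 0.6 <= temperature <= 1.2:
        raise ValueError("temperature must be within [0.6, 1.2]")
    if max_response_output_tokens != "inf":
        limit = max_response_output_tokens
        if not isinstance(limit, int) or not 1 <= limit <= 4096:
            raise ValueError("max_response_output_tokens must be 1-4096 or 'inf'")
    for fmt in (input_audio_format, output_audio_format):
        if fmt not in AUDIO_FORMATS:
            raise ValueError(f"unsupported audio format: {fmt}")
    return {
        "model": deployment,  # name of the deployment used for this session
        "temperature": temperature,
        "max_response_output_tokens": max_response_output_tokens,
        "input_audio_format": input_audio_format,
        "output_audio_format": output_audio_format,
    }


# "my-realtime-deployment" is a placeholder deployment name.
session_body = build_realtime_session_body(
    "my-realtime-deployment", max_response_output_tokens=1024
)
print(session_body)
```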

Responses

Status Code: 200

Description: Session created successfully.

Content-Type	Type	Description
application/json	RealtimeSessionCreateResponse

Create - Transcriptionrealtimesession

POST https://{endpoint}/openai/realtimeapi/transcription_sessions?api-version=2025-04-01-preview

Create an ephemeral API token for use in client-side applications with the Realtime API specifically for realtime transcriptions. Can be configured with the same session parameters as the transcription_session.update client event. It responds with a session object, plus a client_secret key which contains a usable ephemeral API token that can be used to authenticate browser clients for the Realtime API.

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
include	array	The set of items to include in the transcription. Current available items are: - `item.input_audio_transcription.logprobs`	No
input_audio_format	enum	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, input audio must be 16-bit PCM at a 24kHz sample rate, single channel (mono), and little-endian byte order. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
input_audio_noise_reduction	object	Configuration for input audio noise reduction. This can be set to `null` to turn off. Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model. Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.	No
└─ type	enum	Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones. Possible values: `near_field`, `far_field`	No
input_audio_transcription	object	Configuration for input audio transcription. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.	No
└─ language	string	The language of the input audio. Supplying the input language in ISO-639-1 (e.g. `en`) format will improve accuracy and latency.	No
└─ model	enum	The model to use for transcription, current options are `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `whisper-1`. Possible values: `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, `whisper-1`	No
└─ prompt	string	An optional text to guide the model's style or continue a previous audio segment. For `whisper-1`, the prompt is a list of keywords. For `gpt-4o-transcribe` models, the prompt is a free text string, for example "expect words related to technology".	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
turn_detection	object	Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech. Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with `uhhm`, the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.	No
└─ create_response	boolean	Whether or not to automatically generate a response when a VAD stop event occurs. Not available for transcription sessions.	No	True
└─ eagerness	enum	Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. Possible values: `low`, `medium`, `high`, `auto`	No
└─ interrupt_response	boolean	Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. `conversation` of `auto`) when a VAD start event occurs. Not available for transcription sessions.	No	True
└─ prefix_padding_ms	integer	Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	enum	Type of turn detection. Possible values: `server_vad`, `semantic_vad`	No
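
The `turn_detection` sub-fields above are split by mode: `eagerness` applies only to `semantic_vad`, while `prefix_padding_ms`, `silence_duration_ms`, and `threshold` apply only to `server_vad`. The sketch below shows one way a client might guard against mixing them. The helper name `build_turn_detection` is hypothetical, and whether the service rejects or silently ignores misplaced fields is not stated here, so treating them as errors is an assumption.

```python
from typing import Any

SERVER_VAD_ONLY = {"prefix_padding_ms", "silence_duration_ms", "threshold"}
SEMANTIC_VAD_ONLY = {"eagerness"}


def build_turn_detection(vad_type: str, **options: Any) -> dict[str, Any]:
    """Return a turn_detection object, rejecting fields from the other mode."""
    if vad_type not in ("server_vad", "semantic_vad"):
        raise ValueError("type must be 'server_vad' or 'semantic_vad'")
    disallowed = SEMANTIC_VAD_ONLY if vad_type == "server_vad" else SERVER_VAD_ONLY
    misplaced = sorted(disallowed.intersection(options))
    if misplaced:
        raise ValueError(f"{misplaced} not used in {vad_type} mode")
    if "threshold" in options and not 0.0 <= options["threshold"] <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return {"type": vad_type, **options}


# Server VAD with the documented default values written out explicitly.
server_vad = build_turn_detection(
    "server_vad", threshold=0.5, prefix_padding_ms=300, silence_duration_ms=500
)
semantic_vad = build_turn_detection("semantic_vad", eagerness="low")
print(server_vad)
print(semantic_vad)
```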

Responses

Status Code: 200

Description: Session created successfully.

Content-Type	Type	Description
application/json	RealtimeTranscriptionSessionCreateResponse

Responses

POST https://{endpoint}/openai/responses?api-version=2025-04-01-preview

Creates a model response.

Request Body

Content-Type: application/json

Name	Type	Description	Required	Default
include	array		No
input	string or array	Model inputs	Yes
instructions	string	Inserts a system (or developer) message as the first item in the model's context. When used along with `previous_response_id`, the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses.	No
max_output_tokens	integer	An upper bound for the number of tokens that can be generated for a response, including visible output tokens and conversation state.	No
parallel_tool_calls	boolean	Whether to allow the model to run tool calls in parallel.	No	True
previous_response_id	string	The unique ID of the previous response to the model. Use this to create multi-turn conversations. Learn more about conversation state.	No
reasoning	Reasoning	o-series models only. Configuration options for reasoning models.	No
store	boolean	Whether to store the generated model response for later retrieval via API.	No	True
stream	boolean	If set to true, the model response data will be streamed to the client as it is generated using server-sent events. See the Streaming section below for more information.	No	False
text	object	Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - Text inputs and outputs - Structured Outputs	No
└─ format	TextResponseFormatConfiguration	An object specifying the format that the model must output. Configuring `{ "type": "json_schema" }` enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is `{ "type": "text" }` with no additional options. Not recommended for gpt-4o and newer models: Setting to `{ "type": "json_object" }` enables the older JSON mode, which ensures the message the model generates is valid JSON. Using `json_schema` is preferred for models that support it.	No
tool_choice	ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction	How the model should select which tool (or tools) to use when generating a response. See the `tools` parameter to see how to specify which tools the model can call.	No
tools	array	An array of tools the model may call while generating a response. You can specify which tool to use by setting the `tool_choice` parameter. The two categories of tools you can provide the model are: - Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities	No
truncation	enum	The truncation strategy to use for the model response. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: `auto`, `disabled`	No
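
To illustrate how `previous_response_id` and `instructions` interact across turns, the sketch below builds two request bodies without sending them. It is a hedged example: the helper name `build_response_body` and the ID `resp_123` are hypothetical, and only fields from the table above are set. Anything else a real request needs (for example a model or deployment name, which this table does not list) would have to be added.

```python
import json
from typing import Any, Optional


def build_response_body(
    user_input: str,
    instructions: Optional[str] = None,
    previous_response_id: Optional[str] = None,
    max_output_tokens: Optional[int] = None,
    truncation: str = "disabled",
) -> dict[str, Any]:
    """Assemble a JSON-ready body from fields documented in the table."""
    if truncation not in ("auto", "disabled"):
        raise ValueError("truncation must be 'auto' or 'disabled'")
    body: dict[str, Any] = {"input": user_input, "truncation": truncation}
    # Instructions are not carried over from a previous response,
    # so they are included again on every turn where they should apply.
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    if max_output_tokens is not None:
        body["max_output_tokens"] = max_output_tokens
    return body


first_turn = build_response_body("Hello", instructions="Be brief.")
follow_up = build_response_body(
    "And then?",
    instructions="Be brief.",
    previous_response_id="resp_123",  # hypothetical ID from the first response
)
print(json.dumps(follow_up, indent=2))
```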

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	response
text/event-stream	responseStreamEvent

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Get response

GET https://{endpoint}/openai/responses/{response_id}?api-version=2025-04-01-preview

Retrieves a model response with the given ID.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id	path	Yes	string	The ID of the response to retrieve.
include	query	No	array	Additional fields to include in the response. See the `include` parameter for Response creation above for more information.

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here
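
The two header options above can be captured in a small helper that prefers token-based authentication when both credentials are available, in line with the recommendation. This is a sketch only. The function name `build_auth_headers` is hypothetical, and obtaining the token itself (for example with the Azure CLI command shown in the table) is outside its scope.

```python
from typing import Optional


def build_auth_headers(
    token: Optional[str] = None, api_key: Optional[str] = None
) -> dict[str, str]:
    """Return request headers for exactly one authentication method."""
    if token:
        # Token-based authentication is recommended, so it wins if both are set.
        return {"Authorization": f"Bearer {token}"}
    if api_key:
        return {"api-key": api_key}
    raise ValueError("Provide either an auth token or an API key")


# Placeholder credentials for illustration only.
print(build_auth_headers(token="YOUR_AUTH_TOKEN"))
print(build_auth_headers(api_key="YOUR_API_KEY"))
```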

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	response

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Delete response

DELETE https://{endpoint}/openai/responses/{response_id}?api-version=2025-04-01-preview

Deletes a model response with the given ID.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id	path	Yes	string	The ID of the response to delete.

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Status Code: 404

Description: Not Found

Content-Type	Type	Description
application/json	error

Status Code: default

Description: Service unavailable

Content-Type	Type	Description
application/json	errorResponse

Responses API - response item list

GET https://{endpoint}/openai/responses/{response_id}/input_items?api-version=2025-04-01-preview

Returns a list of input items for a given response.

URI Parameters

Name	In	Required	Type	Description
endpoint	path	Yes	string url	Supported Azure OpenAI endpoints (protocol and hostname, for example: `https://aoairesource.openai.azure.com`. Replace "aoairesource" with your Azure OpenAI resource name). https://{your-resource-name}.openai.azure.com
response_id	path	Yes	string	The ID of the response to retrieve input items for.
limit	query	No	integer	A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
order	query	No	string Possible values: `asc`, `desc`	The order to return the input items in. Default is `asc`. - `asc`: Return the input items in ascending order. - `desc`: Return the input items in descending order.
after	query	No	string	An item ID to list items after, used in pagination.
before	query	No	string	An item ID to list items before, used in pagination.

Request Header

Use either token based authentication or API key. Authenticating with token based authentication is recommended and more secure.

Name	Required	Type	Description
Authorization	True	string	Example: `Authorization: Bearer {Azure_OpenAI_Auth_Token}` To generate an auth token using Azure CLI: `az account get-access-token --resource https://cognitiveservices.azure.com` Type: oauth2 Authorization Url: `https://login.microsoftonline.com/common/oauth2/v2.0/authorize` scope: `https://cognitiveservices.azure.com/.default`
api-key	True	string	Provide Azure OpenAI API key here

Responses

Status Code: 200

Description: OK

Content-Type	Type	Description
application/json	responseItemList
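
The query parameters for listing input items can be assembled with the standard library. The sketch below builds the request URL only and does not call the service. The helper name `input_items_url` and the IDs `resp_123` and `item_9` are hypothetical. The `limit` and `order` checks mirror the ranges given in the URI parameters table.

```python
from typing import Optional
from urllib.parse import quote, urlencode

API_VERSION = "2025-04-01-preview"


def input_items_url(
    endpoint: str,
    response_id: str,
    limit: int = 20,
    order: str = "asc",
    after: Optional[str] = None,
    before: Optional[str] = None,
) -> str:
    """Build the GET URL for a response's input items."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    params = {"api-version": API_VERSION, "limit": limit, "order": order}
    if after is not None:
        params["after"] = after
    if before is not None:
        params["before"] = before
    path = f"/openai/responses/{quote(response_id, safe='')}/input_items"
    return f"{endpoint.rstrip('/')}{path}?{urlencode(params)}"


url = input_items_url(
    "https://aoairesource.openai.azure.com", "resp_123", limit=50, after="item_9"
)
print(url)
```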

Components

errorResponse

Name	Type	Description	Required	Default
error	error		No

errorBase

Name	Type	Description	Required	Default
code	string		No
message	string		No

error

Name	Type	Description	Required
inner_error	innerError	Inner error with additional details.	No
param	string		No
type	string		No

innerError

Inner error with additional details.

Name	Type	Description	Required	Default
code	innerErrorCode	Error codes for the inner error object.	No
content_filter_results	contentFilterPromptResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer block list, if it has been filtered, and its id.	No

innerErrorCode

Error codes for the inner error object.

Property	Value
Description	Error codes for the inner error object.
Type	string
Values	`ResponsibleAIPolicyViolation`
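
The nesting of `errorResponse`, `error`, `innerError`, and `innerErrorCode` can be walked defensively, since none of the fields is marked as required. The sketch below is illustrative only: the helper name `is_content_policy_violation` and the sample payload are hypothetical, and the payload shape is inferred from the component tables above.

```python
from typing import Any


def is_content_policy_violation(error_response: dict[str, Any]) -> bool:
    """Check whether an error payload carries the content-policy inner code."""
    error = error_response.get("error") or {}
    inner = error.get("inner_error") or {}
    return inner.get("code") == "ResponsibleAIPolicyViolation"


# Hypothetical payload shaped after the errorResponse component.
sample_error = {
    "error": {
        "code": "content_filter",
        "message": "The request was filtered.",
        "inner_error": {"code": "ResponsibleAIPolicyViolation"},
    }
}

print(is_content_policy_violation(sample_error))
```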

dalleErrorResponse

Name	Type	Description	Required	Default
error	dalleError		No

dalleError

Name	Type	Description	Required
inner_error	dalleInnerError	Inner error with additional details.	No
param	string		No
type	string		No

dalleInnerError

Inner error with additional details.

Name	Type	Description	Required
code	innerErrorCode	Error codes for the inner error object.	No
content_filter_results	dalleFilterResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer block list, if it has been filtered, and its id.	No
revised_prompt	string	The prompt that was used to generate the image, if there was any revision to the prompt.	No

contentFilterCompletionTextSpan

Describes a span within generated completion text. Offset 0 is the first UTF32 code point of the completion text.

Name	Type	Description	Required	Default
completion_end_offset	integer	Offset of the first UTF32 code point which is excluded from the span. This field is always equal to completion_start_offset for empty spans. This field is always larger than completion_start_offset for non-empty spans.	Yes
completion_start_offset	integer	Offset of the UTF32 code point which begins the span.	Yes
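
Because the offsets count UTF32 code points and the end offset is exclusive, they map directly onto Python string slicing, which also indexes by code point. The sketch below shows this under that assumption. The helper name `extract_span` and the sample text are hypothetical. In languages whose strings are indexed by UTF-16 code units, the offsets would need converting first.

```python
def extract_span(completion_text: str, start: int, end: int) -> str:
    """Return the text covered by a span; `end` is exclusive."""
    if not 0 <= start <= end <= len(completion_text):
        raise ValueError("span offsets fall outside the completion text")
    return completion_text[start:end]


# The waving-hand emoji is a single code point, so it occupies one offset.
text = "Hi \U0001F44B there"
print(extract_span(text, 5, 10))
print(repr(extract_span(text, 2, 2)))  # empty span: start equals end
```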

contentFilterResultBase

Name	Type	Description	Required	Default
filtered	boolean		Yes

contentFilterSeverityResult

Name	Type	Description	Required	Default
filtered	boolean		Yes
severity	string		No

contentFilterDetectedResult

Name	Type	Description	Required	Default
detected	boolean		No
filtered	boolean		Yes

contentFilterDetectedWithCitationResult

Name	Type	Required
citation	object	No
└─ URL	string	No
└─ license	string	No

contentFilterDetectedWithCompletionTextSpansResult

Name	Type	Description	Required	Default
details	array		No

contentFilterIdResult

Name	Type	Description	Required	Default
filtered	boolean		Yes
id	string		No

contentFilterResultsBase

Information about the content filtering results.

Name	Type	Description	Required
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No
hate	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
self_harm	contentFilterSeverityResult		No
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No

contentFilterPromptResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer block list, if it has been filtered, and its id.

Name	Type	Description	Required
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No
hate	contentFilterSeverityResult		No
indirect_attack	contentFilterDetectedResult		No
jailbreak	contentFilterDetectedResult		No
profanity	contentFilterDetectedResult		No
self_harm	contentFilterSeverityResult		No
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No

contentFilterChoiceResults

Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about third party text and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer block list, if it has been filtered, and its id.

Name	Type	Description	Required
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
error	errorBase		No
hate	contentFilterSeverityResult		No
profanity	contentFilterDetectedResult		No
protected_material_code	contentFilterDetectedWithCitationResult		No
protected_material_text	contentFilterDetectedResult		No
self_harm	contentFilterSeverityResult		No
sexual	contentFilterSeverityResult		No
ungrounded_material	contentFilterDetectedWithCompletionTextSpansResult		No
violence	contentFilterSeverityResult		No
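
Every category result in the table above has a required `filtered` flag, so a caller can list the categories that triggered filtering with one loop. The sketch below is a hedged illustration: the helper name `flagged_categories` and the sample values are hypothetical. Entries without a `filtered` field (such as `error`) are skipped.

```python
from typing import Any


def flagged_categories(results: dict[str, Any]) -> list[str]:
    """Return the names of categories whose result has filtered == True."""
    flagged = []
    for name, result in results.items():
        if isinstance(result, dict) and result.get("filtered"):
            flagged.append(name)
    return sorted(flagged)


# Hypothetical contentFilterChoiceResults payload.
sample_results = {
    "hate": {"filtered": False, "severity": "low"},
    "violence": {"filtered": True, "severity": "medium"},
    "protected_material_text": {"filtered": True, "detected": True},
    "error": {"code": "example_code", "message": "example message"},
}

print(flagged_categories(sample_results))
```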

contentFilterDetailedResults

Content filtering results with a detail of content filter ids for the filtered segments.

Name	Type	Description	Required	Default
details	array		No
filtered	boolean		Yes

promptFilterResult

Content filtering results for a single prompt in the request.

Name	Type	Description	Required	Default
content_filter_results	contentFilterPromptResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high: a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Also includes information about the customer block list, if it has been filtered, and its id.	No
prompt_index	integer		No

promptFilterResults

Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.

No properties defined for this component.

dalleContentFilterResults

Information about the content filtering results.

Name	Type	Required
hate	contentFilterSeverityResult	No
self_harm	contentFilterSeverityResult	No
sexual	contentFilterSeverityResult	No
violence	contentFilterSeverityResult	No

dalleFilterResults

Name	Type	Description	Required
custom_blocklists	contentFilterDetailedResults	Content filtering results with a detail of content filter ids for the filtered segments.	No
hate	contentFilterSeverityResult		No
jailbreak	contentFilterDetectedResult		No
profanity	contentFilterDetectedResult		No
self_harm	contentFilterSeverityResult		No
sexual	contentFilterSeverityResult		No
violence	contentFilterSeverityResult		No

chatCompletionsRequestCommon

Name	Type	Description	Required	Default
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a json object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will be expanded to other models in a future API release.	No
max_tokens	integer	The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return will be (4096 - prompt tokens). This isn't compatible with o1 series models.	No	4096
metadata	object	Developer-defined tags and values used for filtering completions in the stored completions dashboard.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
stop	string or array	Up to 4 sequences where the API will stop generating further tokens.	No
store	boolean	Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products.	No
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.	No	False
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help Azure OpenAI to monitor and detect abuse.	No
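
When `stream` is set, the response arrives as data-only server-sent events terminated by `data: [DONE]`. The sketch below shows one way to decode such a stream from an iterable of text lines. It is not tied to any HTTP client: the helper name `iter_stream_payloads` and the sample chunk contents are hypothetical, and each event is assumed to fit on a single `data:` line.

```python
import json
from typing import Any, Iterable, Iterator


def iter_stream_payloads(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield decoded JSON payloads until the [DONE] sentinel is seen."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that isn't data
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # the stream is finished; ignore anything after it
        yield json.loads(data)


# Hypothetical stream contents for illustration.
sample_lines = [
    'data: {"id": "chunk-1"}',
    "",
    'data: {"id": "chunk-2"}',
    "data: [DONE]",
    'data: {"id": "ignored"}',
]

payloads = list(iter_stream_payloads(sample_lines))
print(payloads)
```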

createCompletionRequest

Name	Type	Description	Required	Default
best_of	integer	Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results can't be streamed. When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return. `best_of` must be greater than `n`. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
echo	boolean	Echo back the prompt in addition to the completion	No	False
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. As an example, you can pass `{"50256": -100}` to prevent the <\|endoftext\|> token from being generated.	No	None
logprobs	integer	Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response. The maximum value for `logprobs` is 5.	No	None
max_tokens	integer	The maximum number of tokens that can be generated in the completion. The token count of your prompt plus `max_tokens` can't exceed the model's context length.	No	16
n	integer	How many completions to generate for each prompt. Note: Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.	No	1
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
prompt	string or array	The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. Note that <\|endoftext\|> is the document separator that the model sees during training, so if a prompt isn't specified the model will generate as if from the beginning of a new document.	Yes
seed	integer	If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to 4 sequences where the API will stop generating further tokens. The returned text won't contain the stop sequence.	No
stream	boolean	Whether to stream back partial progress. If set, tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message. Example Python code.	No	False
suffix	string	The suffix that comes after a completion of inserted text. This parameter is only supported for `gpt-3.5-turbo-instruct`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
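
The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_completion_request` is a hypothetical helper name, and it only covers a handful of the parameters listed above.

```python
import json


def build_completion_request(prompt, *, max_tokens=16, temperature=1.0,
                             logprobs=None, stop=None, logit_bias=None):
    """Assemble a completions request body, checking the documented limits."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if logprobs is not None and not 0 <= logprobs <= 5:
        raise ValueError("logprobs must be between 0 and 5")
    # stop accepts a single string or an array of up to 4 sequences.
    stops = [stop] if isinstance(stop, str) else list(stop or [])
    if len(stops) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    for token_id, bias in (logit_bias or {}).items():
        if not -100 <= bias <= 100:
            raise ValueError(f"bias for token {token_id} must be in [-100, 100]")

    body = {"prompt": prompt, "max_tokens": max_tokens, "temperature": temperature}
    if logprobs is not None:
        body["logprobs"] = logprobs
    if stops:
        body["stop"] = stops
    if logit_bias:
        body["logit_bias"] = logit_bias
    return body


body = build_completion_request(
    "Once upon a time",
    max_tokens=32,
    stop=["\n\n"],
    logit_bias={"50256": -100},  # example from the table: ban <|endoftext|>
)
print(json.dumps(body, indent=2))
```

Optional parameters that aren't set are left out of the body so that the service defaults shown in the table apply.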

createCompletionResponse

Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

Name	Type	Description	Required
choices	array	The list of completion choices the model generated for the input prompt.	Yes
created	integer	The Unix timestamp (in seconds) of when the completion was created.	Yes
id	string	A unique identifier for the completion.	Yes
model	string	The model used for completion.	Yes
object	enum	The object type, which is always "text_completion". Possible values: `text_completion`	Yes
prompt_filter_results	promptFilterResults	Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.	No
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No
usage	completionUsage	Usage statistics for the completion request.	No
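
Because determinism with `seed` is best effort, it can help to compare `system_fingerprint` across responses before treating two outputs as comparable. The sketch below assumes the responses are already parsed into dictionaries; `backend_changed` is a hypothetical helper, and the sample payloads are hand-written placeholders rather than real service output.

```python
def backend_changed(previous, current):
    """Return True when two completion responses may come from different backends.

    system_fingerprint is optional in the response, so a missing value is
    treated as "unknown" and reported as a change.
    """
    before = previous.get("system_fingerprint")
    after = current.get("system_fingerprint")
    return before is None or after is None or before != after


# Hypothetical, hand-written payloads; real responses carry more fields.
first = {"id": "cmpl-1", "object": "text_completion", "system_fingerprint": "fp_a"}
second = {"id": "cmpl-2", "object": "text_completion", "system_fingerprint": "fp_a"}
third = {"id": "cmpl-3", "object": "text_completion", "system_fingerprint": "fp_b"}

print(backend_changed(first, second))
print(backend_changed(first, third))
```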

createChatCompletionRequest

Name	Type	Description	Required	Default
audio	object	Parameters for audio output. Required when audio output is requested with `modalities: ["audio"]`.	No
└─ format	enum	Specifies the output audio format. Must be one of `wav`, `mp3`, `flac`, `opus`, or `pcm16`. Possible values: `wav`, `mp3`, `flac`, `opus`, `pcm16`	No
└─ voice	enum	Specifies the voice type. Supported voices are `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer`. Possible values: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`	No
data_sources	array	The configuration entries for Azure OpenAI chat extensions that use them. This additional specification is only compatible with Azure OpenAI.	No
frequency_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.	No	0
function_call	string or chatCompletionFunctionCallOption	Deprecated in favor of `tool_choice`. Controls which (if any) function is called by the model. `none` means the model won't call a function and instead generates a message. `auto` means the model can pick between generating a message or calling a function. Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. `none` is the default when no functions are present. `auto` is the default if functions are present.	No
functions	array	Deprecated in favor of `tools`. A list of functions the model may generate JSON inputs for.	No
logit_bias	object	Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.	No	None
logprobs	boolean	Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.	No	False
max_completion_tokens	integer	An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. This is only supported in o1 series models. Support will expand to other models in a future API release.	No
max_tokens	integer	The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.	No
messages	array	A list of messages comprising the conversation so far. Example Python code.	Yes
metadata	object	Developer-defined tags and values used for filtering completions in the stored completions dashboard.	No
modalities	ChatCompletionModalities	Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default: `["text"]` The `gpt-4o-audio-preview` model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: `["text", "audio"]`	No
n	integer	How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.	No	1
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
prediction	PredictionContent	Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.	No
presence_penalty	number	Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.	No	0
reasoning_effort	enum	o1 models only. Constrains effort on reasoning for reasoning models. Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Possible values: `low`, `medium`, `high`	No
response_format	ResponseFormatText or ResponseFormatJsonObject or ResponseFormatJsonSchema	An object specifying the format that the model must output. Compatible with GPT-4o, GPT-4o mini, GPT-4 Turbo and all GPT-3.5 Turbo models newer than `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which guarantee the model will match your supplied JSON schema. Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
seed	integer	This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism isn't guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.	No
stop	string or array	Up to 4 sequences where the API will stop generating further tokens.	No
store	boolean	Whether or not to store the output of this chat completion request for use in our model distillation or evaluation products.	No
stream	boolean	If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message. Example Python code.	No	False
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
tool_choice	chatCompletionToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.	No
tools	array	A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.	No
top_logprobs	integer	An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
user_security_context	userSecurityContext	User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.	No
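
As an illustration of how `tools` and `tool_choice` fit together, the sketch below builds a chat completions request body that forces a single function call. `build_chat_request` and the `get_weather` function are hypothetical. The shape of each `tools` entry (`type` plus a nested `function` object) is assumed here by analogy with the `tool_choice` format in the table above, since the tool schema itself isn't described in this table.

```python
import json


def build_chat_request(messages, tools=None, force_function=None):
    """Build a chat completions body, optionally forcing one function call."""
    body = {"messages": messages}
    if tools:
        if len(tools) > 128:
            raise ValueError("at most 128 functions are supported")
        body["tools"] = tools
        if force_function is not None:
            # Same shape as the tool_choice example in the table above.
            body["tool_choice"] = {
                "type": "function",
                "function": {"name": force_function},
            }
    elif force_function is not None:
        raise ValueError("tool_choice needs a matching entry in tools")
    return body


# Hypothetical tool definition; the parameters value is a JSON Schema object.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

body = build_chat_request(
    [{"role": "user", "content": "What is the weather in Oslo?"}],
    tools=[weather_tool],
    force_function="get_weather",
)
print(json.dumps(body, indent=2))
```

When `tool_choice` is omitted, the defaults described in the table apply: `auto` if tools are present, `none` otherwise.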

userSecurityContext

User security context contains several parameters that describe the AI application itself, and the end user that interacts with the AI application. These fields assist your security operations teams to investigate and mitigate security incidents by providing a comprehensive approach to protecting your AI applications. Learn more about protecting AI applications using Microsoft Defender for Cloud.

Name	Type	Description	Required
application_name	string	The name of the application. Sensitive personal information should not be included in this field.	No
end_user_id	string	This identifier is the Microsoft Entra ID (formerly Azure Active Directory) user object ID used to authenticate end-users within the generative AI application. Sensitive personal information should not be included in this field.	No
end_user_tenant_id	string	The Microsoft 365 tenant ID the end user belongs to. It's required when the generative AI application is multi tenant.	No
source_ip	string	Captures the original client's IP address, accepting both IPv4 and IPv6 formats.	No
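
A `user_security_context` object can be assembled with a light check on `source_ip`. This is a sketch under assumptions: `build_user_security_context` is a hypothetical helper, the sample identifiers are placeholders, and the use of Python's standard `ipaddress` module to accept both IPv4 and IPv6 is a client-side choice rather than the service's own validation.

```python
import ipaddress


def build_user_security_context(application_name=None, end_user_id=None,
                                end_user_tenant_id=None, source_ip=None):
    """Return a user_security_context dict containing only the fields provided."""
    context = {}
    if application_name is not None:
        context["application_name"] = application_name
    if end_user_id is not None:
        context["end_user_id"] = end_user_id
    if end_user_tenant_id is not None:
        context["end_user_tenant_id"] = end_user_tenant_id
    if source_ip is not None:
        # Raises ValueError for anything that is not a valid IPv4 or IPv6 address.
        ipaddress.ip_address(source_ip)
        context["source_ip"] = source_ip
    return context


# Placeholder values for illustration only.
context = build_user_security_context(
    application_name="contoso-support-bot",
    end_user_id="00000000-0000-0000-0000-000000000000",
    source_ip="2001:db8::1",
)
print(context)
```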

chatCompletionFunctions

Name	Type	Description	Required
description	string	A description of what the function does, used by the model to choose when and how to call the function.	No
name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	Yes
parameters	FunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. Omitting `parameters` defines a function with an empty parameter list.	No
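
The naming rule for `name` can be expressed as a regular expression. The sketch below is one possible client-side check; `is_valid_function_name` is a hypothetical helper, and the minimum length of one character is an assumption, because the table only states the maximum.

```python
import re

# a-z, A-Z, 0-9, underscores and dashes, up to 64 characters.
# The lower bound of 1 character is an assumption, not stated in the table.
FUNCTION_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_function_name(name):
    """Return True when name satisfies the documented function naming rule."""
    return FUNCTION_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_function_name("get_weather"))
print(is_valid_function_name("get weather"))
```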

chatCompletionFunctionCallOption

Specifying a particular function via {"name": "my_function"} forces the model to call that function.

Name	Type	Description	Required	Default
name	string	The name of the function to call.	Yes

chatCompletionFunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

No properties defined for this component.

chatCompletionRequestMessage

This component can be one of the following:

ChatCompletionRequestDeveloperMessage
chatCompletionRequestSystemMessage
chatCompletionRequestUserMessage
chatCompletionRequestAssistantMessage
chatCompletionRequestToolMessage
chatCompletionRequestFunctionMessage

ChatCompletionRequestDeveloperMessage

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, developer messages replace the previous system messages.

Name	Type	Description	Required
content	string or array	The contents of the developer message.	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No
role	enum	The role of the message's author, in this case `developer`. Possible values: `developer`	Yes

chatCompletionRequestSystemMessage

Name	Type	Description	Required
content	string or array	The contents of the system message.	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No
role	enum	The role of the message's author, in this case `system`. Possible values: `system`	Yes

chatCompletionRequestUserMessage

Name	Type	Description	Required
content	string or array	The contents of the user message.	Yes
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No
role	enum	The role of the message's author, in this case `user`. Possible values: `user`	Yes

chatCompletionRequestAssistantMessage

Name	Type	Description	Required
content	string or array	The contents of the assistant message. Required unless `tool_calls` or `function_call` is specified.	No
function_call	object	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No
└─ arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	No
└─ name	string	The name of the function to call.	No
name	string	An optional name for the participant. Provides the model information to differentiate between participants of the same role.	No
refusal	string	The refusal message by the assistant.	No
role	enum	The role of the message's author, in this case `assistant`. Possible values: `assistant`	Yes
tool_calls	chatCompletionMessageToolCalls	The tool calls generated by the model, such as function calls.	No
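
Because the model doesn't always generate valid JSON in `arguments` and may hallucinate parameters, it's worth validating the string before calling your function. The following sketch shows one way to do that with the standard library; `parse_tool_arguments` and its `allowed` and `required` parameters are hypothetical names, and a full JSON Schema validator could be used instead.

```python
import json


def parse_tool_arguments(arguments, allowed, required=()):
    """Parse model-generated arguments and reject anything outside the schema."""
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must decode to a JSON object")
    unknown = set(parsed) - set(allowed)
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    missing = set(required) - set(parsed)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    return parsed


args = parse_tool_arguments(
    '{"city": "Oslo"}',
    allowed={"city", "unit"},
    required={"city"},
)
print(args)
```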

chatCompletionRequestToolMessage

Name	Type	Description	Required
content	string or array	The contents of the tool message.	Yes
role	enum	The role of the message's author, in this case `tool`. Possible values: `tool`	Yes
tool_call_id	string	Tool call that this message is responding to.	Yes

chatCompletionRequestFunctionMessage

Name	Type	Description	Required
content	string	The contents of the function message.	Yes
name	string	The name of the function to call.	Yes
role	enum	The role of the message's author, in this case `function`. Possible values: `function`	Yes

chatCompletionRequestDeveloperMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

chatCompletionRequestSystemMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

chatCompletionRequestUserMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText
chatCompletionRequestMessageContentPartImage
chatCompletionRequestMessageContentPartAudio

chatCompletionRequestAssistantMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText
chatCompletionRequestMessageContentPartRefusal

chatCompletionRequestToolMessageContentPart

This component can be one of the following:

chatCompletionRequestMessageContentPartText

chatCompletionRequestMessageContentPartText

Name	Type	Description	Required	Default
text	string	The text content.	Yes
type	enum	The type of the content part. Possible values: `text`	Yes

chatCompletionRequestMessageContentPartAudio

Name	Type	Description	Required
input_audio	object		Yes
└─ data	string	Base64 encoded audio data.	No
└─ format	enum	The format of the encoded audio data. Currently supports "wav" and "mp3". Possible values: `wav`, `mp3`	No
type	enum	The type of the content part. Always `input_audio`. Possible values: `input_audio`	Yes
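
As a sketch of how an audio content part might be produced from raw bytes, the helper below Base64-encodes the data and checks the format against the two supported values. `audio_content_part` is a hypothetical name, and the sample bytes are a placeholder rather than a playable audio file.

```python
import base64


def audio_content_part(audio_bytes, audio_format="wav"):
    """Build an input_audio content part from raw audio bytes."""
    if audio_format not in ("wav", "mp3"):
        raise ValueError("format must be 'wav' or 'mp3'")
    return {
        "type": "input_audio",
        "input_audio": {
            # The API expects Base64 text, not raw bytes.
            "data": base64.b64encode(audio_bytes).decode("ascii"),
            "format": audio_format,
        },
    }


# Placeholder bytes; in practice, read them from a .wav or .mp3 file.
part = audio_content_part(b"RIFF-placeholder", audio_format="wav")
print(part["type"], part["input_audio"]["format"])
```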

chatCompletionRequestMessageContentPartImage

Name	Type	Description	Required
image_url	object		Yes
└─ detail	enum	Specifies the detail level of the image. Learn more in the Vision guide. Possible values: `auto`, `low`, `high`	No
└─ url	string	Either a URL of the image or the base64 encoded image data.	No
type	enum	The type of the content part. Possible values: `image_url`	Yes
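
The example below sketches a user message that combines a text part with an image part built from local bytes. `image_part_from_bytes` is a hypothetical helper. Wrapping the Base64 data in a `data:` URL with a MIME type is an assumption about how inline image data is passed in the `url` field; check the Vision guide for the exact form your API version expects.

```python
import base64


def image_part_from_bytes(image_bytes, mime_type="image/png", detail="auto"):
    """Build an image_url content part that carries the image inline."""
    if detail not in ("auto", "low", "high"):
        raise ValueError("detail must be 'auto', 'low', or 'high'")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {
            # Assumed data-URL form for inline Base64 image data.
            "url": f"data:{mime_type};base64,{encoded}",
            "detail": detail,
        },
    }


# Placeholder bytes; in practice, read them from an image file.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        image_part_from_bytes(b"\x89PNG-placeholder", detail="low"),
    ],
}
print([part["type"] for part in message["content"]])
```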

chatCompletionRequestMessageContentPartRefusal

Name	Type	Description	Required	Default
refusal	string	The refusal message generated by the model.	Yes
type	enum	The type of the content part. Possible values: `refusal`	Yes

azureChatExtensionConfiguration

A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.

Discriminator for azureChatExtensionConfiguration

This component uses the property type to discriminate between different types:

Type Value	Schema
`azure_search`	azureSearchChatExtensionConfiguration
`azure_cosmos_db`	azureCosmosDBChatExtensionConfiguration
`elasticsearch`	elasticsearchChatExtensionConfiguration
`mongo_db`	mongoDBChatExtensionConfiguration
`pinecone`	pineconeChatExtensionConfiguration

Name	Type	Description	Required	Default
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

azureChatExtensionType

Property	Value
Description	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.
Type	string
Values	`azure_search` `azure_cosmos_db` `elasticsearch` `mongo_db` `pinecone`

azureSearchChatExtensionConfiguration

A specific representation of configurable options for Azure Search when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
parameters	azureSearchChatExtensionParameters	Parameters for Azure Search when used as an Azure OpenAI chat extension.	No
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

azureSearchChatExtensionParameters

Parameters for Azure Search when used as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
authentication	onYourDataApiKeyAuthenticationOptions or onYourDataSystemAssignedManagedIdentityAuthenticationOptions or onYourDataUserAssignedManagedIdentityAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions		Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataIntegratedVectorizationSource		No
endpoint	string	The absolute endpoint path for the Azure Search resource to use.	Yes
fields_mapping	azureSearchIndexFieldMappingOptions	Optional settings to control how fields are processed when using a configured Azure Search resource.	No
filter	string	Search filter.	No
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
index_name	string	The name of the index to use as available in the referenced Azure Search resource.	Yes
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
query_type	azureSearchQueryType	The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.	No
semantic_configuration	string	The additional semantic configuration for the query.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No

azureSearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Search resource.

Name	Type	Description	Required
content_fields	array	The names of index fields that should be treated as content.	No
content_fields_separator	string	The separator pattern that content fields should use.	No
filepath_field	string	The name of the index field to use as a filepath.	No
image_vector_fields	array	The names of fields that represent image vector data.	No
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
vector_fields	array	The names of fields that represent vector data.	No

azureSearchQueryType

The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Property	Value
Description	The type of Azure Search retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Type	string
Values	`simple` `semantic` `vector` `vector_simple_hybrid` `vector_semantic_hybrid`
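
To show how the Azure Search pieces combine into a `data_sources` entry, the sketch below builds an `azure_search` configuration from the required parameters and validates `query_type` against the values listed above. `azure_search_data_source` is a hypothetical helper, the endpoint and index name are placeholders, and the `{"type": "api_key", "key": ...}` authentication shape is an assumption, because the authentication option schemas are defined elsewhere in this reference.

```python
import json

AZURE_SEARCH_QUERY_TYPES = {
    "simple", "semantic", "vector",
    "vector_simple_hybrid", "vector_semantic_hybrid",
}


def azure_search_data_source(endpoint, index_name, authentication,
                             query_type=None, **optional):
    """Build one data_sources entry for the azure_search chat extension."""
    parameters = {
        "endpoint": endpoint,
        "index_name": index_name,
        "authentication": authentication,
    }
    if query_type is not None:
        if query_type not in AZURE_SEARCH_QUERY_TYPES:
            raise ValueError(f"unsupported query_type: {query_type}")
        parameters["query_type"] = query_type
    # Remaining optional parameters (strictness, top_n_documents, ...) pass through.
    parameters.update(optional)
    return {"type": "azure_search", "parameters": parameters}


source = azure_search_data_source(
    endpoint="https://YOUR_SEARCH_RESOURCE.search.windows.net",  # placeholder
    index_name="YOUR_INDEX",                                      # placeholder
    authentication={"type": "api_key", "key": "YOUR_SEARCH_KEY"},  # assumed shape
    query_type="vector_simple_hybrid",
    top_n_documents=5,
    in_scope=True,
)
print(json.dumps({"data_sources": [source]}, indent=2))
```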

azureCosmosDBChatExtensionConfiguration

A specific representation of configurable options for Azure Cosmos DB when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
parameters	azureCosmosDBChatExtensionParameters	Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.	No
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

azureCosmosDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Azure Cosmos DB for MongoDB vCore.

Name	Type	Description	Required	Default
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
authentication	onYourDataConnectionStringAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using a connection string.	Yes
container_name	string	The name of the Azure Cosmos DB resource container.	Yes
database_name	string	The MongoDB vCore database name to use with Azure Cosmos DB.	Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource		Yes
fields_mapping	azureCosmosDBFieldMappingOptions	Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.	Yes
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
index_name	string	The MongoDB vCore index name to use with Azure Cosmos DB.	Yes
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No

azureCosmosDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Azure Cosmos DB resource.

Name	Type	Description	Required
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No
filepath_field	string	The name of the index field to use as a filepath.	No
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
vector_fields	array	The names of fields that represent vector data.	Yes
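
Unlike Azure Search, the Azure Cosmos DB extension marks `embedding_dependency`, `fields_mapping`, and two of the field mapping entries as required. The sketch below lists whatever is missing from a parameters object before it's sent. `missing_cosmos_fields` is a hypothetical helper, and the empty dictionaries stand in for the authentication and vectorization objects, whose shapes are defined by their own schemas.

```python
REQUIRED_PARAMETERS = (
    "authentication", "container_name", "database_name",
    "embedding_dependency", "fields_mapping", "index_name",
)
REQUIRED_FIELD_MAPPINGS = ("content_fields", "vector_fields")


def missing_cosmos_fields(parameters):
    """Return the required Azure Cosmos DB parameters that are not present."""
    missing = [name for name in REQUIRED_PARAMETERS if name not in parameters]
    mapping = parameters.get("fields_mapping", {})
    missing += [
        f"fields_mapping.{name}"
        for name in REQUIRED_FIELD_MAPPINGS
        if name not in mapping
    ]
    return missing


parameters = {
    "authentication": {},        # placeholder; see the authentication schema
    "container_name": "docs",    # placeholder value
    "database_name": "kb",       # placeholder value
    "embedding_dependency": {},  # placeholder; see the vectorization schemas
    "fields_mapping": {"content_fields": ["body"]},
}
print(missing_cosmos_fields(parameters))
```

In this sample, `index_name` and `fields_mapping.vector_fields` are reported because neither is set.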

elasticsearchChatExtensionConfiguration

A specific representation of configurable options for Elasticsearch when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
parameters	elasticsearchChatExtensionParameters	Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.	No
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

elasticsearchChatExtensionParameters

Parameters to use when configuring Elasticsearch® as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
authentication	onYourDataKeyAndKeyIdAuthenticationOptions or onYourDataEncodedApiKeyAuthenticationOptions		Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource or onYourDataModelIdVectorizationSource		No
endpoint	string	The endpoint of Elasticsearch®.	Yes
fields_mapping	elasticsearchIndexFieldMappingOptions	Optional settings to control how fields are processed when using a configured Elasticsearch® resource.	No
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
index_name	string	The index name of Elasticsearch®.	Yes
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
query_type	elasticsearchQueryType	The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No

elasticsearchIndexFieldMappingOptions

Optional settings to control how fields are processed when using a configured Elasticsearch® resource.

Name	Type	Description	Required
content_fields	array	The names of index fields that should be treated as content.	No
content_fields_separator	string	The separator pattern that content fields should use.	No
filepath_field	string	The name of the index field to use as a filepath.	No
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
vector_fields	array	The names of fields that represent vector data.	No

elasticsearchQueryType

The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.

Property	Value
Description	The type of Elasticsearch® retrieval query that should be executed when using it as an Azure OpenAI chat extension.
Type	string
Values	`simple` `vector`

mongoDBChatExtensionConfiguration

A specific representation of configurable options for Mongo DB when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
parameters	mongoDBChatExtensionParameters	Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.	No
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

mongoDBChatExtensionParameters

Parameters to use when configuring Azure OpenAI On Your Data chat extensions when using Mongo DB.

Name	Type	Description	Required	Default
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
app_name	string	The name of the Mongo DB Application.	Yes
authentication	onYourDataUsernameAndPasswordAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using a username and a password.	Yes
collection_name	string	The name of the Mongo DB Collection.	Yes
database_name	string	The name of the Mongo DB database.	Yes
embedding_dependency	onYourDataEndpointVectorizationSource or onYourDataDeploymentNameVectorizationSource		Yes
endpoint	string	The name of the Mongo DB cluster endpoint.	Yes
fields_mapping	mongoDBFieldMappingOptions	Optional settings to control how fields are processed when using a configured Mongo DB resource.	Yes
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
index_name	string	The name of the Mongo DB index.	Yes
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No
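
As an illustration, the required fields in the table above could be assembled into a parameters object along these lines. This is a minimal sketch: every value is a placeholder, the field names under `fields_mapping` are invented for the example, and the helper `missing_required` is hypothetical rather than part of any SDK.

```python
# Required fields according to the mongoDBChatExtensionParameters table.
REQUIRED_MONGO_DB_FIELDS = {
    "app_name", "authentication", "collection_name", "database_name",
    "embedding_dependency", "endpoint", "fields_mapping", "index_name",
}


def missing_required(parameters: dict) -> set:
    """Return the required field names that are absent from `parameters`."""
    return REQUIRED_MONGO_DB_FIELDS - set(parameters)


# Placeholder values only; replace them with your own resource details.
mongo_db_parameters = {
    "endpoint": "YOUR_CLUSTER_ENDPOINT",
    "app_name": "YOUR_APP_NAME",
    "database_name": "YOUR_DATABASE",
    "collection_name": "YOUR_COLLECTION",
    "index_name": "YOUR_INDEX",
    "authentication": {
        "type": "username_and_password",
        "username": "YOUR_USERNAME",
        "password": "YOUR_PASSWORD",
    },
    "embedding_dependency": {
        "type": "deployment_name",
        "deployment_name": "YOUR_EMBEDDING_DEPLOYMENT",
    },
    "fields_mapping": {
        "content_fields": ["content"],        # hypothetical field name
        "vector_fields": ["contentVector"],   # hypothetical field name
    },
    "top_n_documents": 5,  # optional
}

print(missing_required(mongo_db_parameters))
```

Checking for missing fields on the client side is optional; it only surfaces configuration mistakes before a request is sent.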

mongoDBFieldMappingOptions

Optional settings to control how fields are processed when using a configured Mongo DB resource.

Name	Type	Description	Required
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No
filepath_field	string	The name of the index field to use as a filepath.	No
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No
vector_fields	array	The names of fields that represent vector data.	Yes

pineconeChatExtensionConfiguration

A specific representation of configurable options for Pinecone when using it as an Azure OpenAI chat extension.

Name	Type	Description	Required	Default
parameters	pineconeChatExtensionParameters	Parameters for configuring Azure OpenAI Pinecone chat extensions.	No
type	azureChatExtensionType	A representation of configuration data for a single Azure OpenAI chat extension. This will be used by a chat completions request that should use Azure OpenAI chat extensions to augment the response behavior. The use of this configuration is compatible only with Azure OpenAI.	Yes

pineconeChatExtensionParameters

Parameters for configuring Azure OpenAI Pinecone chat extensions.

Name	Type	Description	Required	Default
allow_partial_result	boolean	If specified as true, the system will allow partial search results to be used and the request fails if all the queries fail. If not specified, or specified as false, the request will fail if any search query fails.	No	False
authentication	onYourDataApiKeyAuthenticationOptions	The authentication options for Azure OpenAI On Your Data when using an API key.	Yes
embedding_dependency	onYourDataDeploymentNameVectorizationSource	The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.	Yes
environment	string	The environment name of Pinecone.	Yes
fields_mapping	pineconeFieldMappingOptions	Optional settings to control how fields are processed when using a configured Pinecone resource.	Yes
in_scope	boolean	Whether queries should be restricted to use of indexed data.	No
include_contexts	array	The included properties of the output context. If not specified, the default value is `citations` and `intent`.	No
index_name	string	The name of the Pinecone database index.	Yes
max_search_queries	integer	The maximum number of rewritten queries that should be sent to the search provider for one user message. If not specified, the system will decide the number of queries to send.	No
strictness	integer	The configured strictness of the search relevance filtering. The higher the strictness, the higher the precision but the lower the recall of the answer.	No
top_n_documents	integer	The configured top number of documents to feature for the configured query.	No

pineconeFieldMappingOptions

Optional settings to control how fields are processed when using a configured Pinecone resource.

Name	Type	Description	Required
content_fields	array	The names of index fields that should be treated as content.	Yes
content_fields_separator	string	The separator pattern that content fields should use.	No
filepath_field	string	The name of the index field to use as a filepath.	No
title_field	string	The name of the index field to use as a title.	No
url_field	string	The name of the index field to use as a URL.	No

onYourDataAuthenticationOptions

The authentication options for Azure OpenAI On Your Data.

Discriminator for onYourDataAuthenticationOptions

This component uses the property type to discriminate between different types:

Type Value	Schema
`api_key`	onYourDataApiKeyAuthenticationOptions
`connection_string`	onYourDataConnectionStringAuthenticationOptions
`key_and_key_id`	onYourDataKeyAndKeyIdAuthenticationOptions
`encoded_api_key`	onYourDataEncodedApiKeyAuthenticationOptions
`access_token`	onYourDataAccessTokenAuthenticationOptions
`system_assigned_managed_identity`	onYourDataSystemAssignedManagedIdentityAuthenticationOptions
`user_assigned_managed_identity`	onYourDataUserAssignedManagedIdentityAuthenticationOptions
`username_and_password`	onYourDataUsernameAndPasswordAuthenticationOptions

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
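
The discriminator can be pictured as a lookup from the `type` value to the credential fields that the matching schema in this reference lists. The sketch below is illustrative only; `AUTH_FIELDS` and `build_authentication` are hypothetical names, and the check for unexpected fields is a client-side convenience, not documented service behavior.

```python
# Credential fields listed by each authentication schema in this reference,
# keyed by the discriminating `type` value.
AUTH_FIELDS = {
    "api_key": ["key"],
    "connection_string": ["connection_string"],
    "key_and_key_id": ["key", "key_id"],
    "encoded_api_key": ["encoded_api_key"],
    "access_token": ["access_token"],
    "system_assigned_managed_identity": [],
    "user_assigned_managed_identity": ["managed_identity_resource_id"],
    "username_and_password": ["username", "password"],
}


def build_authentication(auth_type: str, **credentials: str) -> dict:
    """Build an authentication object, rejecting fields the type does not list."""
    if auth_type not in AUTH_FIELDS:
        raise ValueError(f"unknown authentication type: {auth_type}")
    unexpected = set(credentials) - set(AUTH_FIELDS[auth_type])
    if unexpected:
        raise ValueError(f"fields not listed for {auth_type}: {sorted(unexpected)}")
    return {"type": auth_type, **credentials}


print(build_authentication("api_key", key="YOUR_API_KEY"))
```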

onYourDataContextProperty

The context property.

Property	Value
Description	The context property.
Type	string
Values	`citations` `intent` `all_retrieved_documents`

onYourDataAuthenticationType

The authentication types supported with Azure OpenAI On Your Data.

Property	Value
Description	The authentication types supported with Azure OpenAI On Your Data.
Type	string
Values	`api_key` `connection_string` `key_and_key_id` `encoded_api_key` `access_token` `system_assigned_managed_identity` `user_assigned_managed_identity` `username_and_password`

onYourDataApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an API key.

Name	Type	Description	Required	Default
key	string	The API key to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataConnectionStringAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a connection string.

Name	Type	Description	Required	Default
connection_string	string	The connection string to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataKeyAndKeyIdAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch key and key ID pair.

Name	Type	Description	Required
key	string	The Elasticsearch key to use for authentication.	No
key_id	string	The Elasticsearch key ID to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataEncodedApiKeyAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an Elasticsearch encoded API key.

Name	Type	Description	Required	Default
encoded_api_key	string	The Elasticsearch encoded API key to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataAccessTokenAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using an access token.

Name	Type	Description	Required	Default
access_token	string	The access token to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataSystemAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a system-assigned managed identity.

Name	Type	Description	Required	Default
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataUserAssignedManagedIdentityAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a user-assigned managed identity.

Name	Type	Description	Required	Default
managed_identity_resource_id	string	The resource ID of the user-assigned managed identity to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes

onYourDataUsernameAndPasswordAuthenticationOptions

The authentication options for Azure OpenAI On Your Data when using a username and a password.

Name	Type	Description	Required
password	string	The password to use for authentication.	No
type	onYourDataAuthenticationType	The authentication types supported with Azure OpenAI On Your Data.	Yes
username	string	The username to use for authentication.	No

onYourDataVectorizationSource

An abstract representation of a vectorization source for Azure OpenAI On Your Data with vector search.

This component uses the property type to discriminate between different types:

Type Value	Schema
`endpoint`	onYourDataEndpointVectorizationSource
`deployment_name`	onYourDataDeploymentNameVectorizationSource
`integrated`	onYourDataIntegratedVectorizationSource
`model_id`	onYourDataModelIdVectorizationSource

Name	Type	Description	Required	Default
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

onYourDataVectorizationSourceType

Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.

Property	Value
Description	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.
Type	string
Values	`endpoint` `deployment_name` `integrated` `model_id`

onYourDataEndpointVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a public Azure OpenAI endpoint call for embeddings.

Name	Type	Description	Required
authentication	onYourDataApiKeyAuthenticationOptions or onYourDataAccessTokenAuthenticationOptions		No
dimensions	integer	The number of dimensions the embeddings should have. Only supported in `text-embedding-3` and later models.	No
endpoint	string	Specifies the resource endpoint URL from which embeddings should be retrieved. It should be in the format of `https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/embeddings`. The api-version query parameter isn't allowed.	No
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes
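
The endpoint format described above can be checked before a request is sent. The following is a rough sketch using only the Python standard library; `check_embeddings_endpoint` is a hypothetical helper, and the checks reflect just the constraints stated in the table (the URL shape and the disallowed `api-version` query parameter).

```python
from urllib.parse import parse_qs, urlsplit


def check_embeddings_endpoint(url: str) -> list:
    """Return a list of problems found with an embeddings endpoint URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme != "https":
        problems.append("endpoint should use https")
    if not parts.path.endswith("/embeddings"):
        problems.append("path should end with /embeddings")
    # keep_blank_values=True also catches a bare "?api-version=".
    if "api-version" in parse_qs(parts.query, keep_blank_values=True):
        problems.append("api-version query parameter isn't allowed")
    return problems


print(check_embeddings_endpoint(
    "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/"
    "YOUR_DEPLOYMENT_NAME/embeddings"
))
```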

onYourDataDeploymentNameVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on an internal embeddings model deployment name in the same Azure OpenAI resource.

Name	Type	Description	Required
deployment_name	string	Specifies the name of the model deployment to use for vectorization. This model deployment must be in the same Azure OpenAI resource, but On Your Data will use this model deployment via an internal call rather than a public one, which enables vector search even in private networks.	No
dimensions	integer	The number of dimensions the embeddings should have. Only supported in `text-embedding-3` and later models.	No
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

onYourDataIntegratedVectorizationSource

Represents the integrated vectorizer defined within the search resource.

Name	Type	Description	Required	Default
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

onYourDataModelIdVectorizationSource

The details of a vectorization source, used by Azure OpenAI On Your Data when applying vector search, that is based on a search service model ID. Currently only supported by Elasticsearch®.

Name	Type	Description	Required	Default
model_id	string	Specifies the model ID to use for vectorization. This model ID must be defined in the search service.	No
type	onYourDataVectorizationSourceType	Represents the available sources Azure OpenAI On Your Data can use to configure vectorization of data for use with vector search.	Yes

azureChatExtensionsMessageContext

A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.

Name	Type	Description	Required
all_retrieved_documents	array	All the retrieved documents.	No
citations	array	The data source retrieval result, used to generate the assistant message in the response.	No
intent	string	The detected intent from the chat history, used to pass to the next turn to carry over the context.	No

citation

Citation information for a chat completions response message.

Name	Type	Description	Required
chunk_id	string	The chunk ID of the citation.	No
content	string	The content of the citation.	Yes
filepath	string	The file path of the citation.	No
rerank_score	number	The rerank score of the retrieved document.	No
title	string	The title of the citation.	No
url	string	The URL of the citation.	No

retrievedDocument

The retrieved document.

Name	Type	Description	Required
chunk_id	string	The chunk ID of the citation.	No
content	string	The content of the citation.	Yes
data_source_index	integer	The index of the data source.	No
filepath	string	The file path of the citation.	No
filter_reason	filterReason	The filtering reason of the retrieved document.	No
original_search_score	number	The original search score of the retrieved document.	No
rerank_score	number	The rerank score of the retrieved document.	No
search_queries	array	The search queries used to retrieve the document.	No
title	string	The title of the citation.	No
url	string	The URL of the citation.	No

filterReason

The filtering reason of the retrieved document.

Property	Value
Description	The filtering reason of the retrieved document.
Type	string
Values	`score` `rerank`
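
To show how the `retrievedDocument` and `filterReason` fields might be consumed together, the sketch below groups retrieved documents by their `filter_reason`. The sample `context` is invented for illustration, and treating a missing `filter_reason` as "not filtered" is an assumption, not something this reference states.

```python
def group_by_filter_reason(context: dict) -> dict:
    """Group document contents by filter_reason (None when the field is absent)."""
    groups = {}
    for doc in context.get("all_retrieved_documents", []):
        reason = doc.get("filter_reason")
        groups.setdefault(reason, []).append(doc["content"])
    return groups


# Invented sample shaped like azureChatExtensionsMessageContext.
context = {
    "intent": "example intent",
    "all_retrieved_documents": [
        {"content": "Document A", "data_source_index": 0, "rerank_score": 3.1},
        {"content": "Document B", "data_source_index": 0, "filter_reason": "score"},
        {"content": "Document C", "data_source_index": 0, "filter_reason": "rerank"},
    ],
}

print(group_by_filter_reason(context))
```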

chatCompletionMessageToolCall

Name	Type	Description	Required
function	object	The function that the model called.	Yes
└─ arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	No
└─ name	string	The name of the function to call.	No
id	string	The ID of the tool call.	Yes
type	toolCallType	The type of the tool call, in this case `function`.	Yes
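
Because the `arguments` string is not guaranteed to be valid JSON or to match your function schema, it helps to validate it before dispatching. The sketch below is one possible approach; `parse_tool_arguments`, the `allowed` parameter set, and the sample tool call are hypothetical.

```python
import json


def parse_tool_arguments(tool_call: dict, allowed: set) -> dict:
    """Parse and validate the model-generated arguments of a tool call."""
    raw = tool_call["function"].get("arguments") or "{}"
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must decode to a JSON object")
    unknown = set(args) - allowed
    if unknown:
        raise ValueError(f"parameters not in the function schema: {sorted(unknown)}")
    return args


# Invented sample shaped like chatCompletionMessageToolCall.
tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Oslo\"}"},
}

print(parse_tool_arguments(tool_call, allowed={"city", "unit"}))
```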

toolCallType

The type of the tool call, in this case function.

Property	Value
Description	The type of the tool call, in this case `function`.
Type	string
Values	`function`

chatCompletionRequestMessageTool

Name	Type	Description	Required	Default
content	string	The contents of the message.	No
tool_call_id	string	Tool call that this message is responding to.	No

chatCompletionRequestMessageFunction

Name	Type	Description	Required
content	string	The contents of the message.	No
name	string	The name of the function to call.	No
role	enum	The role of the message's author, in this case `function`. Possible values: `function`	No

createChatCompletionResponse

Represents a chat completion response returned by the model, based on the provided input.

Name	Type	Description	Required
choices	array	A list of chat completion choices. Can be more than one if `n` is greater than 1.	Yes
created	integer	The Unix timestamp (in seconds) of when the chat completion was created.	Yes
id	string	A unique identifier for the chat completion.	Yes
model	string	The model used for the chat completion.	Yes
object	enum	The object type, which is always `chat.completion`. Possible values: `chat.completion`	Yes
prompt_filter_results	promptFilterResults	Content filtering results for zero or more prompts in the request. In a streaming request, results for different prompts may arrive at different times or in different orders.	No
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No
usage	completionUsage	Usage statistics for the completion request.	No

createChatCompletionStreamResponse

Represents a streamed chunk of a chat completion response returned by the model, based on the provided input.

Name	Type	Description	Required
choices	array	A list of chat completion choices. Can contain more than one element if `n` is greater than 1.	Yes
created	integer	The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp.	Yes
id	string	A unique identifier for the chat completion. Each chunk has the same ID.	Yes
model	string	The model used to generate the completion.	Yes
object	enum	The object type, which is always `chat.completion.chunk`. Possible values: `chat.completion.chunk`	Yes
system_fingerprint	string	This fingerprint represents the backend configuration that the model runs with. Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.	No

chatCompletionStreamResponseDelta

A chat completion delta generated by streamed model responses.

Name	Type	Description	Required
content	string	The contents of the chunk message.	No
function_call	object	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No
└─ arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	No
└─ name	string	The name of the function to call.	No
refusal	string	The refusal message generated by the model.	No
role	enum	The role of the author of this message. Possible values: `system`, `user`, `assistant`, `tool`	No
tool_calls	array		No

chatCompletionMessageToolCallChunk

Name	Type	Description	Required
function	object		No
└─ arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	No
└─ name	string	The name of the function to call.	No
id	string	The ID of the tool call.	No
index	integer		Yes
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: `function`	No
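
When streaming, a tool call typically arrives as several chunks that share the same `index`. The sketch below reassembles them, assuming that chunks with the same `index` belong to one call and that `arguments` fragments should be concatenated in arrival order; `merge_tool_call_chunks` and the sample chunks are hypothetical.

```python
def merge_tool_call_chunks(chunks: list) -> list:
    """Merge chatCompletionMessageToolCallChunk items into complete tool calls."""
    calls = {}
    for chunk in chunks:
        call = calls.setdefault(
            chunk["index"],
            {"id": None, "type": None, "function": {"name": "", "arguments": ""}},
        )
        if chunk.get("id"):
            call["id"] = chunk["id"]
        if chunk.get("type"):
            call["type"] = chunk["type"]
        function = chunk.get("function") or {}
        call["function"]["name"] += function.get("name") or ""
        call["function"]["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


# Invented sample chunks for a single tool call.
chunks = [
    {"index": 0, "id": "call_1", "type": "function",
     "function": {"name": "get_weather", "arguments": ""}},
    {"index": 0, "function": {"arguments": "{\"city\":"}},
    {"index": 0, "function": {"arguments": " \"Oslo\"}"}},
]

print(merge_tool_call_chunks(chunks))
```

The merged `arguments` string should still be validated as JSON before it is used.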

chatCompletionStreamOptions

Options for streaming response. Only set this when you set stream: true.

Name	Type	Description	Required	Default
include_usage	boolean	If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array. All other chunks will also include a `usage` field, but with a null value.	No
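
With `include_usage` set, the usage statistics can be picked out of the stream by looking for the chunk whose `usage` field is not null. A minimal sketch follows, using already-parsed chunks; `final_usage` and the sample data are hypothetical.

```python
def final_usage(chunks: list):
    """Return the usage object from the stream, or None if no chunk carries one."""
    for chunk in chunks:
        if chunk.get("usage") is not None:
            return chunk["usage"]
    return None


# Invented sample: regular chunks carry usage=None, the extra chunk has no choices.
stream = [
    {"choices": [{"index": 0, "delta": {"content": "Hi"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}},
]

print(final_usage(stream))
```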

chatCompletionChoiceLogProbs

Log probability information for the choice.

Name	Type	Description	Required	Default
content	array	A list of message content tokens with log probability information.	Yes
refusal	array	A list of message refusal tokens with log probability information.	No

chatCompletionTokenLogprob

Name	Type	Description	Required
bytes	array	A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be `null` if there is no bytes representation for the token.	Yes
logprob	number	The log probability of this token.	Yes
token	string	The token.	Yes
top_logprobs	array	List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested `top_logprobs` returned.	Yes
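
The `bytes` field matters when one character is split across tokens: the byte lists have to be joined before decoding. The sketch below shows this along with converting `logprob` to a probability. The sample entries are invented, and the `token` strings in them are placeholders.

```python
import math


def tokens_to_text(entries: list) -> str:
    """Join the UTF-8 bytes of all tokens, then decode once."""
    data = bytearray()
    for entry in entries:
        if entry["bytes"] is not None:  # bytes can be null for some tokens
            data.extend(entry["bytes"])
    return data.decode("utf-8")


def probability(entry: dict) -> float:
    """Convert a log probability to a linear probability."""
    return math.exp(entry["logprob"])


# Invented sample: the last two tokens together form one 4-byte character.
entries = [
    {"token": "Hi", "logprob": 0.0, "bytes": [72, 105], "top_logprobs": []},
    {"token": "<part 1>", "logprob": -0.5, "bytes": [240, 159], "top_logprobs": []},
    {"token": "<part 2>", "logprob": -0.1, "bytes": [152, 128], "top_logprobs": []},
]

print(tokens_to_text(entries))
print(probability(entries[0]))
```

Decoding each token's bytes separately would fail for the last two entries, which is why the bytes are combined first.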

chatCompletionResponseMessage

A chat completion message generated by the model.

Name	Type	Description	Required
audio	object	If the audio output modality is requested, this object contains data about the audio response from the model.	No
└─ data	string	Base64 encoded audio bytes generated by the model, in the format specified in the request.	No
└─ expires_at	integer	The Unix timestamp (in seconds) for when this audio response will no longer be accessible on the server for use in multi-turn conversations.	No
└─ id	string	Unique identifier for this audio response.	No
└─ transcript	string	Transcript of the audio generated by the model.	No
content	string	The contents of the message.	Yes
context	azureChatExtensionsMessageContext	A representation of the additional context information available when Azure OpenAI chat extensions are involved in the generation of a corresponding chat completions response. This context information is only populated when using an Azure OpenAI request configured to use a matching extension.	No
function_call	chatCompletionFunctionCall	Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model.	No
refusal	string	The refusal message generated by the model.	Yes
role	chatCompletionResponseMessageRole	The role of the author of the response message.	Yes
tool_calls	array	The tool calls generated by the model, such as function calls.	No

chatCompletionResponseMessageRole

The role of the author of the response message.

Property	Value
Description	The role of the author of the response message.
Type	string
Values	`assistant`

chatCompletionToolChoiceOption

Controls which (if any) tool is called by the model. `none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools. Specifying a particular tool via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool. `none` is the default when no tools are present. `auto` is the default if tools are present.

This component can be one of the following:

chatCompletionNamedToolChoice

chatCompletionNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific function.

Name	Type	Description	Required
function	object		Yes
└─ name	string	The name of the function to call.	No
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: `function`	Yes
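
As a small illustration, the two shapes a tool choice can take (a mode string or a named tool object) could be produced by one helper. `make_tool_choice` is a hypothetical name, and `my_function` is the placeholder function name used in the description above.

```python
MODES = ("none", "auto", "required")


def make_tool_choice(value: str):
    """Return a mode string unchanged, or wrap a function name as a named tool choice."""
    if value in MODES:
        return value
    return {"type": "function", "function": {"name": value}}


print(make_tool_choice("auto"))
print(make_tool_choice("my_function"))
```

One consequence of this design is that a function literally named `none`, `auto`, or `required` cannot be selected through the helper.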

ParallelToolCalls

Whether to enable parallel function calling during tool use.

No properties defined for this component.

PredictionContent

Static predicted output content, such as the content of a text file that is being regenerated.

Name	Type	Description	Required	Default
content	string or array	The content that should be matched when generating a model response. If generated tokens would match this content, the entire model response can be returned much more quickly.	Yes
type	enum	The type of the predicted content you want to provide. This type is currently always `content`. Possible values: `content`	Yes

chatCompletionMessageToolCalls

The tool calls generated by the model, such as function calls.

No properties defined for this component.

ChatCompletionModalities

Output types that you would like the model to generate for this request. Most models are capable of generating text, which is the default:

["text"]

The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:

["text", "audio"]

No properties defined for this component.

chatCompletionFunctionCall

Deprecated and replaced by tool_calls. The name and arguments of a function that should be called, as generated by the model.

Name	Type	Description	Required	Default
arguments	string	The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.	Yes
name	string	The name of the function to call.	Yes

completionUsage

Usage statistics for the completion request.

Name	Type	Description	Required
completion_tokens	integer	Number of tokens in the generated completion.	Yes
completion_tokens_details	object	Breakdown of tokens used in a completion.	No
└─ accepted_prediction_tokens	integer	When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.	No
└─ audio_tokens	integer	Audio input tokens generated by the model.	No
└─ reasoning_tokens	integer	Tokens generated by the model for reasoning.	No
└─ rejected_prediction_tokens	integer	When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.	No
prompt_tokens	integer	Number of tokens in the prompt.	Yes
prompt_tokens_details	object	Details of the prompt tokens.	No
└─ audio_tokens	integer	Audio input tokens present in the prompt.	No
└─ cached_tokens	integer	The number of cached prompt tokens.	No
total_tokens	integer	Total number of tokens used in the request (prompt + completion).	Yes
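
The relationship between the fields can be sketched as follows, with `total_tokens` equal to prompt plus completion tokens and the optional detail objects read defensively. The helper `summarize_usage` and the numbers in the sample are invented.

```python
def summarize_usage(usage: dict) -> dict:
    """Check the token totals and pull out a few optional detail counters."""
    completion_details = usage.get("completion_tokens_details") or {}
    prompt_details = usage.get("prompt_tokens_details") or {}
    return {
        "total_matches": usage["total_tokens"]
        == usage["prompt_tokens"] + usage["completion_tokens"],
        "reasoning_tokens": completion_details.get("reasoning_tokens", 0),
        "cached_prompt_tokens": prompt_details.get("cached_tokens", 0),
    }


# Invented sample shaped like completionUsage.
usage = {
    "prompt_tokens": 12,
    "completion_tokens": 30,
    "total_tokens": 42,
    "completion_tokens_details": {"reasoning_tokens": 10},
    "prompt_tokens_details": {"cached_tokens": 4},
}

print(summarize_usage(usage))
```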

chatCompletionTool

Name	Type	Description	Required	Default
function	FunctionObject		Yes
type	enum	The type of the tool. Currently, only `function` is supported. Possible values: `function`	Yes

FunctionParameters

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.

No properties defined for this component.

FunctionObject

Name	Type	Description	Required	Default
description	string	A description of what the function does, used by the model to choose when and how to call the function.	No
name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	Yes
parameters	FunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. Omitting `parameters` defines a function with an empty parameter list.	No
strict	boolean	Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the `parameters` field. Only a subset of JSON Schema is supported when `strict` is `true`.	No	False
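
The naming rule and the optional fields can be illustrated with a small builder that returns a `chatCompletionTool` object. This is a sketch only; `make_function_tool` is a hypothetical helper, and the regular expression is one reading of the stated rule (letters, digits, underscores, and dashes, up to 64 characters).

```python
import re

# One interpretation of the documented naming rule.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def make_function_tool(name, description=None, parameters=None, strict=False):
    """Build a tool definition, validating the function name first."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid function name: {name!r}")
    function = {"name": name}
    if description is not None:
        function["description"] = description
    if parameters is not None:  # omitting it means an empty parameter list
        function["parameters"] = parameters
    if strict:
        function["strict"] = True
    return {"type": "function", "function": function}


tool = make_function_tool(
    "get_weather",
    description="Look up the weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
print(tool)
```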

ResponseFormatText

Name	Type	Description	Required	Default
type	enum	The type of response format being defined: `text` Possible values: `text`	Yes

ResponseFormatJsonObject

Name	Type	Description	Required	Default
type	enum	The type of response format being defined: `json_object` Possible values: `json_object`	Yes

ResponseFormatJsonSchemaSchema

The schema for the response format, described as a JSON Schema object.

No properties defined for this component.

ResponseFormatJsonSchema

Name	Type	Description	Required	Default
json_schema	object		Yes
└─ description	string	A description of what the response format is for, used by the model to determine how to respond in the format.	No
└─ name	string	The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	No
└─ schema	ResponseFormatJsonSchemaSchema	The schema for the response format, described as a JSON Schema object.	No
└─ strict	boolean	Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is `true`.	No	False
type	enum	The type of response format being defined: `json_schema` Possible values: `json_schema`	Yes
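
A `ResponseFormatJsonSchema` value might be put together as shown below. The helper name, the schema contents, and the format name `weather_report` are all invented for illustration; only the field layout comes from the table above.

```python
def make_json_schema_format(name, schema, description=None, strict=False):
    """Build a response format object of type json_schema."""
    json_schema = {"name": name, "schema": schema, "strict": strict}
    if description is not None:
        json_schema["description"] = description
    return {"type": "json_schema", "json_schema": json_schema}


response_format = make_json_schema_format(
    "weather_report",  # hypothetical format name
    schema={
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "temperature_c": {"type": "number"},
        },
        "required": ["city", "temperature_c"],
        "additionalProperties": False,
    },
    description="A short structured weather report.",
    strict=True,
)
print(response_format)
```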

chatCompletionChoiceCommon

Name	Type	Description	Required	Default
finish_reason	string		No
index	integer		No

createTranslationRequest

Translation request.

Name	Type	Description	Required	Default
file	string	The audio file to translate.	Yes
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0

audioResponse

Translation or transcription response when `response_format` is `json`.

Name	Type	Description	Required	Default
text	string	Translated or transcribed text.	Yes

audioVerboseResponse

Translation or transcription response when `response_format` is `verbose_json`.

Name	Type	Description	Required
duration	number	Duration.	No
language	string	Language.	No
segments	array		No
task	string	Type of audio task.	No
text	string	Translated or transcribed text.	Yes
words	array		No

audioResponseFormat

Defines the format of the output.

Property	Value
Description	Defines the format of the output.
Type	string
Values	`json` `text` `srt` `verbose_json` `vtt`

createTranscriptionRequest

Transcription request.

Name	Type	Description	Required	Default
file	string	The audio file object to transcribe.	Yes
language	string	The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.	No
prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.	No
response_format	audioResponseFormat	Defines the format of the output.	No
temperature	number	The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.	No	0
timestamp_granularities[]	array	The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.	No	['segment']
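
The non-file fields of a transcription request could be prepared as a list of form field pairs, with `timestamp_granularities[]` repeated once per value. This is a sketch under assumptions: `transcription_form_fields` is a hypothetical helper, the audio `file` part is left out, and how the pairs are sent depends on the HTTP client you use.

```python
ALLOWED_GRANULARITIES = {"word", "segment"}


def transcription_form_fields(response_format="json", granularities=None,
                              language=None, temperature=0):
    """Return (name, value) form fields, excluding the audio file itself."""
    granularities = list(granularities or [])
    unknown = set(granularities) - ALLOWED_GRANULARITIES
    if unknown:
        raise ValueError(f"unsupported granularities: {sorted(unknown)}")
    if granularities and response_format != "verbose_json":
        raise ValueError("timestamp granularities require verbose_json")
    fields = [("response_format", response_format), ("temperature", str(temperature))]
    if language is not None:
        fields.append(("language", language))  # ISO-639-1 code, for example "en"
    for granularity in granularities:
        fields.append(("timestamp_granularities[]", granularity))
    return fields


print(transcription_form_fields("verbose_json", ["word", "segment"], language="en"))
```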

audioSegment

Transcription or translation segment.

Name	Type	Description	Required
avg_logprob	number	Average log probability.	No
compression_ratio	number	Compression ratio.	No
end	number	Segment end offset.	No
id	integer	Segment identifier.	No
no_speech_prob	number	Probability of 'no speech'.	No
seek	number	Offset of the segment.	No
start	number	Segment start offset.	No
temperature	number	Temperature.	No
text	string	Segment text.	No
tokens	array	Tokens of the text.	No

audioWord

Transcription or translation word.

Name	Type	Description	Required
end	number	Word end offset.	No
start	number	Word start offset.	No
word	string	Word	No

createSpeechRequest

Speech request.

Name	Type	Description	Required	Default
input	string	The text to synthesize audio for. The maximum length is 4096 characters.	Yes
response_format	enum	The format to synthesize the audio in. Possible values: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`	No
speed	number	The speed of the synthesized audio. Select a value from `0.25` to `4.0`. `1.0` is the default.	No	1.0
voice	enum	The voice to use for speech synthesis. Possible values: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`	Yes
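
The constraints in the table (input length, speed range, and the enumerated voices and formats) can be checked before sending a request. The following is a minimal sketch; `make_speech_request` is a hypothetical helper, and the checks mirror only what the table states.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def make_speech_request(text, voice, response_format=None, speed=1.0):
    """Validate the documented limits and return a request body."""
    if len(text) > 4096:
        raise ValueError("input exceeds 4096 characters")
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    if response_format is not None and response_format not in FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    body = {"input": text, "voice": voice, "speed": speed}
    if response_format is not None:
        body["response_format"] = response_format
    return body


print(make_speech_request("Hello", "alloy", response_format="mp3"))
```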

imageQuality

The quality of the image that will be generated.

Property	Value
Description	The quality of the image that will be generated.
Type	string
Default	auto
Values	`auto` `high` `medium` `low` `hd` `standard`

imagesResponseFormat

The format in which the generated images are returned.

Property	Value
Description	The format in which the generated images are returned.
Type	string
Default	url
Values	`url` `b64_json`

imagesOutputFormat

The file format in which the generated images are returned. Only supported for gpt-image-1.

Property	Value
Description	The file format in which the generated images are returned. Only supported for gpt-image-1.
Type	string
Default	png
Values	`png` `jpeg`

imageSize

The size of the generated images.

Property	Value
Description	The size of the generated images.
Type	string
Default	auto
Values	`auto` `1792x1024` `1024x1792` `1024x1024` `1024x1536` `1536x1024`

imageStyle

The style of the generated images. Only supported for dall-e-3.

Property	Value
Description	The style of the generated images. Only supported for dall-e-3.
Type	string
Default	vivid
Values	`vivid` `natural`

imageBackground

Sets the background transparency of the generated image(s). This parameter is only supported for gpt-image-1.

Property	Value
Description	Sets the background transparency of the generated image(s). This parameter is only supported for gpt-image-1.
Type	string
Default	auto
Values	`transparent` `opaque` `auto`

imageGenerationsRequest

Name	Type	Description	Required	Default
background	imageBackground	Sets the background transparency of the generated image(s). This parameter is only supported for gpt-image-1.	No	auto
n	integer	The number of images to generate. For dall-e-3, only n=1 is supported.	No	1
output_compression	integer	The compression level (0-100%) for the generated images. This parameter is only supported for gpt-image-1 with the jpeg output format.	No	100
output_format	imagesOutputFormat	The file format in which the generated images are returned. Only supported for gpt-image-1.	No	png
prompt	string	A text description of the desired image(s). The maximum length is 32000 characters for gpt-image-1 and 4000 characters for dall-e-3	Yes
quality	imageQuality	The quality of the image that will be generated.	No	auto
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
size	imageSize	The size of the generated images.	No	auto
style	imageStyle	The style of the generated images. Only supported for dall-e-3.	No	vivid
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No
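
Several fields in imageGenerationsRequest apply to only one model. The sketch below collects those documented restrictions into a client-side check. The request body has no model field in this table, so the `model` argument is a hypothetical parameter standing in for whichever model your deployment uses, and the message strings are illustrative.

```python
PROMPT_LIMITS = {"gpt-image-1": 32000, "dall-e-3": 4000}
GPT_IMAGE_ONLY = ("background", "output_format", "output_compression")


def check_image_request(model, body):
    """Return a list of problems found in an imageGenerationsRequest body."""
    problems = []
    prompt = body.get("prompt")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > PROMPT_LIMITS[model]:
        problems.append(f"prompt exceeds {PROMPT_LIMITS[model]} characters")
    if model == "dall-e-3":
        if body.get("n", 1) != 1:
            problems.append("dall-e-3 supports only n=1")
        for field in GPT_IMAGE_ONLY:
            if field in body:
                problems.append(f"{field} is only supported for gpt-image-1")
    if model == "gpt-image-1" and "style" in body:
        problems.append("style is only supported for dall-e-3")
    if "output_compression" in body and body.get("output_format") != "jpeg":
        problems.append("output_compression requires output_format=jpeg")
    return problems


print(check_image_request("dall-e-3", {"prompt": "a lighthouse at dusk", "n": 2}))
```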

imageEditsRequest

Name	Type	Description	Required	Default
image	string or array	The image(s) to edit. Must be a supported image file or an array of images. Each image should be a png or jpg file smaller than 25MB.	Yes
mask	string	An additional image whose fully transparent areas (e.g., where alpha is zero) indicate where the image should be edited. If there are multiple images provided, the mask will be applied to the first image. Must be a valid PNG file, less than 4MB, and have the same dimensions as the image.	No
n	integer	The number of images to generate.	No	1
prompt	string	A text description of the desired image(s). The maximum length is 32000 characters.	Yes
quality	imageQuality	The quality of the image that will be generated.	No	auto
response_format	imagesResponseFormat	The format in which the generated images are returned.	No	url
size	imageSize	The size of the generated images.	No	auto
user	string	A unique identifier representing your end-user, which can help to monitor and detect abuse.	No

generateImagesResponse

Name	Type	Description	Required
created	integer	The unix timestamp when the operation was created.	Yes
data	array	The result data of the operation, if successful	Yes
usage	imageGenerationsUsage	Represents token usage details for image generation requests. Only for gpt-image-1.	No

imageResult

The image url or encoded image if successful, and an error otherwise.

Name	Type	Description	Required
b64_json	string	The base64 encoded image	No
content_filter_results	dalleContentFilterResults	Information about the content filtering results.	No
prompt_filter_results	dalleFilterResults	Information about the content filtering category (hate, sexual, violence, self_harm), if it has been detected, as well as the severity level (very_low, low, medium, or high, a scale that determines the intensity and risk level of harmful content) and whether it has been filtered. Information about jailbreak content and profanity, if it has been detected, and whether it has been filtered. Information about the customer block list, if it has been filtered, and its id.	No
revised_prompt	string	The prompt that was used to generate the image, if there was any revision to the prompt.	No
url	string	The image url.	No
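
Because an imageResult carries either `b64_json` or `url` depending on the requested response format, client code typically branches on which field is present. A minimal sketch, assuming the result has already been parsed into a dictionary (the function name and return shape are illustrative):

```python
import base64


def image_bytes_or_url(result):
    """Return ("bytes", data) for a base64 result or ("url", link) for a URL result."""
    if result.get("b64_json"):
        return ("bytes", base64.b64decode(result["b64_json"]))
    if result.get("url"):
        return ("url", result["url"])
    raise ValueError("imageResult has neither b64_json nor url")


# Hypothetical result containing the base64 encoding of the bytes b"abc".
kind, payload = image_bytes_or_url({"b64_json": "YWJj"})
print(kind, payload)
```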

imageGenerationsUsage

Represents token usage details for image generation requests. Only for gpt-image-1.

Name	Type	Description	Required
input_tokens	integer	The number of input tokens.	No
input_tokens_details	object	A detailed breakdown of the input tokens.	No
└─ image_tokens	integer	The number of image tokens.	No
└─ text_tokens	integer	The number of text tokens.	No
output_tokens	integer	The number of output tokens.	No
total_tokens	integer	The total number of tokens used.	No

line

A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.

Name	Type	Description	Required	Default
spans	array	An array of spans that represent detected objects and their bounding box information.	Yes
text	string		Yes

span

A span object that represents a detected object and its bounding box information.

Name	Type	Description	Required
length	integer	The length of the span in characters, measured in Unicode codepoints.	Yes
offset	integer	The character offset within the text where the span begins. This offset is defined as the position of the first character of the span, counting from the start of the text as Unicode codepoints.	Yes
polygon	array	An array of objects representing points in the polygon that encloses the detected object.	Yes
text	string	The text content of the span that represents the detected object.	Yes
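
The `offset` and `length` values are counted in Unicode codepoints, which matters when the text contains characters outside the Basic Multilingual Plane. Python strings are indexed by codepoint, so a span can be sliced directly; environments that index strings by UTF-16 code units need a conversion. The sketch below shows both, using a hypothetical sample string and hypothetical helper names.

```python
def span_text(text, offset, length):
    """Slice a span out of text using codepoint offset and length."""
    return text[offset:offset + length]


def utf16_offset(text, offset):
    """Convert a codepoint offset into a UTF-16 code unit offset."""
    return len(text[:offset].encode("utf-16-le")) // 2


# Hypothetical sample: the emoji is one codepoint but two UTF-16 code units.
sample = "na\u00efve \U0001F600 caf\u00e9"

print(span_text(sample, 8, 4))   # the last word of the sample
print(utf16_offset(sample, 8))   # one larger than the codepoint offset
```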

runCompletionUsage

Usage statistics related to the run. This value will be null if the run isn't in a terminal state (for example, in_progress or queued).

Name	Type	Description	Required
completion_tokens	integer	Number of completion tokens used over the course of the run.	Yes
prompt_tokens	integer	Number of prompt tokens used over the course of the run.	Yes
total_tokens	integer	Total number of tokens used (prompt + completion).	Yes

runStepCompletionUsage

Usage statistics related to the run step. This value will be null while the run step's status is in_progress.

Name	Type	Description	Required
completion_tokens	integer	Number of completion tokens used over the course of the run step.	Yes
prompt_tokens	integer	Number of prompt tokens used over the course of the run step.	Yes
total_tokens	integer	Total number of tokens used (prompt + completion).	Yes

assistantsApiResponseFormatOption

Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.

Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

This component can be one of the following:

ResponseFormatText
ResponseFormatJsonObject
ResponseFormatJsonSchema
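
The two JSON-producing options can be sketched as small Python helpers. The outer `type` and `json_schema` keys come from the description above; the inner `name`, `schema`, and `strict` keys are an assumption about the ResponseFormatJsonSchema shape, so check them against that schema. `mentions_json` is a hypothetical heuristic for the JSON mode requirement that a system or user message asks for JSON.

```python
def json_mode_format():
    """Response format that enables JSON mode."""
    return {"type": "json_object"}


def structured_output_format(name, schema):
    """Response format that enables Structured Outputs (inner keys are assumed)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def mentions_json(messages):
    """Heuristic: does any system or user message mention JSON?"""
    return any(
        m.get("role") in ("system", "user") and "json" in str(m.get("content", "")).lower()
        for m in messages
    )


messages = [{"role": "user", "content": "List three colors as a JSON array."}]
if mentions_json(messages):
    print(json_mode_format())
```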

assistantsApiResponseFormat

An object describing the expected output of the model. If json_object, only function type tools are allowed to be passed to the Run. If text, the model can return text or any value needed.

Name	Type	Description	Required	Default
type	string	Must be one of `text` or `json_object`.	No	text

type Enum: AssistantsApiResponseFormat

Value	Description
text
json_object

assistantObject

Represents an assistant that can call the model and use tools.

Name	Type	Description	Required	Default
created_at	integer	The Unix timestamp (in seconds) for when the assistant was created.	Yes
description	string	The description of the assistant. The maximum length is 512 characters.	Yes
id	string	The identifier, which can be referenced in API endpoints.	Yes
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
model	string	ID of the model to use.	Yes
name	string	The name of the assistant. The maximum length is 256 characters.	Yes
object	string	The object type, which is always `assistant`.	Yes
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	Yes	[]
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1

object Enum: AssistantObjectType

Value	Description
assistant	The object type, which is always assistant

createAssistantRequest

Name	Type	Description	Required	Default
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 256,000 characters.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string		Yes
name	string	The name of the assistant. The maximum length is 256 characters.	No
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
└─ vector_stores	array	A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	No	[]
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
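
The length limits in createAssistantRequest can be enforced before the request is sent. The sketch below is one way to do that. It reads "Set of 16 key-value pairs" as an upper bound of 16, which is an interpretation, and the function names and the deployment name in the usage line are hypothetical.

```python
FIELD_LIMITS = {"name": 256, "description": 512, "instructions": 256000}


def validate_metadata(metadata):
    """Check metadata against the documented pair count and length limits."""
    if len(metadata) > 16:
        raise ValueError("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            raise ValueError(f"metadata key too long: {key[:20]}")
        if len(str(value)) > 512:
            raise ValueError(f"metadata value too long for key: {key}")
    return metadata


def build_create_assistant_request(model, metadata=None, **text_fields):
    """Build a createAssistantRequest body from name, description, and instructions."""
    body = {"model": model}
    for field, value in text_fields.items():
        if field not in FIELD_LIMITS:
            raise ValueError(f"unexpected field: {field}")
        if value is not None:
            if len(value) > FIELD_LIMITS[field]:
                raise ValueError(f"{field} exceeds {FIELD_LIMITS[field]} characters")
            body[field] = value
    if metadata is not None:
        body["metadata"] = validate_metadata(metadata)
    return body


print(build_create_assistant_request("my-deployment", name="Helper", metadata={"team": "docs"}))
```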

modifyAssistantRequest

Name	Type	Description	Required	Default
description	string	The description of the assistant. The maximum length is 512 characters.	No
instructions	string	The system instructions that the assistant uses. The maximum length is 32768 characters.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string		No
name	string	The name of the assistant. The maximum length is 256 characters.	No
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	Overrides the list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	Overrides the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `file_search`, or `function`.	No	[]
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1

deleteAssistantResponse

Name	Type	Required
deleted	boolean	Yes
id	string	Yes
object	string	Yes

object Enum: DeleteAssistantResponseState

Value	Description
assistant.deleted

listAssistantsResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

assistantToolsCode

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `code_interpreter`	Yes

type Enum: assistantToolsCodeType

Value	Description
code_interpreter

assistantToolsFileSearch

Name	Type	Description	Required
file_search	object	Overrides for the file search tool.	No
└─ max_num_results	integer	The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive. Note that the file search tool may output fewer than `max_num_results` results.	No
type	string	The type of tool being defined: `file_search`	Yes

type Enum: assistantToolsFileSearchType

Value	Description
file_search

assistantToolsFileSearchTypeOnly

Name	Type	Description	Required	Default
type	string	The type of tool being defined: `file_search`	Yes

type Enum: assistantToolsFileSearchType

Value	Description
file_search

assistantToolsFunction

Name	Type	Description	Required
function	object	The function definition.	Yes
└─ description	string	A description of what the function does, used by the model to choose when and how to call the function.	No
└─ name	string	The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	No
└─ parameters	chatCompletionFunctionParameters	The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.	No
type	string	The type of tool being defined: `function`	Yes

type Enum: assistantToolsFunction

Value	Description
function

truncationObject

Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

Name	Type	Description	Required	Default
last_messages	integer	The number of most recent messages from the thread to include when constructing the context for the run.	No
type	string	The truncation strategy to use for the thread. The default is `auto`. If set to `last_messages`, the thread will be truncated to the n most recent messages in the thread. When set to `auto`, messages in the middle of the thread will be dropped to fit the context length of the model, `max_prompt_tokens`.	Yes

type Enum: TruncationType

Value	Description
auto
last_messages
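
To make the `last_messages` strategy concrete, the sketch below builds a truncationObject and applies the equivalent "keep the n most recent messages" rule to a local list. The truncation itself happens on the service; `apply_last_messages` is only a hypothetical local model of the documented behavior, and both function names are illustrative.

```python
def truncation_strategy(kind="auto", last_messages=None):
    """Build a truncationObject for either documented strategy."""
    if kind == "auto":
        return {"type": "auto"}
    if kind == "last_messages":
        if last_messages is None or last_messages < 1:
            raise ValueError("last_messages must be a positive integer")
        return {"type": "last_messages", "last_messages": last_messages}
    raise ValueError(f"unknown truncation type: {kind}")


def apply_last_messages(thread_messages, n):
    """Local model of last_messages: keep only the n most recent messages."""
    return thread_messages[-n:]


thread = ["m1", "m2", "m3", "m4", "m5"]
strategy = truncation_strategy("last_messages", last_messages=2)
print(strategy, apply_last_messages(thread, strategy["last_messages"]))
```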

assistantsApiToolChoiceOption

Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.

This component can be one of the following:

assistantsNamedToolChoice

assistantsNamedToolChoice

Specifies a tool the model should use. Use to force the model to call a specific tool.

Name	Type	Description	Required
function	object		No
└─ name	string	The name of the function to call.	No
type	string	The type of the tool. If type is `function`, the function name must be set.	Yes

type Enum: AssistantsNamedToolChoiceType

Value	Description
function
code_interpreter
file_search
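
A named tool choice is a small object whose shape depends on the tool type. As a minimal sketch of the rule that a function choice must carry a function name, with a hypothetical helper name:

```python
NAMED_TOOL_TYPES = {"function", "code_interpreter", "file_search"}


def named_tool_choice(tool_type, function_name=None):
    """Build an assistantsNamedToolChoice object for the given tool type."""
    if tool_type not in NAMED_TOOL_TYPES:
        raise ValueError(f"unknown tool type: {tool_type}")
    if tool_type == "function":
        if not function_name:
            raise ValueError("function name must be set when type is function")
        return {"type": "function", "function": {"name": function_name}}
    return {"type": tool_type}


print(named_tool_choice("file_search"))
print(named_tool_choice("function", "my_function"))
```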

runObject

Represents an execution run on a thread.

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant used for execution of this run.	Yes
cancelled_at	integer	The Unix timestamp (in seconds) for when the run was cancelled.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the run was completed.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the run was created.	Yes
expires_at	integer	The Unix timestamp (in seconds) for when the run will expire.	Yes
failed_at	integer	The Unix timestamp (in seconds) for when the run failed.	Yes
id	string	The identifier, which can be referenced in API endpoints.	Yes
incomplete_details	object	Details on why the run is incomplete. Will be `null` if the run isn't incomplete.	Yes
└─ reason	string	The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run.	No
instructions	string	The instructions that the assistant used for this run.	Yes
last_error	object	The last error associated with this run. Will be `null` if there are no errors.	Yes
└─ code	string	One of `server_error` or `rate_limit_exceeded`.	No
└─ message	string	A human-readable description of the error.	No
max_completion_tokens	integer	The maximum number of completion tokens that was specified for use over the course of the run.	Yes
max_prompt_tokens	integer	The maximum number of prompt tokens that was specified for use over the course of the run.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
model	string	The model that the assistant used for this run.	Yes
object	string	The object type, which is always `thread.run`.	Yes
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
required_action	object	Details on the action required to continue the run. Will be `null` if no action is required.	Yes
└─ submit_tool_outputs	object	Details on the tool outputs needed for this run to continue.	No
└─ tool_calls	array	A list of the relevant tool calls.	No
└─ type	enum	For now, this is always `submit_tool_outputs`. Possible values: `submit_tool_outputs`	No
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	Yes
started_at	integer	The Unix timestamp (in seconds) for when the run was started.	Yes
status	string	The status of the run, which can be one of `queued`, `in_progress`, `requires_action`, `cancelling`, `cancelled`, `failed`, `completed`, or `expired`.	Yes
temperature	number	The sampling temperature used for this run. If not set, defaults to 1.	No
thread_id	string	The ID of the thread that was executed on as part of this run.	Yes
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model will not call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	Yes
tools	array	The list of tools that the assistant used for this run.	Yes	[]
top_p	number	The nucleus sampling value used for this run. If not set, defaults to 1.	No
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	Yes
usage	runCompletionUsage	Usage statistics related to the run. This value will be `null` if the run isn't in a terminal state (for example, `in_progress` or `queued`).	Yes

object Enum: runObjectType

Value	Description
thread.run	The run object type which is always thread.run

status Enum: RunObjectStatus

Value	Description
queued	The queued state
in_progress	The in_progress state
requires_action	The requires_action state
cancelling	The cancelling state
cancelled	The cancelled state
failed	The failed state
completed	The completed state
expired	The expired state
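
Client code that polls a run usually maps each status to a next action. The grouping below is one reading of the states listed above, not something the reference spells out: `requires_action` needs tool outputs, the transitional states mean "keep waiting", and the rest are treated as terminal. `incomplete` does not appear in this enum but is described under `max_completion_tokens` and `max_prompt_tokens`, so the sketch treats it as terminal too.

```python
WAITING_STATES = {"queued", "in_progress", "cancelling"}
TERMINAL_STATES = {"cancelled", "failed", "completed", "expired", "incomplete"}


def next_step(status):
    """Map a run status to a hypothetical client action."""
    if status == "requires_action":
        return "submit_tool_outputs"
    if status in WAITING_STATES:
        return "wait"
    if status in TERMINAL_STATES:
        return "done"
    raise ValueError(f"unknown run status: {status}")


for status in ("queued", "requires_action", "completed"):
    print(status, "->", next_step(status))
```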

createRunRequest

Name	Type	Description	Required	Default
additional_instructions	string	Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.	No
additional_messages	array	Adds additional messages to the thread before creating the run.	No
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string	The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model will not call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
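
Since only `assistant_id` is required, a createRunRequest body can be assembled from whichever overrides are present. The following sketch omits unset fields, checks the documented temperature range, and surfaces the recommendation against changing both `temperature` and `top_p`. The function name, the returned notes list, and the `asst_123` identifier are hypothetical.

```python
def build_create_run_request(assistant_id, **overrides):
    """Return (body, notes) for a createRunRequest, skipping overrides set to None."""
    body = {"assistant_id": assistant_id}
    body.update({key: value for key, value in overrides.items() if value is not None})
    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    notes = []
    if "temperature" in body and "top_p" in body:
        notes.append("temperature and top_p are both set; altering only one is recommended")
    return body, notes


body, notes = build_create_run_request(
    "asst_123",
    max_completion_tokens=500,
    temperature=0.2,
    model=None,  # left unset so the assistant's model is used
)
print(body, notes)
```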

listRunsResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

modifyRunRequest

Name	Type	Description	Required	Default
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

submitToolOutputsRunRequest

Name	Type	Description	Required	Default
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
tool_outputs	array	A list of tools for which the outputs are being submitted.	Yes

runToolCallObject

Tool call objects

Name	Type	Description	Required
function	object	The function definition.	Yes
└─ arguments	string	The arguments that the model expects you to pass to the function.	No
└─ name	string	The name of the function.	No
id	string	The ID of the tool call. This ID must be referenced when you submit the tool outputs using the submit tool outputs to run endpoint.	Yes
type	string	The type of tool call the output is required for. For now, this is always `function`.	Yes

type Enum: RunToolCallObjectType

Value	Description
function
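
The sketch below shows one way to turn tool call objects into the `tool_outputs` array of a submitToolOutputsRunRequest. It assumes each output item has the fields `tool_call_id` and `output`, which this section does not define. The `registry` mapping, the `build_tool_outputs` helper, and the `get_weather` function are hypothetical.

```python
import json
from typing import Any, Callable


def build_tool_outputs(
    tool_calls: list[dict], registry: dict[str, Callable[..., Any]]
) -> list[dict]:
    """Run each requested function locally and collect its output."""
    outputs = []
    for call in tool_calls:
        if call["type"] != "function":
            continue  # only `function` tool calls are described here
        function = call["function"]
        # `arguments` arrives as a JSON-encoded string.
        arguments = json.loads(function["arguments"] or "{}")
        result = registry[function["name"]](**arguments)
        outputs.append(
            {
                "tool_call_id": call["id"],  # assumed field name
                "output": json.dumps(result),  # assumed field name
            }
        )
    return outputs


def get_weather(city: str) -> dict:
    """Hypothetical local function the model asked to call."""
    return {"city": city, "forecast": "sunny"}


calls = [
    {
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'},
    }
]
tool_outputs = build_tool_outputs(calls, {"get_weather": get_weather})
```

Copying `id` from each tool call into the output is what lets the service match a result to the call that requested it.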

createThreadAndRunRequest

Name	Type	Description	Required	Default
assistant_id	string	The ID of the assistant to use to execute this run.	Yes
instructions	string	Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis.	No
max_completion_tokens	integer	The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
max_prompt_tokens	integer	The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status `incomplete`. See `incomplete_details` for more info.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
model	string	The ID of the model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.	No
parallel_tool_calls	ParallelToolCalls	Whether to enable parallel function calling during tool use.	No	True
response_format	assistantsApiResponseFormatOption	Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since `gpt-3.5-turbo-1106`. Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs which ensure the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to `{ "type": "json_object" }` enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.	No
stream	boolean	If `true`, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a `data: [DONE]` message.	No
stream_options	chatCompletionStreamOptions	Options for streaming response. Only set this when you set `stream: true`.	No	None
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.	No	1
thread	createThreadRequest		No
tool_choice	assistantsApiToolChoiceOption	Controls which (if any) tool is called by the model. `none` means the model will not call any tools and instead generates a message. `auto` is the default value and means the model can pick between generating a message or calling a tool. Specifying a particular tool like `{"type": "file_search"}` or `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that tool.	No
tool_resources	object	A set of resources that are used by the assistant's tools. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The ID of the vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.	No
tools	array	Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.	No
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both.	No	1
truncation_strategy	truncationObject	Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.	No
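
As an illustration of how these fields fit together, the following sketch assembles a createThreadAndRunRequest body and enforces two constraints from the table: `temperature` stays between 0 and 2, and `stream_options` is set only when streaming is enabled. The helper name `build_thread_and_run_body`, the assistant ID, and the message text are hypothetical. Sending the request is omitted.

```python
from typing import Optional


def build_thread_and_run_body(
    assistant_id: str,
    user_message: str,
    *,
    temperature: Optional[float] = None,
    stream: bool = False,
    stream_options: Optional[dict] = None,
) -> dict:
    """Assemble a request body, rejecting combinations the table rules out."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if stream_options is not None and not stream:
        raise ValueError("stream_options may only be set when stream is true")
    body: dict = {
        "assistant_id": assistant_id,
        # The new thread starts with a single user message.
        "thread": {"messages": [{"role": "user", "content": user_message}]},
    }
    if temperature is not None:
        body["temperature"] = temperature
    if stream:
        body["stream"] = True
    if stream_options is not None:
        body["stream_options"] = stream_options
    return body


body = build_thread_and_run_body(
    "asst_example", "Summarize the attached report.", temperature=0.2
)
```

Optional fields are left out of the body when unset so that the documented defaults apply.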

threadObject

Represents a thread that contains messages.

Name	Type	Description	Required	Default
created_at	integer	The Unix timestamp (in seconds) for when the thread was created.	Yes
id	string	The identifier, which can be referenced in API endpoints.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
object	string	The object type, which is always `thread`.	Yes
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	Yes
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.	No

object Enum: ThreadObjectType

Value	Description
thread	The type of thread object which is always `thread`

createThreadRequest

Name	Type	Description	Required	Default
messages	array	A list of messages to start the thread with.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.	No
└─ vector_stores	array	A helper to create a vector store with file_ids and attach it to this thread. There can be a maximum of 1 vector store attached to the thread.	No
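
The limits on `tool_resources` (at most 20 files for `code_interpreter` and at most 1 vector store for `file_search`) can be enforced when the object is built. This is a minimal sketch; `build_tool_resources` and the example IDs are hypothetical.

```python
from typing import Sequence

MAX_CODE_INTERPRETER_FILES = 20
MAX_VECTOR_STORES = 1


def build_tool_resources(
    file_ids: Sequence[str] = (), vector_store_ids: Sequence[str] = ()
) -> dict:
    """Build a tool_resources object, enforcing the documented maximums."""
    if len(file_ids) > MAX_CODE_INTERPRETER_FILES:
        raise ValueError("code_interpreter accepts at most 20 file IDs")
    if len(vector_store_ids) > MAX_VECTOR_STORES:
        raise ValueError("file_search accepts at most 1 vector store ID")
    resources: dict = {}
    if file_ids:
        resources["code_interpreter"] = {"file_ids": list(file_ids)}
    if vector_store_ids:
        resources["file_search"] = {"vector_store_ids": list(vector_store_ids)}
    return resources


thread_request = {
    "messages": [{"role": "user", "content": "Plot the data in the attached CSV."}],
    "tool_resources": build_tool_resources(file_ids=["file-abc"]),
}
```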

modifyThreadRequest

Name	Type	Description	Required	Default
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
tool_resources	object	A set of resources that are made available to the assistant's tools in this thread. The resources are specific to the type of tool. For example, the `code_interpreter` tool requires a list of file IDs, while the `file_search` tool requires a list of vector store IDs.	No
└─ code_interpreter	object		No
└─ file_ids	array	A list of file IDs made available to the `code_interpreter` tool. There can be a maximum of 20 files associated with the tool.	No	[]
└─ file_search	object		No
└─ vector_store_ids	array	The vector store attached to this thread. There can be a maximum of 1 vector store attached to the thread.	No

deleteThreadResponse

Name	Type	Required
deleted	boolean	Yes
id	string	Yes
object	string	Yes

object Enum: DeleteThreadResponseObjectState

Value	Description
thread.deleted	The delete thread response object state which is `thread.deleted`

listThreadsResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

messageObject

Represents a message within a thread.

Name	Type	Description	Required
assistant_id	string	If applicable, the ID of the assistant that authored this message.	Yes
attachments	array	A list of files attached to the message, and the tools they were added to.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the message was completed.	Yes
content	array	The content of the message, as an array of text and/or images.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the message was created.	Yes
id	string	The identifier, which can be referenced in API endpoints.	Yes
incomplete_at	integer	The Unix timestamp (in seconds) for when the message was marked as incomplete.	Yes
incomplete_details	object	On an incomplete message, details about why the message is incomplete.	Yes
└─ reason	string	The reason the message is incomplete.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
object	string	The object type, which is always `thread.message`.	Yes
role	string	The entity that produced the message. One of `user` or `assistant`.	Yes
run_id	string	If applicable, the ID of the run associated with the authoring of this message.	Yes
status	string	The status of the message, which can be either `in_progress`, `incomplete`, or `completed`.	Yes
thread_id	string	The ID of the thread that this message belongs to.	Yes

object Enum: MessageObjectType

Value	Description
thread.message	The message object type which is `thread.message`

status Enum: MessageObjectStatus

Value	Description
in_progress
incomplete
completed

role Enum: MessageObjectRole

Value	Description
user
assistant

messageDeltaObject

Represents a message delta, that is, any fields on a message that changed during streaming.

Name	Type	Description	Required
delta	object	The delta containing the fields that have changed on the Message.	Yes
└─ content	array	The content of the message, as an array of text and/or images.	No
└─ role	string	The entity that produced the message. One of `user` or `assistant`.	No
id	string	The identifier of the message, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.message.delta`.	Yes

object Enum: MessageDeltaObjectType

Value	Description
thread.message.delta

createMessageRequest

Name	Type	Description	Required
attachments	array	A list of files attached to the message, and the tools they should be added to.	No
content	string	The content of the message.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
role	string	The role of the entity that is creating the message. Allowed values include: - `user`: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages. - `assistant`: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.	Yes

role Enum: CreateMessageRequestRole

Value	Description
user
assistant

modifyMessageRequest

Name	Type	Description	Required	Default
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No

deleteMessageResponse

Name	Type	Required
deleted	boolean	Yes
id	string	Yes
object	string	Yes

object Enum: DeleteMessageResponseObject

Value	Description
thread.message.deleted	The delete message response object state

listMessagesResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

messageContentImageFileObject

References an image File in the content of a message.

Name	Type	Description	Required
image_file	object		Yes
└─ file_id	string	The File ID of the image in the message content.	No
type	string	Always `image_file`.	Yes

type Enum: MessageContentImageFileObjectType

Value	Description
image_file	The message content image file type

messageContentTextObject

The text content that is part of a message.

Name	Type	Description	Required
text	object		Yes
└─ annotations	array		No
└─ value	string	The data that makes up the text.	No
type	string	Always `text`.	Yes

type Enum: messageContentTextObjectType

Value	Description
text	The message content text object type

messageContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name	Type	Description	Required
end_index	integer		Yes
file_citation	object		Yes
└─ file_id	string	The ID of the specific File the citation is from.	No
start_index	integer		Yes
text	string	The text in the message content that needs to be replaced.	Yes
type	string	Always `file_citation`.	Yes

type Enum: FileCitationObjectType

Value	Description
file_citation	The file citation object type

messageContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name	Type	Description	Required
end_index	integer		Yes
file_path	object		Yes
└─ file_id	string	The ID of the file that was generated.	No
start_index	integer		Yes
text	string	The text in the message content that needs to be replaced.	Yes
type	string	Always `file_path`.	Yes

type Enum: FilePathObjectType

Value	Description
file_path	The file path object type
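
Both annotation types mark a span of the message text "that needs to be replaced". The sketch below substitutes each span with a numbered marker and collects a footnote per annotation. It assumes `start_index` and `end_index` are character offsets into the text `value`, with `end_index` exclusive, which the tables above do not state. `apply_annotations` and the sample data are hypothetical.

```python
def apply_annotations(value: str, annotations: list[dict]) -> tuple[str, list[str]]:
    """Replace annotated spans with [n] markers and return matching footnotes."""
    ordered = sorted(annotations, key=lambda a: a["start_index"])
    footnotes = []
    for number, annotation in enumerate(ordered, start=1):
        # The nested object is named after the annotation type.
        file_id = annotation[annotation["type"]]["file_id"]
        footnotes.append(f"[{number}] {annotation['type']}: {file_id}")
    # Replace from the end so earlier offsets stay valid.
    for number, annotation in reversed(list(enumerate(ordered, start=1))):
        start, end = annotation["start_index"], annotation["end_index"]
        value = value[:start] + f"[{number}]" + value[end:]
    return value, footnotes


text = "Revenue grew 10% [src]. Download sandbox:/chart.png"
cite, path = "[src]", "sandbox:/chart.png"
annotations = [
    {
        "type": "file_path",
        "text": path,
        "start_index": text.index(path),
        "end_index": text.index(path) + len(path),
        "file_path": {"file_id": "file-2"},
    },
    {
        "type": "file_citation",
        "text": cite,
        "start_index": text.index(cite),
        "end_index": text.index(cite) + len(cite),
        "file_citation": {"file_id": "file-1"},
    },
]
rendered, notes = apply_annotations(text, annotations)
```

Replacing spans from right to left avoids recomputing offsets after each substitution.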

messageDeltaContentImageFileObject

References an image File in the content of a message.

Name	Type	Description	Required
image_file	object		No
└─ file_id	string	The File ID of the image in the message content.	No
index	integer	The index of the content part in the message.	Yes
type	string	Always `image_file`.	Yes

type Enum: MessageDeltaContentImageFileObjectType

Value	Description
image_file

messageDeltaContentTextObject

The text content that is part of a message.

Name	Type	Description	Required
index	integer	The index of the content part in the message.	Yes
text	object		No
└─ annotations	array		No
└─ value	string	The data that makes up the text.	No
type	string	Always `text`.	Yes

type Enum: MessageDeltaContentTextObjectType

Value	Description
text
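
During streaming, a client typically rebuilds the message text from successive deltas. The sketch below concatenates text fragments per content part `index`. It assumes each delta carries a fragment to append rather than the full text so far. `accumulate_text` and the sample events are hypothetical.

```python
def accumulate_text(events: list[dict]) -> dict[int, str]:
    """Concatenate streamed text fragments, grouped by content part index."""
    parts: dict[int, str] = {}
    for event in events:
        for part in event["delta"].get("content", []):
            if part["type"] != "text":
                continue  # image_file parts carry no text
            fragment = part.get("text", {}).get("value", "")
            parts[part["index"]] = parts.get(part["index"], "") + fragment
    return parts


# Hypothetical stream of messageDeltaObject payloads.
events = [
    {
        "id": "msg_1",
        "object": "thread.message.delta",
        "delta": {"content": [{"index": 0, "type": "text", "text": {"value": "Hel"}}]},
    },
    {
        "id": "msg_1",
        "object": "thread.message.delta",
        "delta": {"content": [{"index": 0, "type": "text", "text": {"value": "lo"}}]},
    },
    {"id": "msg_1", "object": "thread.message.delta", "delta": {"role": "assistant"}},
]
assembled = accumulate_text(events)
```

Because `text` and `value` are optional on a delta, the lookup falls back to an empty fragment instead of raising.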

messageDeltaContentTextAnnotationsFileCitationObject

A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "file_search" tool to search files.

Name	Type	Description	Required
end_index	integer		No
file_citation	object		No
└─ file_id	string	The ID of the specific File the citation is from.	No
└─ quote	string	The specific quote in the file.	No
index	integer	The index of the annotation in the text content part.	Yes
start_index	integer		No
text	string	The text in the message content that needs to be replaced.	No
type	string	Always `file_citation`.	Yes

type Enum: MessageDeltaContentTextAnnotationsFileCitationObjectType

Value	Description
file_citation

messageDeltaContentTextAnnotationsFilePathObject

A URL for the file that's generated when the assistant used the code_interpreter tool to generate a file.

Name	Type	Description	Required
end_index	integer		No
file_path	object		No
└─ file_id	string	The ID of the file that was generated.	No
index	integer	The index of the annotation in the text content part.	Yes
start_index	integer		No
text	string	The text in the message content that needs to be replaced.	No
type	string	Always `file_path`.	Yes

type Enum: MessageDeltaContentTextAnnotationsFilePathObjectType

Value	Description
file_path

runStepObject

Represents a step in execution of a run.

Name	Type	Description	Required
assistant_id	string	The ID of the assistant associated with the run step.	Yes
cancelled_at	integer	The Unix timestamp (in seconds) for when the run step was cancelled.	Yes
completed_at	integer	The Unix timestamp (in seconds) for when the run step completed.	Yes
created_at	integer	The Unix timestamp (in seconds) for when the run step was created.	Yes
expired_at	integer	The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired.	Yes
failed_at	integer	The Unix timestamp (in seconds) for when the run step failed.	Yes
id	string	The identifier of the run step, which can be referenced in API endpoints.	Yes
last_error	object	The last error associated with this run step. Will be `null` if there are no errors.	Yes
└─ code	string	One of `server_error` or `rate_limit_exceeded`.	No
└─ message	string	A human-readable description of the error.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
object	string	The object type, which is always `assistant.run.step`.	Yes
run_id	string	The ID of the run that this run step is a part of.	Yes
status	string	The status of the run step, which can be either `in_progress`, `cancelled`, `failed`, `completed`, or `expired`.	Yes
step_details	runStepDetailsMessageCreationObject or runStepDetailsToolCallsObject	The details of the run step.	Yes
thread_id	string	The ID of the thread that was run.	Yes
type	string	The type of run step, which can be either `message_creation` or `tool_calls`.	Yes

object Enum: RunStepObjectType

Value	Description
assistant.run.step	The object type, which is always `assistant.run.step`

type Enum: RunStepObjectType

Value	Description
message_creation	The message_creation run step
tool_calls	The tool_calls run step

status Enum: RunStepObjectStatus

Value	Description
in_progress	The in_progress run status
cancelled	The cancelled run status
failed	The failed run status
completed	The completed run status
expired	The expired run status

runStepDeltaObject

Represents a run step delta, that is, any fields on a run step that changed during streaming.

Name	Type	Description	Required
delta	object	The delta containing the fields that have changed on the run step.	Yes
└─ step_details	runStepDeltaStepDetailsMessageCreationObject or runStepDeltaStepDetailsToolCallsObject	The details of the run step.	No
id	string	The identifier of the run step, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `thread.run.step.delta`.	Yes

object Enum: RunStepDeltaObjectType

Value	Description
thread.run.step.delta

listRunStepsResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

runStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name	Type	Description	Required
message_creation	object		Yes
└─ message_id	string	The ID of the message that was created by this run step.	No
type	string	Always `message_creation`.	Yes

type Enum: RunStepDetailsMessageCreationObjectType

Value	Description
message_creation

runStepDeltaStepDetailsMessageCreationObject

Details of the message creation by the run step.

Name	Type	Description	Required
message_creation	object		No
└─ message_id	string	The ID of the message that was created by this run step.	No
type	string	Always `message_creation`.	Yes

type Enum: RunStepDeltaStepDetailsMessageCreationObjectType

Value	Description
message_creation

runStepDetailsToolCallsObject

Details of the tool call.

Name	Type	Description	Required	Default
tool_calls	array	An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `file_search` or `function`.	Yes
type	string	Always `tool_calls`.	Yes

type Enum: RunStepDetailsToolCallsObjectType

Value	Description
tool_calls

runStepDeltaStepDetailsToolCallsObject

Details of the tool call.

Name	Type	Description	Required	Default
tool_calls	array	An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `file_search` or `function`.	No
type	string	Always `tool_calls`.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsObjectType

Value	Description
tool_calls

runStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name	Type	Description	Required
code_interpreter	object	The Code Interpreter tool call definition.	Yes
└─ input	string	The input to the Code Interpreter tool call.	No
└─ outputs	array	The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type.	No
id	string	The ID of the tool call.	Yes
type	string	The type of tool call. This is always going to be `code_interpreter` for this type of tool call.	Yes

type Enum: RunStepDetailsToolCallsCodeObjectType

Value	Description
code_interpreter

runStepDeltaStepDetailsToolCallsCodeObject

Details of the Code Interpreter tool call the run step was involved in.

Name	Type	Description	Required
code_interpreter	object	The Code Interpreter tool call definition.	No
└─ input	string	The input to the Code Interpreter tool call.	No
└─ outputs	array	The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type.	No
id	string	The ID of the tool call.	No
index	integer	The index of the tool call in the tool calls array.	Yes
type	string	The type of tool call. This is always going to be `code_interpreter` for this type of tool call.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeObjectType

Value	Description
code_interpreter

runStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name	Type	Description	Required	Default
logs	string	The text output from the Code Interpreter tool call.	Yes
type	string	Always `logs`.	Yes

type Enum: RunStepDetailsToolCallsCodeOutputLogsObjectType

Value	Description
logs

runStepDeltaStepDetailsToolCallsCodeOutputLogsObject

Text output from the Code Interpreter tool call as part of a run step.

Name	Type	Description	Required
index	integer	The index of the output in the outputs array.	Yes
logs	string	The text output from the Code Interpreter tool call.	No
type	string	Always `logs`.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputLogsObjectType

Value	Description
logs

runStepDetailsToolCallsCodeOutputImageObject

Name	Type	Description	Required
image	object		Yes
└─ file_id	string	The File ID of the image.	No
type	string	Always `image`.	Yes

type Enum: RunStepDetailsToolCallsCodeOutputImageObjectType

Value	Description
image

runStepDeltaStepDetailsToolCallsCodeOutputImageObject

Name	Type	Description	Required
image	object		No
└─ file_id	string	The file ID of the image.	No
index	integer	The index of the output in the outputs array.	Yes
type	string	Always `image`.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsCodeOutputImageObject

Value	Description
image
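
Code Interpreter outputs are either `logs` or `image` objects, so a client usually branches on `type`. The sketch below separates the two kinds. `split_code_outputs` and the sample data are hypothetical.

```python
def split_code_outputs(outputs: list[dict]) -> tuple[str, list[str]]:
    """Return the concatenated log text and the list of image file IDs."""
    log_chunks: list[str] = []
    image_file_ids: list[str] = []
    for output in outputs:
        if output["type"] == "logs":
            log_chunks.append(output["logs"])
        elif output["type"] == "image":
            image_file_ids.append(output["image"]["file_id"])
    return "".join(log_chunks), image_file_ids


# Hypothetical `outputs` array from a code_interpreter tool call.
outputs = [
    {"type": "logs", "logs": "rows loaded: 120\n"},
    {"type": "image", "image": {"file_id": "file-plot"}},
    {"type": "logs", "logs": "done\n"},
]
logs, image_ids = split_code_outputs(outputs)
```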

runStepDetailsToolCallsFileSearchObject

Name	Type	Description	Required
file_search	object	For now, this is always going to be an empty object.	Yes
└─ results	array	The results of the file search.	No
id	string	The ID of the tool call object.	Yes
type	string	The type of tool call. This is always going to be `file_search` for this type of tool call.	Yes

type Enum: RunStepDetailsToolCallsFileSearchObjectType

Value	Description
file_search

runStepDetailsToolCallsFileSearchResultObject

A result instance of the file search.

Name	Type	Description	Required
content	array	The content of the result that was found. The content is only included if requested via the include query parameter.	No
file_id	string	The ID of the file that result was found in.	Yes
file_name	string	The name of the file that result was found in.	Yes
score	number	The score of the result. All values must be a floating point number between 0 and 1.	Yes

runStepDeltaStepDetailsToolCallsFileSearchObject

Name	Type	Description	Required
file_search	object	For now, this is always going to be an empty object.	Yes
id	string	The ID of the tool call object.	No
index	integer	The index of the tool call in the tool calls array.	Yes
type	string	The type of tool call. This is always going to be `file_search` for this type of tool call.	Yes

type Enum: RunStepDeltaStepDetailsToolCallsFileSearchObjectType

Value	Description
file_search

runStepDetailsToolCallsFunctionObject

Name	Type	Description	Required
function	object	The definition of the function that was called.	Yes
└─ arguments	string	The arguments passed to the function.	No
└─ name	string	The name of the function.	No
└─ output	string	The output of the function. This will be `null` if the outputs have not been submitted yet.	No
id	string	The ID of the tool call object.	Yes
type	string	The type of tool call. This is always going to be `function` for this type of tool call.	Yes

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value	Description
function

runStepDeltaStepDetailsToolCallsFunctionObject

Name	Type	Description	Required
function	object	The definition of the function that was called.	No
└─ arguments	string	The arguments passed to the function.	No
└─ name	string	The name of the function.	No
└─ output	string	The output of the function. This will be `null` if the outputs have not been submitted yet.	No
id	string	The ID of the tool call object.	No
index	integer	The index of the tool call in the tool calls array.	Yes
type	string	The type of tool call. This is always going to be `function` for this type of tool call.	Yes

type Enum: RunStepDetailsToolCallsFunctionObjectType

Value	Description
function

vectorStoreExpirationAfter

The expiration policy for a vector store.

Name	Type	Description	Required	Default
anchor	string	Anchor timestamp after which the expiration policy applies. Supported anchors: `last_active_at`.	Yes
days	integer	The number of days after the anchor time that the vector store will expire.	Yes

anchor Enum: VectorStoreExpirationAfterAnchor

Value	Description
last_active_at	The anchor timestamp after which the expiration policy applies.

vectorStoreObject

A vector store is a collection of processed files that can be used by the file_search tool.

Name	Type	Description	Required
created_at	integer	The Unix timestamp (in seconds) for when the vector store was created.	Yes
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
expires_at	integer	The Unix timestamp (in seconds) for when the vector store will expire.	No
file_counts	object		Yes
└─ cancelled	integer	The number of files that were cancelled.	No
└─ completed	integer	The number of files that have been successfully processed.	No
└─ failed	integer	The number of files that have failed to process.	No
└─ in_progress	integer	The number of files that are currently being processed.	No
└─ total	integer	The total number of files.	No
id	string	The identifier, which can be referenced in API endpoints.	Yes
last_active_at	integer	The Unix timestamp (in seconds) for when the vector store was last active.	Yes
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	Yes
name	string	The name of the vector store.	Yes
object	enum	The object type, which is always `vector_store`. Possible values: `vector_store`	Yes
status	string	The status of the vector store, which can be either `expired`, `in_progress`, or `completed`. A status of `completed` indicates that the vector store is ready for use.	Yes
usage_bytes	integer	The total number of bytes used by the files in the vector store.	Yes

status Enum: VectorStoreObjectStatus

Value	Description
expired
in_progress
completed
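
The `expires_after` policy and `last_active_at` together suggest how an expiry time could be estimated on the client. The sketch below assumes expiry equals the anchor timestamp plus `days` converted to seconds. The service-reported `expires_at` remains the authoritative value, and `estimate_expiry` is a hypothetical helper.

```python
from typing import Optional

SECONDS_PER_DAY = 86_400


def estimate_expiry(vector_store: dict) -> Optional[int]:
    """Estimate the Unix expiry time from the expiration policy, if one is set."""
    policy = vector_store.get("expires_after")
    if not policy:
        return None  # no expiration policy configured
    if policy["anchor"] != "last_active_at":
        raise ValueError(f"unsupported anchor: {policy['anchor']}")
    return vector_store["last_active_at"] + policy["days"] * SECONDS_PER_DAY


# Hypothetical vector store object with a 7-day policy.
store = {
    "id": "vs_example",
    "last_active_at": 1_700_000_000,
    "expires_after": {"anchor": "last_active_at", "days": 7},
}
estimated = estimate_expiry(store)
```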

createVectorStoreRequest

Name	Type	Description	Required
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is non-empty.	No
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
file_ids	array	A list of file IDs that the vector store should use. Useful for tools like `file_search` that can access files.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
name	string	The name of the vector store.	No

updateVectorStoreRequest

Name	Type	Description	Required
expires_after	vectorStoreExpirationAfter	The expiration policy for a vector store.	No
metadata	object	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.	No
name	string	The name of the vector store.	No

listVectorStoresResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

deleteVectorStoreResponse

Name	Type	Required
deleted	boolean	Yes
id	string	Yes
object	string	Yes

object Enum: DeleteVectorStoreResponseObject

Value	Description
vector_store.deleted	The delete vector store response object state

vectorStoreFileObject

A list of files attached to a vector store.

Name	Type	Description	Required
chunking_strategy	autoChunkingStrategyRequestParam or staticChunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy. Only applicable if `file_ids` is non-empty.	No
created_at	integer	The Unix timestamp (in seconds) for when the vector store file was created.	Yes
id	string	The identifier, which can be referenced in API endpoints.	Yes
last_error	object	The last error associated with this vector store file. Will be `null` if there are no errors.	Yes
└─ code	string	One of `server_error`, `invalid_file`, or `unsupported_file`.	No
└─ message	string	A human-readable description of the error.	No
object	string	The object type, which is always `vector_store.file`.	Yes
status	string	The status of the vector store file, which can be either `in_progress`, `completed`, `cancelled`, or `failed`. The status `completed` indicates that the vector store file is ready for use.	Yes
usage_bytes	integer	The total vector store usage in bytes. Note that this may be different from the original file size.	Yes
vector_store_id	string	The ID of the vector store that the file is attached to.	Yes

object Enum: VectorStoreFileObjectType

Value	Description
vector_store.file

status Enum: VectorStoreFileObjectStatus

Value	Description
in_progress
completed
cancelled
failed

otherChunkingStrategyResponseParam

This is returned when the chunking strategy is unknown. Typically, this is because the file was indexed before the chunking_strategy concept was introduced in the API.

Name	Type	Description	Required	Default
type	string	Always `other`.	Yes

type Enum: OtherChunkingStrategyResponseParamType

Value	Description
other

staticChunkingStrategyResponseParam

Name	Type	Description	Required	Default
static	staticChunkingStrategy		Yes
type	string	Always `static`.	Yes

type Enum: StaticChunkingStrategyResponseParamType

Value	Description
static

staticChunkingStrategy

Name	Type	Description	Required	Default
chunk_overlap_tokens	integer	The number of tokens that overlap between chunks. The default value is `400`. Note that the overlap must not exceed half of `max_chunk_size_tokens`.	Yes
max_chunk_size_tokens	integer	The maximum number of tokens in each chunk. The default value is `800`. The minimum value is `100` and the maximum value is `4096`.	Yes

autoChunkingStrategyRequestParam

The default strategy. This strategy currently uses a max_chunk_size_tokens of 800 and chunk_overlap_tokens of 400.

Name	Type	Description	Required	Default
type	enum	Always `auto`. Possible values: `auto`	Yes

staticChunkingStrategyRequestParam

Name	Type	Description	Required	Default
static	staticChunkingStrategy		Yes
type	enum	Always `static`. Possible values: `static`	Yes
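The limits described for staticChunkingStrategy can be checked on the client before a request is sent. The following is a minimal sketch under that assumption. The helper name `make_static_chunking_strategy` is hypothetical, and the service performs its own validation regardless.

```python
def make_static_chunking_strategy(max_chunk_size_tokens=800, chunk_overlap_tokens=400):
    """Build a staticChunkingStrategyRequestParam-shaped dict.

    Hypothetical helper; the limits mirror the descriptions in this reference.
    """
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096")
    if chunk_overlap_tokens < 0:
        raise ValueError("chunk_overlap_tokens must not be negative")
    # The overlap must not exceed half of max_chunk_size_tokens.
    if chunk_overlap_tokens * 2 > max_chunk_size_tokens:
        raise ValueError("chunk_overlap_tokens must not exceed half of max_chunk_size_tokens")
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }


default_strategy = make_static_chunking_strategy()
```

With no arguments the helper reproduces the documented defaults of 800 and 400 tokens.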

chunkingStrategyRequestParam

The chunking strategy used to chunk the file(s). If not set, will use the auto strategy.

This component can be one of the following:

autoChunkingStrategyRequestParam
staticChunkingStrategyRequestParam

createVectorStoreFileRequest

Name	Type	Description	Required	Default
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No
file_id	string	A File ID that the vector store should use. Useful for tools like `file_search` that can access files.	Yes

listVectorStoreFilesResponse

Name	Type	Required
data	array	Yes
first_id	string	Yes
has_more	boolean	Yes
last_id	string	Yes
object	string	Yes

deleteVectorStoreFileResponse

Name	Type	Required
deleted	boolean	Yes
id	string	Yes
object	string	Yes

object Enum: DeleteVectorStoreFileResponseObject

Value	Description
vector_store.file.deleted

vectorStoreFileBatchObject

A batch of files attached to a vector store.

Name	Type	Description	Required
created_at	integer	The Unix timestamp (in seconds) for when the vector store files batch was created.	Yes
file_counts	object		Yes
└─ cancelled	integer	The number of files that were cancelled.	No
└─ completed	integer	The number of files that have been processed.	No
└─ failed	integer	The number of files that have failed to process.	No
└─ in_progress	integer	The number of files that are currently being processed.	No
└─ total	integer	The total number of files.	No
id	string	The identifier, which can be referenced in API endpoints.	Yes
object	string	The object type, which is always `vector_store.file_batch`.	Yes
status	string	The status of the vector store files batch, which can be `in_progress`, `completed`, `cancelled`, or `failed`.	Yes
vector_store_id	string	The ID of the vector store that the File is attached to.	Yes

object Enum: VectorStoreFileBatchObjectType

Value	Description
vector_store.files_batch

status Enum: VectorStoreFileBatchObjectStatus

Value	Description
in_progress
completed
cancelled
failed

createVectorStoreFileBatchRequest

Name	Type	Description	Required	Default
chunking_strategy	chunkingStrategyRequestParam	The chunking strategy used to chunk the file(s). If not set, will use the `auto` strategy.	No
file_ids	array	A list of File IDs that the vector store should use. Useful for tools like `file_search` that can access files.	Yes

assistantStreamEvent

Represents an event emitted when streaming a Run.

Each event in a server-sent events stream has an event and data property:

event: thread.created
data: {"id": "thread_123", "object": "thread", ...}

We emit events whenever a new object is created, transitions to a new state, or is being streamed in parts (deltas). For example, we emit thread.run.created when a new run is created, thread.run.completed when a run completes, and so on. When an Assistant chooses to create a message during a run, we emit a thread.message.created event, a thread.message.in_progress event, many thread.message.delta events, and finally a thread.message.completed event.

We may add additional events over time, so we recommend handling unknown events gracefully in your code.

This component can be one of the following:

threadStreamEvent
runStreamEvent
runStepStreamEvent
messageStreamEvent
errorEvent
doneEvent
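The event and data framing described above, together with the advice to tolerate unknown events, can be sketched as follows. This is an illustrative parser only, not the service's or any SDK's implementation. The names `parse_sse_events` and `dispatch` are hypothetical, and the sketch assumes that a blank line separates events and that the data of every handled event is JSON.

```python
import json


def parse_sse_events(lines):
    """Yield (event, data) pairs from an iterable of decoded text lines."""
    event, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if event is not None or data_lines:
                yield event, "\n".join(data_lines)
            event, data_lines = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Simplification: surrounding whitespace is stripped.
            data_lines.append(line[len("data:"):].strip())
    if event is not None or data_lines:
        yield event, "\n".join(data_lines)


def dispatch(event, data, handlers):
    """Call the handler registered for an event; ignore unknown events."""
    handler = handlers.get(event)
    if handler is None:
        return None  # Unknown event: skip it instead of failing.
    return handler(json.loads(data))


sample = [
    "event: thread.created",
    'data: {"id": "thread_123", "object": "thread"}',
    "",
    "event: thread.future.unknown",  # invented name for an unrecognized event
    "data: {}",
    "",
]
handlers = {"thread.created": lambda payload: payload["id"]}
results = [dispatch(e, d, handlers) for e, d in parse_sse_events(sample)]
```

Routing through a handler table keeps the unknown-event path explicit: anything not registered returns `None` rather than raising.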

threadStreamEvent

This component can be one of the following:

thread.created

Occurs when a new thread is created.

Name	Type	Description	Required	Default
data	threadObject	Represents a thread that contains messages.	Yes
event	string		Yes

Data: threadObject

Event Enum: ThreadStreamEventEnum

Value	Description
thread.created	The thread created event

runStreamEvent

This component can be one of the following:

thread.run.created

Occurs when a new run is created.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventCreated

Value	Description
thread.run.created

thread.run.queued

Occurs when a run moves to a queued status.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventQueued

Value	Description
thread.run.queued

thread.run.in_progress

Occurs when a run moves to an in_progress status.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventInProgress

Value	Description
thread.run.in_progress

thread.run.requires_action

Occurs when a run moves to a requires_action status.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventRequiresAction

Value	Description
thread.run.requires_action

thread.run.completed

Occurs when a run is completed.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventCompleted

Value	Description
thread.run.completed

thread.run.failed

Occurs when a run fails.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventFailed

Value	Description
thread.run.failed

thread.run.cancelling

Occurs when a run moves to a cancelling status.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventCancelling

Value	Description
thread.run.cancelling

thread.run.cancelled

Occurs when a run is cancelled.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventCancelled

Value	Description
thread.run.cancelled

thread.run.expired

Occurs when a run expires.

Name	Type	Description	Required	Default
data	runObject	Represents an execution run on a thread.	Yes
event	string		Yes

Data: runObject

Event Enum: RunStreamEventExpired

Value	Description
thread.run.expired

runStepStreamEvent

This component can be one of the following:

thread.run.step.created

Occurs when a run step is created.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventCreated

Value	Description
thread.run.step.created

thread.run.step.in_progress

Occurs when a run step moves to an in_progress state.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventInProgress

Value	Description
thread.run.step.in_progress

thread.run.step.delta

Occurs when parts of a run step are being streamed.

Name	Type	Description	Required	Default
data	runStepDeltaObject	Represents a run step delta, i.e., any changed fields on a run step during streaming.	Yes
event	string		Yes

Data: runStepDeltaObject

Event Enum: RunStepStreamEventDelta

Value	Description
thread.run.step.delta

thread.run.step.completed

Occurs when a run step is completed.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventCompleted

Value	Description
thread.run.step.completed

thread.run.step.failed

Occurs when a run step fails.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventFailed

Value	Description
thread.run.step.failed

thread.run.step.cancelled

Occurs when a run step is cancelled.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventCancelled

Value	Description
thread.run.step.cancelled

thread.run.step.expired

Occurs when a run step expires.

Name	Type	Description	Required	Default
data	runStepObject	Represents a step in execution of a run.	Yes
event	string		Yes

Data: runStepObject

Event Enum: RunStepStreamEventExpired

Value	Description
thread.run.step.expired

messageStreamEvent

This component can be one of the following:

thread.message.created

Occurs when a message is created.

Name	Type	Description	Required	Default
data	messageObject	Represents a message within a thread.	Yes
event	string		Yes

Data: messageObject

Event Enum: MessageStreamEventCreated

Value	Description
thread.message.created

thread.message.in_progress

Occurs when a message moves to an in_progress state.

Name	Type	Description	Required	Default
data	messageObject	Represents a message within a thread.	Yes
event	string		Yes

Data: messageObject

Event Enum: MessageStreamEventInProgress

Value	Description
thread.message.in_progress

thread.message.delta

Occurs when parts of a message are being streamed.

Name	Type	Description	Required	Default
data	messageDeltaObject	Represents a message delta, i.e., any changed fields on a message during streaming.	Yes
event	string		Yes

Data: messageDeltaObject

Event Enum: MessageStreamEventDelta

Value	Description
thread.message.delta

thread.message.completed

Occurs when a message is completed.

Name	Type	Description	Required	Default
data	messageObject	Represents a message within a thread.	Yes
event	string		Yes

Data: messageObject

Event Enum: MessageStreamEventCompleted

Value	Description
thread.message.completed

thread.message.incomplete

Occurs when a message ends before it is completed.

Name	Type	Description	Required	Default
data	messageObject	Represents a message within a thread.	Yes
event	string		Yes

Data: messageObject

Event Enum: MessageStreamEventIncomplete

Value	Description
thread.message.incomplete

Annotation

This component can be one of the following:

FileCitation
UrlCitation
FilePath

Click

A click action.

Name	Type	Description	Required
button	enum	Indicates which mouse button was pressed during the click. One of `left`, `right`, `wheel`, `back`, or `forward`. Possible values: `left`, `right`, `wheel`, `back`, `forward`	Yes
type	enum	Specifies the event type. For a click action, this property is always set to `click`. Possible values: `click`	Yes
x	integer	The x-coordinate where the click occurred.	Yes
y	integer	The y-coordinate where the click occurred.	Yes

CodeInterpreterFileOutput

The output of a code interpreter tool call that is a file.

Name	Type	Description	Required	Default
files	array		Yes
type	enum	The type of the code interpreter file output. Always `files`. Possible values: `files`	Yes

CodeInterpreterTextOutput

The output of a code interpreter tool call that is text.

Name	Type	Description	Required	Default
logs	string	The logs of the code interpreter tool call.	Yes
type	enum	The type of the code interpreter text output. Always `logs`. Possible values: `logs`	Yes

CodeInterpreterTool

A tool that runs code.

Name	Type	Description	Required	Default
file_ids	array	The IDs of the files to run the code on.	Yes
type	enum	The type of the code interpreter tool. Always `code_interpreter`. Possible values: `code_interpreter`	Yes

CodeInterpreterToolCall

A tool call to run code.

Name	Type	Description	Required
code	string	The code to run.	Yes
id	string	The unique ID of the code interpreter tool call.	Yes
results	array	The results of the code interpreter tool call.	Yes
status	enum	The status of the code interpreter tool call. Possible values: `in_progress`, `interpreting`, `completed`	Yes
type	enum	The type of the code interpreter tool call. Always `code_interpreter_call`. Possible values: `code_interpreter_call`	Yes

CodeInterpreterToolOutput

This component can be one of the following:

CodeInterpreterTextOutput
CodeInterpreterFileOutput

ComparisonFilter

A filter used to compare a specified attribute key to a given value using a defined comparison operation.

Name	Type	Description	Required
key	string	The key to compare against the value.	Yes
type	enum	Specifies the comparison operator: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`. - `eq`: equals - `ne`: not equal - `gt`: greater than - `gte`: greater than or equal - `lt`: less than - `lte`: less than or equal Possible values: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`	Yes
value	string or number or boolean	The value to compare against the attribute key; supports string, number, or boolean types.	Yes

CompoundFilter

Combine multiple filters using `and` or `or`.

Name	Type	Description	Required	Default
filters	array	Array of filters to combine. Items can be `ComparisonFilter` or `CompoundFilter`.	Yes
type	enum	Type of operation: `and` or `or`. Possible values: `and`, `or`	Yes
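A nested filter built from ComparisonFilter and CompoundFilter objects might look like the following. The small `matches` evaluator is a hypothetical local illustration of the operator semantics listed above, not how the service evaluates filters. The attribute names are invented, and treating a missing attribute as a non-match is an assumption.

```python
import operator

# Comparison operators named in ComparisonFilter, mapped to Python equivalents.
_OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "gt": operator.gt, "gte": operator.ge,
    "lt": operator.lt, "lte": operator.le,
}


def matches(filter_, attributes):
    """Evaluate a ComparisonFilter or CompoundFilter against a dict of attributes."""
    kind = filter_["type"]
    if kind in ("and", "or"):
        results = (matches(f, attributes) for f in filter_["filters"])
        return all(results) if kind == "and" else any(results)
    if filter_["key"] not in attributes:
        return False  # Assumption: a missing attribute never matches.
    return _OPS[kind](attributes[filter_["key"]], filter_["value"])


# region == "emea" AND (year >= 2024 OR pinned == true)
example_filter = {
    "type": "and",
    "filters": [
        {"type": "eq", "key": "region", "value": "emea"},
        {"type": "or", "filters": [
            {"type": "gte", "key": "year", "value": 2024},
            {"type": "eq", "key": "pinned", "value": True},
        ]},
    ],
}
```

Because `filters` items can themselves be CompoundFilter objects, the evaluator recurses, which mirrors how the schema nests.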

ComputerAction

This component can be one of the following:

Click
DoubleClick
Drag
KeyPress
Move
Screenshot
Scroll
Type
Wait

ComputerScreenshotImage

A computer screenshot image used with the computer use tool.

Name	Type	Description	Required
file_id	string	The identifier of an uploaded file that contains the screenshot.	No
image_url	string	The URL of the screenshot image.	No
type	enum	Specifies the event type. For a computer screenshot, this property is always set to `computer_screenshot`. Possible values: `computer_screenshot`	Yes

ComputerTool

A tool that controls a virtual computer.

Name	Type	Description	Required
display_height	number	The height of the computer display.	Yes
display_width	number	The width of the computer display.	Yes
environment	enum	The type of computer environment to control. Possible values: `mac`, `windows`, `ubuntu`, `browser`	Yes
type	enum	The type of the computer use tool. Always `computer_use_preview`. Possible values: `computer_use_preview`	Yes

ComputerToolCall

A tool call to a computer use tool.

Name	Type	Description	Required
action	ComputerAction		Yes
call_id	string	An identifier used when responding to the tool call with output.	Yes
id	string	The unique ID of the computer call.	Yes
pending_safety_checks	array	The pending safety checks for the computer call.	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	Yes
type	enum	The type of the computer call. Always `computer_call`. Possible values: `computer_call`	Yes

ComputerToolCallOutput

The output of a computer tool call.

Name	Type	Description	Required
acknowledged_safety_checks	array	The safety checks reported by the API that have been acknowledged by the developer.	No
call_id	string	The ID of the computer tool call that produced the output.	Yes
id	string	The ID of the computer tool call output.	No
output	ComputerScreenshotImage	A computer screenshot image used with the computer use tool.	Yes
status	enum	The status of the message input. One of `in_progress`, `completed`, or `incomplete`. Populated when input items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the computer tool call output. Always `computer_call_output`. Possible values: `computer_call_output`	Yes

ComputerToolCallOutputResource

Name	Type	Description	Required
acknowledged_safety_checks	array	The safety checks reported by the API that have been acknowledged by the developer.	No
call_id	string	The ID of the computer tool call that produced the output.	Yes
id	string	The unique ID of the computer call tool output.	Yes
output	ComputerScreenshotImage	A computer screenshot image used with the computer use tool.	Yes
status	enum	The status of the message input. One of `in_progress`, `completed`, or `incomplete`. Populated when input items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the computer tool call output. Always `computer_call_output`. Possible values: `computer_call_output`	Yes

ComputerToolCallSafetyCheck

A pending safety check for the computer call.

Name	Type	Description	Required
code	string	The type of the pending safety check.	Yes
id	string	The ID of the pending safety check.	Yes
message	string	Details about the pending safety check.	Yes
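A ComputerToolCall and its ComputerToolCallOutput are linked through `call_id`. The following is a minimal sketch of building the output item after an action has been carried out. The helper name is hypothetical, and two details are assumptions rather than statements from this reference: that a base64 data URL is accepted for `image_url`, and that acknowledged safety checks are echoed back in the same shape as the pending ones.

```python
import base64


def build_computer_call_output(call, screenshot_png, acknowledge=False):
    """Build a ComputerToolCallOutput-shaped dict for a ComputerToolCall."""
    pending = call.get("pending_safety_checks", [])
    if pending and not acknowledge:
        # Do not continue silently; a developer should review these first.
        raise RuntimeError("pending safety checks must be acknowledged")
    encoded = base64.b64encode(screenshot_png).decode("ascii")
    item = {
        "type": "computer_call_output",
        "call_id": call["call_id"],
        "output": {
            "type": "computer_screenshot",
            # Assumption: a data URL is an acceptable image_url value.
            "image_url": f"data:image/png;base64,{encoded}",
        },
    }
    if pending:
        # Assumption: acknowledged checks mirror the pending check objects.
        item["acknowledged_safety_checks"] = pending
    return item


# Invented call item for illustration.
example_call = {
    "type": "computer_call", "id": "cu_1", "call_id": "call_1",
    "action": {"type": "click", "button": "left", "x": 10, "y": 20},
    "pending_safety_checks": [], "status": "completed",
}
example_output = build_computer_call_output(example_call, b"abc")
```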

Content

Multi-modal input and output contents.

This component can be one of the following:

InputContent
OutputContent

Coordinate

An x/y coordinate pair, e.g. { x: 100, y: 200 }.

Name	Type	Description	Required	Default
x	integer	The x-coordinate.	Yes
y	integer	The y-coordinate.	Yes

CreateModelResponseProperties

Name	Type	Description	Required	Default
metadata	Metadata	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.	No
model	string	Model used to generate the responses.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.	No

createResponse

Name	Type	Description	Required	Default
include	array	Specify additional output data to include in the model response. See `includable` for the currently supported values.	No
input	string or array	Text, image, or file inputs to the model, used to generate a response.	Yes
instructions	string	Inserts a system (or developer) message as the first item in the model's context. When used along with `previous_response_id`, the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses.	No
max_output_tokens	integer	An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.	No
parallel_tool_calls	boolean	Whether to allow the model to run tool calls in parallel.	No	True
previous_response_id	string	The unique ID of the previous response to the model. Use this to create multi-turn conversations.	No
reasoning	Reasoning	o-series models only. Configuration options for reasoning models.	No
store	boolean	Whether to store the generated model response for later retrieval via API.	No	True
stream	boolean	If set to true, the model response data will be streamed to the client as it is generated using server-sent events.	No	False
text	object	Configuration options for a text response from the model. Can be plain text or structured JSON data.	No
└─ format	TextResponseFormatConfiguration	An object specifying the format that the model must output. Configuring `{ "type": "json_schema" }` enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is `{ "type": "text" }` with no additional options. Not recommended for gpt-4o and newer models: Setting to `{ "type": "json_object" }` enables the older JSON mode, which ensures the message the model generates is valid JSON. Using `json_schema` is preferred for models that support it.	No
tool_choice	ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction	How the model should select which tool (or tools) to use when generating a response. See the `tools` parameter to see how to specify which tools the model can call.	No
tools	array	An array of tools the model may call while generating a response. You can specify which tool to use by setting the `tool_choice` parameter. The two categories of tools you can provide the model are: - Built-in tools	No
truncation	enum	The truncation strategy to use for the model response. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: `auto`, `disabled`	No
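As a rough illustration of how these fields combine across turns, the following sketch assembles request bodies for a two-turn conversation. The helper name, the deployment name `my-deployment`, and the response ID `resp_123` are hypothetical. The `model` field comes from CreateModelResponseProperties, and sending it alongside the createResponse fields is an assumption.

```python
def build_create_response_body(model, user_input, instructions=None,
                               previous_response_id=None, max_output_tokens=None):
    """Assemble a createResponse-shaped request body, omitting unset fields."""
    body = {"model": model, "input": user_input}
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    if max_output_tokens is not None:
        if max_output_tokens < 1:
            raise ValueError("max_output_tokens must be positive")
        body["max_output_tokens"] = max_output_tokens
    return body


first_turn = build_create_response_body(
    "my-deployment", "Summarize the report.",
    instructions="Answer in one sentence.",
)
# Instructions are not carried over from the previous response,
# so the second turn sends them again.
second_turn = build_create_response_body(
    "my-deployment", "Now translate it to French.",
    instructions="Answer in one sentence.",
    previous_response_id="resp_123",
)
```

Resending `instructions` on every turn is a design choice that follows from the note on `previous_response_id` in the table above.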

DoubleClick

A double click action.

Name	Type	Description	Required
type	enum	Specifies the event type. For a double click action, this property is always set to `double_click`. Possible values: `double_click`	Yes
x	integer	The x-coordinate where the double click occurred.	Yes
y	integer	The y-coordinate where the double click occurred.	Yes

Drag

A drag action.

Name	Type	Description	Required	Default
path	array	An array of coordinates representing the path of the drag action. Coordinates appear as an array of objects, e.g. `[{ x: 100, y: 200 }, { x: 200, y: 300 }]`.	Yes
type	enum	Specifies the event type. For a drag action, this property is always set to `drag`. Possible values: `drag`	Yes

EasyInputMessage

A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.

Name	Type	Description	Required
content	string or InputMessageContentList	Text, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.	Yes
role	enum	The role of the message input. One of `user`, `assistant`, `system`, or `developer`. Possible values: `user`, `assistant`, `system`, `developer`	Yes
type	enum	The type of the message input. Always `message`. Possible values: `message`	No

FileCitation

A citation to a file.

Name	Type	Description	Required
file_id	string	The ID of the file.	Yes
index	integer	The index of the file in the list of files.	Yes
type	enum	The type of the file citation. Always `file_citation`. Possible values: `file_citation`	Yes

FilePath

A path to a file.

Name	Type	Description	Required
file_id	string	The ID of the file.	Yes
index	integer	The index of the file in the list of files.	Yes
type	enum	The type of the file path. Always `file_path`. Possible values: `file_path`	Yes

FileSearchRanker

The ranker to use for the file search. If not specified will use the auto ranker.

Property	Value
Description	The ranker to use for the file search. If not specified will use the `auto` ranker.
Type	string
Values	`auto` `default_2024_08_21`

FileSearchTool

A tool that searches for relevant content from uploaded files.

Name	Type	Description	Required	Default
filters	ComparisonFilter or CompoundFilter	A filter to apply based on file attributes.	No
max_num_results	integer	The maximum number of results to return. This number should be between 1 and 50 inclusive.	No
ranking_options	object	Ranking options for search.	No
└─ ranker	enum	The ranker to use for the file search. Possible values: `auto`, `default-2024-11-15`	No
└─ score_threshold	number	The score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.	No	0
type	enum	The type of the file search tool. Always `file_search`. Possible values: `file_search`	Yes
vector_store_ids	array	The IDs of the vector stores to search.	Yes
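A FileSearchTool definition can be assembled and range-checked on the client. The following is a minimal sketch under that assumption. The helper name and the vector store ID are hypothetical, and the bounds are taken from the descriptions in the table above.

```python
def make_file_search_tool(vector_store_ids, max_num_results=None,
                          score_threshold=None, filters=None):
    """Build a FileSearchTool-shaped dict, checking the documented ranges."""
    if not vector_store_ids:
        raise ValueError("at least one vector store ID is required")
    tool = {"type": "file_search", "vector_store_ids": list(vector_store_ids)}
    if max_num_results is not None:
        if not 1 <= max_num_results <= 50:
            raise ValueError("max_num_results must be between 1 and 50 inclusive")
        tool["max_num_results"] = max_num_results
    if score_threshold is not None:
        if not 0 <= score_threshold <= 1:
            raise ValueError("score_threshold must be between 0 and 1")
        tool["ranking_options"] = {"score_threshold": score_threshold}
    if filters is not None:
        # A ComparisonFilter or CompoundFilter dict.
        tool["filters"] = filters
    return tool


search_tool = make_file_search_tool(["vs_123"], max_num_results=5, score_threshold=0.5)
```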

FileSearchToolCall

The results of a file search tool call.

Name	Type	Description	Required
id	string	The unique ID of the file search tool call.	Yes
queries	array	The queries used to search for files.	Yes
results	array	The results of the file search tool call.	No
status	enum	The status of the file search tool call. One of `in_progress`, `searching`, `completed`, `incomplete`, or `failed`. Possible values: `in_progress`, `searching`, `completed`, `incomplete`, `failed`	Yes
type	enum	The type of the file search tool call. Always `file_search_call`. Possible values: `file_search_call`	Yes

FunctionTool

Defines a function in your own code the model can choose to call.

Name	Type	Description	Required
description	string	A description of the function. Used by the model to determine whether or not to call the function.	No
name	string	The name of the function to call.	Yes
parameters	object	A JSON schema object describing the parameters of the function.	Yes
strict	boolean	Whether to enforce strict parameter validation. Default `true`.	Yes
type	enum	The type of the function tool. Always `function`. Possible values: `function`	Yes

FunctionToolCall

A tool call to run a function.

Name	Type	Description	Required
arguments	string	A JSON string of the arguments to pass to the function.	Yes
call_id	string	The unique ID of the function tool call generated by the model.	Yes
id	string	The unique ID of the function tool call.	Yes
name	string	The name of the function to run.	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the function tool call. Always `function_call`. Possible values: `function_call`	Yes

FunctionToolCallOutput

The output of a function tool call.

Name	Type	Description	Required
call_id	string	The unique ID of the function tool call generated by the model.	Yes
id	string	The unique ID of the function tool call output. Populated when this item is returned via API.	No
output	string	A JSON string of the output of the function tool call.	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the function tool call output. Always `function_call_output`. Possible values: `function_call_output`	Yes
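Since both `arguments` and `output` are JSON strings, handling a FunctionToolCall involves decoding, running local code, and encoding the result. The following is a minimal sketch of that round trip. The `run_function_call` helper, the registry, and the `get_weather` function are hypothetical.

```python
import json


def run_function_call(call, registry):
    """Turn a FunctionToolCall item into a FunctionToolCallOutput item."""
    func = registry[call["name"]]
    kwargs = json.loads(call["arguments"])  # arguments is a JSON string
    result = func(**kwargs)
    return {
        "type": "function_call_output",
        "call_id": call["call_id"],     # links the output to the model's call
        "output": json.dumps(result),   # output is also a JSON string
    }


def get_weather(city):
    """Hypothetical local function the model may ask for."""
    return {"city": city, "forecast": "sunny"}


# Invented FunctionToolCall item for illustration.
example_call = {
    "type": "function_call", "id": "fc_1", "call_id": "call_abc",
    "name": "get_weather", "arguments": '{"city": "Oslo"}',
}
example_output = run_function_call(example_call, {"get_weather": get_weather})
```

Note that the output item carries `call_id`, not the `id` of the tool call item.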

FunctionToolCallOutputResource

Name	Type	Description	Required
call_id	string	The unique ID of the function tool call generated by the model.	Yes
id	string	The unique ID of the function call tool output.	Yes
output	string	A JSON string of the output of the function tool call.	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the function tool call output. Always `function_call_output`. Possible values: `function_call_output`	Yes

includable

Specify additional output data to include in the model response. Currently supported values are:

file_search_call.results: Include the search results of the file search tool call.
message.input_image.image_url: Include image URLs from the input message.
computer_call_output.output.image_url: Include image URLs from the computer call output.

Property	Value
Description	Specify additional output data to include in the model response. Currently supported values are: - `file_search_call.results`: Include the search results of the file search tool call. - `message.input_image.image_url`: Include image URLs from the input message. - `computer_call_output.output.image_url`: Include image URLs from the computer call output.
Type	string
Values	`file_search_call.results` `message.input_image.image_url` `computer_call_output.output.image_url`

InputAudio

An audio input to the model.

Name	Type	Description	Required
data	string	Base64-encoded audio data.	Yes
format	enum	The format of the audio data. Currently supported formats are `mp3` and `wav`. Possible values: `mp3`, `wav`	Yes
type	enum	The type of the input item. Always `input_audio`. Possible values: `input_audio`	Yes

InputContent

This component can be one of the following:

InputText
InputImage
InputFile

InputFile

A file input to the model.

Name	Type	Description	Required
file_data	string	The content of the file to be sent to the model.	No
file_id	string	The ID of the file to be sent to the model.	No
filename	string	The name of the file to be sent to the model.	No
type	enum	The type of the input item. Always `input_file`. Possible values: `input_file`	Yes

InputImage

An image input to the model.

Name	Type	Description	Required
detail	enum	The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`. Possible values: `high`, `low`, `auto`	Yes
file_id	string	The ID of the file to be sent to the model.	No
image_url	string	The URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.	No
type	enum	The type of the input item. Always `input_image`. Possible values: `input_image`	Yes
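Because `image_url` accepts a base64 encoded image in a data URL, local image bytes can be wrapped into an InputImage item as in the following sketch. The helper name `make_input_image` is hypothetical, and the default MIME type of `image/png` is an assumption for illustration.

```python
import base64


def make_input_image(image_bytes, mime_type="image/png", detail="auto"):
    """Build an InputImage-shaped dict from raw image bytes."""
    if detail not in ("high", "low", "auto"):
        raise ValueError("detail must be one of 'high', 'low', or 'auto'")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "input_image",
        "detail": detail,
        "image_url": f"data:{mime_type};base64,{encoded}",
    }


image_item = make_input_image(b"abc")
```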

InputItem

This component can be one of the following:

EasyInputMessage
Item
ItemReference

InputMessage

Name	Type	Description	Required
content	InputMessageContentList	A list of one or many input items to the model, containing different content types.	Yes
role	enum	The role of the message input. One of `user`, `system`, or `developer`. Possible values: `user`, `system`, `developer`	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the message input. Always set to `message`. Possible values: `message`	No

InputMessageContentList

A list of one or many input items to the model, containing different content types.

No properties defined for this component.

InputMessageResource

Name	Type	Description	Required
content	InputMessageContentList	A list of one or many input items to the model, containing different content types.	Yes
id	string	The unique ID of the message input.	Yes
role	enum	The role of the message input. One of `user`, `system`, or `developer`. Possible values: `user`, `system`, `developer`	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the message input. Always set to `message`. Possible values: `message`	No

InputText

A text input to the model.

Name	Type	Description	Required	Default
text	string	The text input to the model.	Yes
type	enum	The type of the input item. Always `input_text`. Possible values: `input_text`	Yes

Item

Content item used to generate a response.

This component can be one of the following:

InputMessage
OutputMessage
FileSearchToolCall
ComputerToolCall
ComputerToolCallOutput
FunctionToolCall
FunctionToolCallOutput
ReasoningItem

ItemReference

An internal identifier for an item to reference.

Name	Type	Description	Required	Default
id	string	The ID of the item to reference.	Yes
type	enum	The type of item to reference. Always `item_reference`. Possible values: `item_reference`	Yes

ItemResource

Content item used to generate a response.

This component can be one of the following:

InputMessageResource
OutputMessage
FileSearchToolCall
ComputerToolCall
ComputerToolCallOutputResource
FunctionToolCall
FunctionToolCallOutputResource

KeyPress

A collection of keypresses the model would like to perform.

Name	Type	Description	Required	Default
keys	array	The combination of keys the model is requesting to be pressed. This is an array of strings, each representing a key.	Yes
type	enum	Specifies the event type. For a keypress action, this property is always set to `keypress`. Possible values: `keypress`	Yes

Metadata

Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.

No properties defined for this component.
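The length limits above can be checked on the client before sending a request. The sketch below is one way to do that. The function name `validate_metadata` is hypothetical, and the limit of 16 pairs comes from the `metadata` property description elsewhere in this reference, interpreted here as a maximum (an assumption).

```python
MAX_PAIRS = 16        # assumed to be an upper bound on the number of pairs
MAX_KEY_LEN = 64      # maximum key length in characters
MAX_VALUE_LEN = 512   # maximum value length in characters


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means the metadata looks valid."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if not isinstance(key, str) or len(key) > MAX_KEY_LEN:
            problems.append(f"invalid key: {key!r}")
        if not isinstance(value, str) or len(value) > MAX_VALUE_LEN:
            problems.append(f"invalid value for key: {key!r}")
    return problems
```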

ModelResponseProperties

Name	Type	Description	Required	Default
metadata	Metadata	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.	No
model	string	Model used to generate the responses.	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	No	1
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	No	1
user	string	A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.	No

Move

A mouse move action.

Name	Type	Description	Required
type	enum	Specifies the event type. For a move action, this property is always set to `move`. Possible values: `move`	Yes
x	integer	The x-coordinate to move to.	Yes
y	integer	The y-coordinate to move to.	Yes

OutputAudio

An audio output from the model.

Name	Type	Description	Required
data	string	Base64-encoded audio data from the model.	Yes
transcript	string	The transcript of the audio data from the model.	Yes
type	enum	The type of the output audio. Always `output_audio`. Possible values: `output_audio`	Yes

OutputContent

This component can be one of the following:

OutputText
Refusal

OutputItem

Name	Type	Description	Required
action	ComputerAction		Yes
arguments	string	A JSON string of the arguments to pass to the function.	Yes
call_id	string	An identifier used when responding to the tool call with output.	Yes
content	array	Reasoning text contents.	Yes
id	string	The unique identifier of the reasoning content.	Yes
name	string	The name of the function to run.	Yes
pending_safety_checks	array	The pending safety checks for the computer call.	Yes
queries	array	The queries used to search for files.	Yes
results	array	The results of the file search tool call.	No
role	enum	The role of the output message. Always `assistant`. Possible values: `assistant`	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	Yes
type	enum	The type of the object. Always `reasoning`. Possible values: `reasoning`	Yes

OutputMessage

An output message from the model.

Name	Type	Description	Required
content	array	The content of the output message.	Yes
id	string	The unique ID of the output message.	Yes
role	enum	The role of the output message. Always `assistant`. Possible values: `assistant`	Yes
status	enum	The status of the message input. One of `in_progress`, `completed`, or `incomplete`. Populated when input items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	Yes
type	enum	The type of the output message. Always `message`. Possible values: `message`	Yes

OutputText

A text output from the model.

Name	Type	Description	Required
annotations	array	The annotations of the text output.	Yes
text	string	The text output from the model.	Yes
type	enum	The type of the output text. Always `output_text`. Possible values: `output_text`	Yes

RealtimeSessionCreateRequest

Realtime session object configuration.

Name	Type	Description	Required	Default
input_audio_format	enum	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, input audio must be 16-bit PCM at a 24kHz sample rate, single channel (mono), and little-endian byte order. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
input_audio_noise_reduction	object	Configuration for input audio noise reduction. This can be set to `null` to turn off. Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model. Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.	No
└─ type	enum	Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones. Possible values: `near_field`, `far_field`	No
input_audio_transcription	object	Configuration for input audio transcription, defaults to off and can be set to `null` to turn off once on. Input audio transcription isn't native to the model, since the model consumes audio directly. Transcription runs asynchronously through the Transcriptions endpoint and should be treated as guidance of input audio content rather than precisely what the model heard. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.	No
└─ language	string	The language of the input audio. Supplying the input language in ISO-639-1 (e.g. `en`) format will improve accuracy and latency.	No
└─ model	string	The model to use for transcription, current options are `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `whisper-1`.	No
└─ prompt	string	An optional text to guide the model's style or continue a previous audio segment. For `whisper-1`, the prompt is a list of keywords. For `gpt-4o-transcribe` models, the prompt is a free text string, for example "expect words related to technology".	No
instructions	string	The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The instructions are not guaranteed to be followed by the model, but they provide guidance to the model on the desired behavior. Note that the server sets default instructions which will be used if this field isn't set and are visible in the `session.created` event at the start of the session.	No
max_response_output_tokens	integer or string	Maximum number of output tokens for a single assistant response, inclusive of tool calls. Provide an integer between 1 and 4096 to limit output tokens, or `inf` for the maximum available tokens for a given model. Defaults to `inf`.	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
model	string	The name of the deployment used for this session.	No
output_audio_format	enum	The format of output audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, output audio is sampled at a rate of 24kHz. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
temperature	number	Sampling temperature for the model, limited to [0.6, 1.2]. For audio models a temperature of 0.8 is highly recommended for best performance.	No	0.8
tool_choice	string	How the model chooses tools. Options are `auto`, `none`, `required`, or specify a function.	No	auto
tools	array	Tools (functions) available to the model.	No
turn_detection	object	Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech. Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with `uhhm`, the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.	No
└─ create_response	boolean	Whether or not to automatically generate a response when a VAD stop event occurs.	No	True
└─ eagerness	enum	Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. Possible values: `low`, `medium`, `high`, `auto`	No
└─ interrupt_response	boolean	Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. `conversation` of `auto`) when a VAD start event occurs.	No	True
└─ prefix_padding_ms	integer	Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	enum	Type of turn detection. Possible values: `server_vad`, `semantic_vad`	No
voice	VoiceIdsShared		No
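The documented limits for a session request can be enforced before the request is sent. The following sketch checks a few of them and returns a request body. The function name `build_realtime_session_request` and its parameters are assumptions for illustration, and only a subset of the properties in the table is covered.

```python
from typing import Union

ALLOWED_TURN_DETECTION = {"server_vad", "semantic_vad"}


def build_realtime_session_request(
    deployment: str,
    *,
    temperature: float = 0.8,
    max_response_output_tokens: Union[int, str] = "inf",
    turn_detection_type: str = "server_vad",
    text_only: bool = False,
) -> dict:
    """Validate a few documented limits and return a request body."""
    if not 0.6 <= temperature <= 1.2:
        raise ValueError("temperature must be within [0.6, 1.2]")
    if max_response_output_tokens != "inf":
        is_int = isinstance(max_response_output_tokens, int) and not isinstance(
            max_response_output_tokens, bool
        )
        if not is_int or not 1 <= max_response_output_tokens <= 4096:
            raise ValueError("max_response_output_tokens must be 1-4096 or 'inf'")
    if turn_detection_type not in ALLOWED_TURN_DETECTION:
        raise ValueError("turn_detection type must be server_vad or semantic_vad")

    if turn_detection_type == "server_vad":
        # These fields apply to server_vad only; the values are the documented defaults.
        turn_detection = {
            "type": "server_vad",
            "threshold": 0.5,
            "prefix_padding_ms": 300,
            "silence_duration_ms": 500,
        }
    else:
        # eagerness applies to semantic_vad only; auto is the documented default.
        turn_detection = {"type": "semantic_vad", "eagerness": "auto"}

    body = {
        "model": deployment,
        "temperature": temperature,
        "max_response_output_tokens": max_response_output_tokens,
        "turn_detection": turn_detection,
    }
    if text_only:
        # Restricting modalities to text disables audio output.
        body["modalities"] = ["text"]
    return body
```

Setting `turn_detection` to `null` (Python `None`) instead would turn detection off, in which case the client must trigger responses manually.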

RealtimeSessionCreateResponse

A new Realtime session configuration, with an ephemeral key. Default TTL for keys is one minute.

Name	Type	Description	Required
client_secret	object	Ephemeral key returned by the API.	Yes
└─ expires_at	integer	Timestamp for when the token expires. Currently, all tokens expire after one minute.	No
└─ value	string	Ephemeral key usable in client environments to authenticate connections to the Realtime API. Use this in client-side environments rather than a standard API token, which should only be used server-side.	No
input_audio_format	string	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`.	No
input_audio_transcription	object	Configuration for input audio transcription, defaults to off and can be set to `null` to turn off once on. Input audio transcription isn't native to the model, since the model consumes audio directly. Transcription runs asynchronously through Whisper and should be treated as rough guidance rather than the representation understood by the model.	No
└─ model	string	The model to use for transcription, `whisper-1` is the only currently supported model.	No
instructions	string	The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The instructions are not guaranteed to be followed by the model, but they provide guidance to the model on the desired behavior. Note that the server sets default instructions which will be used if this field isn't set and are visible in the `session.created` event at the start of the session.	No
max_response_output_tokens	integer or string	Maximum number of output tokens for a single assistant response, inclusive of tool calls. Provide an integer between 1 and 4096 to limit output tokens, or `inf` for the maximum available tokens for a given model. Defaults to `inf`.	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
output_audio_format	string	The format of output audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`.	No
temperature	number	Sampling temperature for the model, limited to [0.6, 1.2]. Defaults to 0.8.	No
tool_choice	string	How the model chooses tools. Options are `auto`, `none`, `required`, or specify a function.	No
tools	array	Tools (functions) available to the model.	No
turn_detection	object	Configuration for turn detection. Can be set to `null` to turn off. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.	No
└─ prefix_padding_ms	integer	Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	string	Type of turn detection, only `server_vad` is currently supported.	No
voice	VoiceIdsShared		No
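Because ephemeral keys expire after about one minute, a client may want to check the remaining lifetime before connecting. The sketch below assumes `expires_at` is a Unix timestamp in seconds, which the table doesn't state explicitly. The helper names and the five-second safety margin are hypothetical.

```python
import time
from typing import Optional


def seconds_until_expiry(session: dict, now: Optional[float] = None) -> float:
    """Return the number of seconds left before the ephemeral key expires."""
    expires_at = session["client_secret"]["expires_at"]  # assumed: Unix seconds
    current = time.time() if now is None else now
    return expires_at - current


def is_key_usable(session: dict, now: Optional[float] = None, safety_margin: float = 5.0) -> bool:
    """Treat the key as usable only if more than `safety_margin` seconds remain."""
    return seconds_until_expiry(session, now) > safety_margin
```

If the key is no longer usable, the client would request a new session from its server-side component rather than reuse the old value.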

RealtimeTranscriptionSessionCreateRequest

Realtime transcription session object configuration.

Name	Type	Description	Required	Default
include	array	The set of items to include in the transcription. The only item currently available is `item.input_audio_transcription.logprobs`.	No
input_audio_format	enum	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`. For `pcm16`, input audio must be 16-bit PCM at a 24kHz sample rate, single channel (mono), and little-endian byte order. Possible values: `pcm16`, `g711_ulaw`, `g711_alaw`	No
input_audio_noise_reduction	object	Configuration for input audio noise reduction. This can be set to `null` to turn off. Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model. Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.	No
└─ type	enum	Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones. Possible values: `near_field`, `far_field`	No
input_audio_transcription	object	Configuration for input audio transcription. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.	No
└─ language	string	The language of the input audio. Supplying the input language in ISO-639-1 (e.g. `en`) format will improve accuracy and latency.	No
└─ model	enum	The model to use for transcription, current options are `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, and `whisper-1`. Possible values: `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, `whisper-1`	No
└─ prompt	string	An optional text to guide the model's style or continue a previous audio segment. For `whisper-1`, the prompt is a list of keywords. For `gpt-4o-transcribe` models, the prompt is a free text string, for example "expect words related to technology".	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
turn_detection	object	Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger model response. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech. Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with `uhhm`, the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.	No
└─ create_response	boolean	Whether or not to automatically generate a response when a VAD stop event occurs. Not available for transcription sessions.	No	True
└─ eagerness	enum	Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. Possible values: `low`, `medium`, `high`, `auto`	No
└─ interrupt_response	boolean	Whether or not to automatically interrupt any ongoing response with output to the default conversation (i.e. `conversation` of `auto`) when a VAD start event occurs. Not available for transcription sessions.	No	True
└─ prefix_padding_ms	integer	Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	enum	Type of turn detection. Possible values: `server_vad`, `semantic_vad`	No

RealtimeTranscriptionSessionCreateResponse

A new Realtime transcription session configuration.

When a session is created on the server via REST API, the session object also contains an ephemeral key. Default TTL for keys is one minute. This property isn't present when a session is updated via the WebSocket API.

Name	Type	Description	Required
client_secret	object	Ephemeral key returned by the API. Only present when the session is created on the server via REST API.	Yes
└─ expires_at	integer	Timestamp for when the token expires. Currently, all tokens expire after one minute.	No
└─ value	string	Ephemeral key usable in client environments to authenticate connections to the Realtime API. Use this in client-side environments rather than a standard API token, which should only be used server-side.	No
input_audio_format	string	The format of input audio. Options are `pcm16`, `g711_ulaw`, or `g711_alaw`.	No
input_audio_transcription	object	Configuration of the transcription model.	No
└─ language	string	The language of the input audio. Supplying the input language in ISO-639-1 (e.g. `en`) format will improve accuracy and latency.	No
└─ model	enum	The model to use for transcription. Can be `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, or `whisper-1`. Possible values: `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, `whisper-1`	No
└─ prompt	string	An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.	No
modalities		The set of modalities the model can respond with. To disable audio, set this to ["text"].	No
turn_detection	object	Configuration for turn detection. Can be set to `null` to turn off. Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.	No
└─ prefix_padding_ms	integer	Amount of audio to include before the VAD detected speech (in milliseconds). Defaults to 300ms.	No
└─ silence_duration_ms	integer	Duration of silence to detect speech stop (in milliseconds). Defaults to 500ms. With shorter values the model will respond more quickly, but may jump in on short pauses from the user.	No
└─ threshold	number	Activation threshold for VAD (0.0 to 1.0), this defaults to 0.5. A higher threshold will require louder audio to activate the model, and thus might perform better in noisy environments.	No
└─ type	string	Type of turn detection, only `server_vad` is currently supported.	No

Reasoning

o-series models only

Configuration options for reasoning models.

Name	Type	Description	Required	Default
effort	ReasoningEffort	o-series models only. Constrains effort on reasoning for reasoning models. Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.	Yes	medium
summary	enum	o-series models only. A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise` or `detailed`. Possible values: `concise`, `detailed`	No

ReasoningItem

A description of the chain of thought used by a reasoning model while generating a response.

Name	Type	Description	Required
content	array	Reasoning text contents.	Yes
id	string	The unique identifier of the reasoning content.	Yes
status	enum	The status of the item. One of `in_progress`, `completed`, or `incomplete`. Populated when items are returned via API. Possible values: `in_progress`, `completed`, `incomplete`	No
type	enum	The type of the object. Always `reasoning`. Possible values: `reasoning`	Yes

Refusal

A refusal from the model.

Name	Type	Description	Required	Default
refusal	string	The refusal explanation from the model.	Yes
type	enum	The type of the refusal. Always `refusal`. Possible values: `refusal`	Yes

response

Name	Type	Description	Required	Default
created_at	number	Unix timestamp (in seconds) of when this Response was created.	Yes
error	ResponseError	An error object returned when the model fails to generate a Response.	Yes
id	string	Unique identifier for this Response.	Yes
incomplete_details	object	Details about why the response is incomplete.	Yes
└─ reason	enum	The reason why the response is incomplete. Possible values: `max_output_tokens`, `content_filter`	No
instructions	string	Inserts a system (or developer) message as the first item in the model's context. When used along with `previous_response_id`, the instructions from a previous response are not carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses.	Yes
max_output_tokens	integer	An upper bound for the number of tokens that can be generated for a response, including visible output tokens and conversation state.	No
metadata	Metadata	Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.	Yes
model	string	Model used to generate the responses.	Yes
object	enum	The object type of this resource - always set to `response`. Possible values: `response`	Yes
output	array	An array of content items generated by the model. The length and order of items in the `output` array depend on the model's response. Rather than accessing the first item in the `output` array and assuming it's an `assistant` message with the content generated by the model, consider using the `output_text` property where supported in SDKs.	Yes
output_text	string	SDK-only convenience property that contains the aggregated text output from all `output_text` items in the `output` array, if any are present. Supported in the Python and JavaScript SDKs.	No
parallel_tool_calls	boolean	Whether to allow the model to run tool calls in parallel.	Yes	True
previous_response_id	string	The unique ID of the previous response to the model. Use this to create multi-turn conversations.	No
reasoning	Reasoning	o-series models only. Configuration options for reasoning models.	No
status	enum	The status of the response generation. One of `completed`, `failed`, `in_progress`, or `incomplete`. Possible values: `completed`, `failed`, `in_progress`, `incomplete`	No
temperature	number	What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.	Yes	1
text	object	Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more: - text inputs and outputs - Structured Outputs	No
└─ format	TextResponseFormatConfiguration	An object specifying the format that the model must output. Configuring `{ "type": "json_schema" }` enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is `{ "type": "text" }` with no additional options. Not recommended for gpt-4o and newer models: Setting to `{ "type": "json_object" }` enables the older JSON mode, which ensures the message the model generates is valid JSON. Using `json_schema` is preferred for models that support it.	No
tool_choice	ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction	How the model should select which tool (or tools) to use when generating a response. See the `tools` parameter to see how to specify which tools the model can call.	Yes
tools	array	An array of tools the model may call while generating a response. You can specify which tool to use by setting the `tool_choice` parameter. The two categories of tools you can provide the model are: - Built-in tools	Yes
top_p	number	An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature` but not both.	Yes	1
truncation	enum	The truncation strategy to use for the model response. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: `auto`, `disabled`	No
usage	ResponseUsage	Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.	No
user	string	A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse.	No
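When working with the raw REST payload rather than an SDK, the aggregated text can be reconstructed from the `output` array. The sketch below is one way to do this, assuming the response has already been parsed into a Python dictionary. The function name `collect_output_text` is hypothetical, and joining the parts without a separator is an assumption about how the SDK convenience property behaves.

```python
def collect_output_text(response: dict) -> str:
    """Concatenate the text of every `output_text` part found in message items."""
    parts = []
    for item in response.get("output", []):
        # Skip reasoning items, tool calls, and anything else that isn't a message.
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            # Message content can also hold refusals, which carry no `text` field.
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)
```

This avoids assuming that the first entry in `output` is the assistant message, since the order and number of items depend on the model's response.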

ResponseAudioDeltaEvent

Emitted when there is a partial audio response.

Name	Type	Description	Required	Default
delta	string	A chunk of Base64 encoded response audio bytes.	Yes
type	enum	The type of the event. Always `response.audio.delta`. Possible values: `response.audio.delta`	Yes

ResponseAudioDoneEvent

Emitted when the audio response is complete.

Name	Type	Description	Required	Default
type	enum	The type of the event. Always `response.audio.done`. Possible values: `response.audio.done`	Yes
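A client typically buffers the audio chunks until the done event arrives. The following sketch assumes events have already been parsed into dictionaries and that each `delta` is independently decodable base64, which this reference doesn't guarantee. The function name `assemble_audio` is hypothetical.

```python
import base64


def assemble_audio(events) -> bytes:
    """Decode and concatenate audio deltas until `response.audio.done` is seen."""
    audio = bytearray()
    for event in events:
        kind = event.get("type")
        if kind == "response.audio.delta":
            audio.extend(base64.b64decode(event["delta"]))
        elif kind == "response.audio.done":
            # Ignore anything that follows the done event.
            break
    return bytes(audio)
```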

ResponseAudioTranscriptDeltaEvent

Emitted when there is a partial transcript of audio.

Name	Type	Description	Required	Default
delta	string	The partial transcript of the audio response.	Yes
type	enum	The type of the event. Always `response.audio.transcript.delta`. Possible values: `response.audio.transcript.delta`	Yes

ResponseAudioTranscriptDoneEvent

Emitted when the full audio transcript is completed.

Name	Type	Description	Required	Default
type	enum	The type of the event. Always `response.audio.transcript.done`. Possible values: `response.audio.transcript.done`	Yes

ResponseCodeInterpreterCallCodeDeltaEvent

Emitted when a partial code snippet is added by the code interpreter.

Name	Type	Description	Required
delta	string	The partial code snippet added by the code interpreter.	Yes
output_index	integer	The index of the output item associated with the code interpreter call.	Yes
type	enum	The type of the event. Always `response.code_interpreter_call.code.delta`. Possible values: `response.code_interpreter_call.code.delta`	Yes

ResponseCodeInterpreterCallCodeDoneEvent

Emitted when code snippet output is finalized by the code interpreter.

Name	Type	Description	Required
code	string	The final code snippet output by the code interpreter.	Yes
output_index	integer	The index of the output item associated with the code interpreter call.	Yes
type	enum	The type of the event. Always `response.code_interpreter_call.code.done`. Possible values: `response.code_interpreter_call.code.done`	Yes
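The delta and done events can be combined to track the code for each output item. In the sketch below, deltas are appended per `output_index` and the `code` from the done event replaces the buffer. Treating the done event as authoritative is an assumption, and the function name `accumulate_code` is hypothetical.

```python
from collections import defaultdict

DELTA = "response.code_interpreter_call.code.delta"
DONE = "response.code_interpreter_call.code.done"


def accumulate_code(events) -> dict:
    """Map each output_index to the code snippet assembled from the stream."""
    buffers = defaultdict(str)
    for event in events:
        kind = event.get("type")
        if kind == DELTA:
            buffers[event["output_index"]] += event["delta"]
        elif kind == DONE:
            # Prefer the finalized snippet over the locally assembled one.
            buffers[event["output_index"]] = event["code"]
    return dict(buffers)
```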

ResponseCodeInterpreterCallCompletedEvent

Emitted when the code interpreter call is completed.

Name	Type	Description	Required
code_interpreter_call	CodeInterpreterToolCall	A tool call to run code.	Yes
output_index	integer	The index of the output item associated with the code interpreter call.	Yes
type	enum	The type of the event. Always `response.code_interpreter_call.completed`. Possible values: `response.code_interpreter_call.completed`	Yes

ResponseCodeInterpreterCallInProgressEvent

Emitted when a code interpreter call is in progress.

Name	Type	Description	Required
code_interpreter_call	CodeInterpreterToolCall	A tool call to run code.	Yes
output_index	integer	The index of the output item associated with the code interpreter call.	Yes
type	enum	The type of the event. Always `response.code_interpreter_call.in_progress`. Possible values: `response.code_interpreter_call.in_progress`	Yes

ResponseCodeInterpreterCallInterpretingEvent

Emitted when the code interpreter is actively interpreting the code snippet.

Name	Type	Description	Required
code_interpreter_call	CodeInterpreterToolCall	A tool call to run code.	Yes
output_index	integer	The index of the output item associated with the code interpreter call.	Yes
type	enum	The type of the event. Always `response.code_interpreter_call.interpreting`. Possible values: `response.code_interpreter_call.interpreting`	Yes

ResponseCompletedEvent

Emitted when the model response is complete.

Name	Type	Description	Required	Default
response	response		Yes
type	enum	The type of the event. Always `response.completed`. Possible values: `response.completed`	Yes

ResponseContentPartAddedEvent

Emitted when a new content part is added.

Name	Type	Description	Required
content_index	integer	The index of the content part that was added.	Yes
item_id	string	The ID of the output item that the content part was added to.	Yes
output_index	integer	The index of the output item that the content part was added to.	Yes
part	OutputContent		Yes
type	enum	The type of the event. Always `response.content_part.added`. Possible values: `response.content_part.added`	Yes

ResponseContentPartDoneEvent

Emitted when a content part is done.

Name	Type	Description	Required
content_index	integer	The index of the content part that is done.	Yes
item_id	string	The ID of the output item that the content part was added to.	Yes
output_index	integer	The index of the output item that the content part was added to.	Yes
part	OutputContent		Yes
type	enum	The type of the event. Always `response.content_part.done`. Possible values: `response.content_part.done`	Yes

ResponseCreatedEvent

An event that is emitted when a response is created.

Name	Type	Description	Required	Default
response	response		Yes
type	enum	The type of the event. Always `response.created`. Possible values: `response.created`	Yes

ResponseError

An error object returned when the model fails to generate a Response.

Name	Type	Description	Required	Default
code	ResponseErrorCode	The error code for the response.	Yes
message	string	A human-readable description of the error.	Yes

ResponseErrorCode

The error code for the response.

Property	Value
Description	The error code for the response.
Type	string
Values	`server_error` `rate_limit_exceeded` `invalid_prompt` `vector_store_timeout` `invalid_image` `invalid_image_format` `invalid_base64_image` `invalid_image_url` `image_too_large` `image_too_small` `image_parse_error` `image_content_policy_violation` `invalid_image_mode` `image_file_too_large` `unsupported_image_media_type` `empty_image_file` `failed_to_download_image` `image_file_not_found`
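A client will often branch on `code` when it receives a `ResponseError`. The sketch below groups the documented values into coarse buckets. The bucket names and the choice of which codes count as retryable are assumptions made for illustration; this reference does not say which errors are transient.

```python
# Error codes copied from the ResponseErrorCode values listed above.
RESPONSE_ERROR_CODES = frozenset({
    "server_error", "rate_limit_exceeded", "invalid_prompt",
    "vector_store_timeout", "invalid_image", "invalid_image_format",
    "invalid_base64_image", "invalid_image_url", "image_too_large",
    "image_too_small", "image_parse_error",
    "image_content_policy_violation", "invalid_image_mode",
    "image_file_too_large", "unsupported_image_media_type",
    "empty_image_file", "failed_to_download_image", "image_file_not_found",
})

# Assumption: these codes are treated as transient. The reference does not
# state which codes are safe to retry.
ASSUMED_RETRYABLE = frozenset({
    "server_error", "rate_limit_exceeded", "vector_store_timeout",
})


def classify_response_error(error: dict) -> str:
    """Map a ResponseError-shaped dict to a coarse, hypothetical bucket."""
    code = error["code"]
    if code not in RESPONSE_ERROR_CODES:
        return "unknown"
    if code in ASSUMED_RETRYABLE:
        return "retryable"
    if "image" in code:
        return "image"
    return "other"
```

For example, `classify_response_error({"code": "image_too_large", "message": "..."})` falls into the `image` bucket, which a caller might surface as a problem with the input rather than retrying.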

ResponseErrorEvent

Emitted when an error occurs.

Name	Type	Description	Required
code	string	The error code.	Yes
message	string	The error message.	Yes
param	string	The error parameter.	Yes
type	enum	The type of the event. Always `error`. Possible values: `error`	Yes

ResponseFailedEvent

An event that is emitted when a response fails.

Name	Type	Description	Required	Default
response	response		Yes
type	enum	The type of the event. Always `response.failed`. Possible values: `response.failed`	Yes

ResponseFileSearchCallCompletedEvent

Emitted when a file search call is completed (results found).

Name	Type	Description	Required
item_id	string	The ID of the output item for the file search call.	Yes
output_index	integer	The index of the output item for the file search call.	Yes
type	enum	The type of the event. Always `response.file_search_call.completed`. Possible values: `response.file_search_call.completed`	Yes

ResponseFileSearchCallInProgressEvent

Emitted when a file search call is initiated.

Name	Type	Description	Required
item_id	string	The ID of the output item for the file search call.	Yes
output_index	integer	The index of the output item for the file search call.	Yes
type	enum	The type of the event. Always `response.file_search_call.in_progress`. Possible values: `response.file_search_call.in_progress`	Yes

ResponseFileSearchCallSearchingEvent

Emitted when a file search is currently searching.

Name	Type	Description	Required
item_id	string	The ID of the output item for the file search call.	Yes
output_index	integer	The index of the output item for the file search call.	Yes
type	enum	The type of the event. Always `response.file_search_call.searching`. Possible values: `response.file_search_call.searching`	Yes

ResponseFunctionCallArgumentsDeltaEvent

Emitted when there is a partial function-call arguments delta.

Name	Type	Description	Required
delta	string	The function-call arguments delta that is added.	Yes
item_id	string	The ID of the output item that the function-call arguments delta is added to.	Yes
output_index	integer	The index of the output item that the function-call arguments delta is added to.	Yes
type	enum	The type of the event. Always `response.function_call_arguments.delta`. Possible values: `response.function_call_arguments.delta`	Yes

ResponseFunctionCallArgumentsDoneEvent

Emitted when function-call arguments are finalized.

Name	Type	Description	Required
arguments	string	The function-call arguments.	Yes
item_id	string	The ID of the item.	Yes
output_index	integer	The index of the output item.	Yes
type	enum	The type of the event. Always `response.function_call_arguments.done`. Possible values: `response.function_call_arguments.done`	Yes
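The delta and done events for function-call arguments can be combined on the client as sketched below. The events are modeled as plain dicts with the fields from the two tables above. Treating `arguments` as a JSON-encoded string is an assumption, and `collect_function_call_arguments` is a hypothetical helper name.

```python
import json


def collect_function_call_arguments(events):
    """Return (finished, pending) for a sequence of stream events.

    finished maps item_id to the parsed arguments of completed calls.
    pending maps item_id to the partial argument text of calls that
    have not yet produced a done event.
    """
    buffers = {}   # item_id -> list of delta strings received so far
    finished = {}  # item_id -> parsed arguments
    for event in events:
        kind = event.get("type")
        if kind == "response.function_call_arguments.delta":
            buffers.setdefault(event["item_id"], []).append(event["delta"])
        elif kind == "response.function_call_arguments.done":
            # The done event carries the full argument string, so the
            # buffered deltas for this item are no longer needed.
            buffers.pop(event["item_id"], None)
            # Assumption: the arguments string is JSON.
            finished[event["item_id"]] = json.loads(event["arguments"])
    pending = {item_id: "".join(chunks) for item_id, chunks in buffers.items()}
    return finished, pending
```

Keeping the partial text in `pending` lets a UI show arguments as they stream in, while `finished` holds only values that were parsed from a complete string.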

ResponseInProgressEvent

Emitted when the response is in progress.

Name	Type	Description	Required	Default
response	response		Yes
type	enum	The type of the event. Always `response.in_progress`. Possible values: `response.in_progress`	Yes

ResponseIncompleteEvent

An event that is emitted when a response finishes as incomplete.

Name	Type	Description	Required	Default
response	response		Yes
type	enum	The type of the event. Always `response.incomplete`. Possible values: `response.incomplete`	Yes
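Several of the events above carry a full `response` object. A stream consumer usually wants to know when the response has reached a final state. The sketch below assumes that `response.completed`, `response.failed`, and `response.incomplete` are the final events of a stream, which follows from their descriptions but is not stated explicitly here. `find_terminal_event` is a hypothetical helper.

```python
# Event types assumed to end a response, mapped to a short outcome label.
TERMINAL_EVENT_TYPES = {
    "response.completed": "completed",
    "response.failed": "failed",
    "response.incomplete": "incomplete",
}


def find_terminal_event(events):
    """Return (outcome, response) for the first terminal event, else None."""
    for event in events:
        outcome = TERMINAL_EVENT_TYPES.get(event.get("type"))
        if outcome is not None:
            return outcome, event["response"]
    return None
```

Events such as `response.created` and `response.in_progress` also carry a `response`, but the helper skips them because they describe an unfinished state.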

responseItemList

A list of Response items.

Name	Type	Description	Required
data	array	A list of items used to generate this response.	Yes
first_id	string	The ID of the first item in the list.	Yes
has_more	boolean	Whether there are more items available.	Yes
last_id	string	The ID of the last item in the list.	Yes
object	enum	The type of object returned, must be `list`. Possible values: `list`	Yes
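The `has_more` and `last_id` fields support cursor-style paging. The sketch below walks every page of a `responseItemList`. `fetch_page` is a hypothetical callable supplied by the caller, and passing `last_id` back as the cursor for the next request is an assumption about how the list endpoint is paged.

```python
def iter_response_items(fetch_page):
    """Yield every item across pages of a responseItemList.

    fetch_page(cursor) is a caller-supplied function (hypothetical) that
    returns one page as a dict. cursor is None for the first page.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        if page.get("object") != "list":
            raise ValueError("expected an object of type 'list'")
        yield from page["data"]
        if not page["has_more"]:
            return
        # Assumption: the last ID of this page is the cursor for the next.
        cursor = page["last_id"]
```

Because the function is a generator, callers can stop early without fetching the remaining pages.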

ResponseModalities

Output types that you would like the model to generate. Most models are capable of generating text, which is the default:

["text"]

The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use:

["text", "audio"]

No properties defined for this component.

ResponseModalitiesTextOnly

Output types that you would like the model to generate. Most models are capable of generating text, which is the default:

["text"]

This API will soon support other output modalities, including audio and images.

No properties defined for this component.

ResponseOutputItemAddedEvent

Emitted when a new output item is added.

Name	Type	Description	Required
item	OutputItem		Yes
output_index	integer	The index of the output item that was added.	Yes
type	enum	The type of the event. Always `response.output_item.added`. Possible values: `response.output_item.added`	Yes

ResponseOutputItemDoneEvent

Emitted when an output item is marked done.

Name	Type	Description	Required
item	OutputItem		Yes
output_index	integer	The index of the output item that was marked done.	Yes
type	enum	The type of the event. Always `response.output_item.done`. Possible values: `response.output_item.done`	Yes

ResponseProperties

Name	Type	Description	Required
instructions	string	Inserts a system (or developer) message as the first item in the model's context. When used along with `previous_response_id`, the instructions from a previous response will not be carried over to the next response. This makes it simple to swap out system (or developer) messages in new responses.	No
max_output_tokens	integer	An upper bound for the number of tokens that can be generated for a response, including visible output tokens and conversation state.	No
previous_response_id	string	The unique ID of the previous response to the model. Use this to create multi-turn conversations.	No
reasoning	Reasoning	o-series models only. Configuration options for reasoning models.	No
text	object	Configuration options for a text response from the model. Can be plain text or structured JSON data. Learn more about text inputs and outputs and about Structured Outputs.	No
└─ format	TextResponseFormatConfiguration	An object specifying the format that the model must output. Configuring `{ "type": "json_schema" }` enables Structured Outputs, which ensures the model will match your supplied JSON schema. The default format is `{ "type": "text" }` with no additional options. Not recommended for gpt-4o and newer models: Setting to `{ "type": "json_object" }` enables the older JSON mode, which ensures the message the model generates is valid JSON. Using `json_schema` is preferred for models that support it.	No
tool_choice	ToolChoiceOptions or ToolChoiceTypes or ToolChoiceFunction	How the model should select which tool (or tools) to use when generating a response. See the `tools` parameter to see how to specify which tools the model can call.	No
tools	array	An array of tools the model may call while generating a response. You can specify which tool to use by setting the `tool_choice` parameter. The two categories of tools you can provide the model are: - Built-in tools	No
truncation	enum	The truncation strategy to use for the model response. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. Possible values: `auto`, `disabled`	No
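The optional fields in the table above can be assembled into a request fragment as sketched below. `build_response_properties` is a hypothetical helper, and requiring `max_output_tokens` to be a positive integer is an assumption. Only fields that are explicitly set are included, so server-side defaults such as `truncation: disabled` still apply.

```python
TRUNCATION_VALUES = ("auto", "disabled")


def build_response_properties(instructions=None, max_output_tokens=None,
                              previous_response_id=None, truncation=None,
                              text_format=None):
    """Return a dict containing only the ResponseProperties that were set."""
    body = {}
    if instructions is not None:
        body["instructions"] = instructions
    if max_output_tokens is not None:
        # Assumption: the upper bound must be a positive integer.
        if (isinstance(max_output_tokens, bool)
                or not isinstance(max_output_tokens, int)
                or max_output_tokens < 1):
            raise ValueError("max_output_tokens must be a positive integer")
        body["max_output_tokens"] = max_output_tokens
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    if truncation is not None:
        if truncation not in TRUNCATION_VALUES:
            raise ValueError(f"truncation must be one of {TRUNCATION_VALUES}")
        body["truncation"] = truncation
    if text_format is not None:
        # text.format holds a TextResponseFormatConfiguration object.
        body["text"] = {"format": text_format}
    return body
```

Because instructions are not carried over from a previous response, a follow-up turn that should keep the same system message needs to pass `instructions` again alongside `previous_response_id`.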

ResponseRefusalDeltaEvent

Emitted when there is a partial refusal text.

Name	Type	Description	Required
content_index	integer	The index of the content part that the refusal text is added to.	Yes
delta	string	The refusal text that is added.	Yes
item_id	string	The ID of the output item that the refusal text is added to.	Yes
output_index	integer	The index of the output item that the refusal text is added to.	Yes
type	enum	The type of the event. Always `response.refusal.delta`. Possible values: `response.refusal.delta`	Yes

ResponseRefusalDoneEvent

Emitted when refusal text is finalized.

Name	Type	Description	Required
content_index	integer	The index of the content part in which the refusal text is finalized.	Yes
item_id	string	The ID of the output item in which the refusal text is finalized.	Yes
output_index	integer	The index of the output item in which the refusal text is finalized.	Yes
refusal	string	The refusal text that is finalized.	Yes
type	enum	The type of the event. Always `response.refusal.done`. Possible values: `response.refusal.done`	Yes

responseStreamEvent

Name	Type	Description	Required
annotation	Annotation		Yes
annotation_index	integer	The index of the annotation that was added.	Yes
arguments	string	The function-call arguments.	Yes
code	string	The error code.	Yes
code_interpreter_call	CodeInterpreterToolCall	A tool call to run code.	Yes
content_index	integer	The index of the content part in which the text content is finalized.	Yes
delta	string	The text delta that was added.	Yes
item	OutputItem	The output item that was marked done.	Yes
item_id	string	The ID of the output item in which the text content is finalized.	Yes
message	string	The error message.	Yes
output_index	integer	The index of the output item in which the text content is finalized.	Yes
param	string	The error parameter.	Yes
part	OutputContent	The content part that is done.	Yes
refusal	string	The refusal text that is finalized.	Yes
response	response	The response that was incomplete.	Yes
text	string	The text content that is finalized.	Yes
type	enum	The type of the event. Always `response.output_text.done`. Possible values: `response.output_text.done`	Yes

ResponseTextAnnotationDeltaEvent

Emitted when a text annotation is added.

Name	Type	Description	Required
annotation	Annotation		Yes
annotation_index	integer	The index of the annotation that was added.	Yes
content_index	integer	The index of the content part that the text annotation was added to.	Yes
item_id	string	The ID of the output item that the text annotation was added to.	Yes
output_index	integer	The index of the output item that the text annotation was added to.	Yes
type	enum	The type of the event. Always `response.output_text.annotation.added`. Possible values: `response.output_text.annotation.added`	Yes

ResponseTextDeltaEvent

Emitted when there is an additional text delta.

Name	Type	Description	Required
content_index	integer	The index of the content part that the text delta was added to.	Yes
delta	string	The text delta that was added.	Yes
item_id	string	The ID of the output item that the text delta was added to.	Yes
output_index	integer	The index of the output item that the text delta was added to.	Yes
type	enum	The type of the event. Always `response.output_text.delta`. Possible values: `response.output_text.delta`	Yes

ResponseTextDoneEvent

Emitted when text content is finalized.

Name	Type	Description	Required
content_index	integer	The index of the content part in which the text content is finalized.	Yes
item_id	string	The ID of the output item in which the text content is finalized.	Yes
output_index	integer	The index of the output item in which the text content is finalized.	Yes
text	string	The text content that is finalized.	Yes
type	enum	The type of the event. Always `response.output_text.done`. Possible values: `response.output_text.done`	Yes
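Text delta and done events can be stitched together per content part. The sketch below keys text on the pair (`item_id`, `content_index`) and prefers the final `text` from the done event when one arrives. Events are modeled as plain dicts, and `assemble_output_text` is a hypothetical helper name.

```python
from collections import defaultdict


def assemble_output_text(events):
    """Return {(item_id, content_index): text} from output_text events."""
    chunks = defaultdict(list)  # deltas received so far, per content part
    final = {}                  # finalized text, per content part
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            key = (event["item_id"], event["content_index"])
            chunks[key].append(event["delta"])
        elif kind == "response.output_text.done":
            key = (event["item_id"], event["content_index"])
            final[key] = event["text"]
    # Content parts without a done event fall back to their joined deltas.
    for key, parts in chunks.items():
        final.setdefault(key, "".join(parts))
    return final
```

Events of other types are ignored, so the function can be fed a full, unfiltered event stream.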

ResponseUsage

Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.

Name	Type	Description	Required
input_tokens	integer	The number of input tokens.	Yes
output_tokens	integer	The number of output tokens.	Yes
output_tokens_details	object	A detailed breakdown of the output tokens.	Yes
└─ reasoning_tokens	integer	The number of reasoning tokens.	No
total_tokens	integer	The total number of tokens used.	Yes
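A `ResponseUsage` payload can be read into a small typed structure as sketched below. The `Usage` class and `parse_usage` function are hypothetical. Because `reasoning_tokens` is described as part of the breakdown of output tokens, the sketch assumes it is already counted within `output_tokens`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    total_tokens: int

    @property
    def non_reasoning_output_tokens(self) -> int:
        # Assumption: reasoning tokens are included in output_tokens.
        return self.output_tokens - self.reasoning_tokens


def parse_usage(payload: dict) -> Usage:
    """Build a Usage object from a ResponseUsage-shaped dict."""
    details = payload["output_tokens_details"]
    return Usage(
        input_tokens=payload["input_tokens"],
        output_tokens=payload["output_tokens"],
        # reasoning_tokens is optional in the table above, so default to 0.
        reasoning_tokens=details.get("reasoning_tokens", 0),
        total_tokens=payload["total_tokens"],
    )
```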

Screenshot

A screenshot action.

Name	Type	Description	Required	Default
type	enum	Specifies the event type. For a screenshot action, this property is always set to `screenshot`. Possible values: `screenshot`	Yes

Scroll

A scroll action.

Name	Type	Description	Required
scroll_x	integer	The horizontal scroll distance.	Yes
scroll_y	integer	The vertical scroll distance.	Yes
type	enum	Specifies the event type. For a scroll action, this property is always set to `scroll`. Possible values: `scroll`	Yes
x	integer	The x-coordinate where the scroll occurred.	Yes
y	integer	The y-coordinate where the scroll occurred.	Yes
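Before executing an action, a client may want to check it against the tables above. The sketch below covers only the `screenshot` and `scroll` actions. `validate_action` is a hypothetical helper and actions are modeled as plain dicts.

```python
# Required integer fields per action type, taken from the Screenshot and
# Scroll tables above.
REQUIRED_INT_FIELDS = {
    "screenshot": (),
    "scroll": ("scroll_x", "scroll_y", "x", "y"),
}


def validate_action(action: dict) -> dict:
    """Return the action unchanged if it has the required fields."""
    action_type = action.get("type")
    if action_type not in REQUIRED_INT_FIELDS:
        raise ValueError(f"unsupported action type: {action_type!r}")
    for field in REQUIRED_INT_FIELDS[action_type]:
        value = action.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{field} must be an integer")
    return action
```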

StopConfiguration

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

This component can be one of the following:

TextResponseFormatConfiguration

An object specifying the format that the model must output.

Configuring { "type": "json_schema" } enables Structured Outputs, which ensure the model will match your supplied JSON schema.

The default format is { "type": "text" } with no additional options.

Not recommended for gpt-4o and newer models:

Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.

This component can be one of the following:

ResponseFormatText
TextResponseFormatJsonSchema
ResponseFormatJsonObject

TextResponseFormatJsonSchema

JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.

Name	Type	Description	Required	Default
description	string	A description of what the response format is for, used by the model to determine how to respond in the format.	No
name	string	The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.	No
schema	ResponseFormatJsonSchemaSchema	The schema for the response format, described as a JSON Schema object.	Yes
strict	boolean	Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is `true`.	No	False
type	enum	The type of response format being defined. Always `json_schema`. Possible values: `json_schema`	Yes
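The naming rule and required fields above can be checked on the client before a request is sent. The sketch below builds a `json_schema` format object. `build_json_schema_format` is a hypothetical helper, and the regular expression is one reading of the stated rule (letters, digits, underscores, and dashes, at most 64 characters).

```python
import re

# Letters, digits, underscores, and dashes, with a maximum length of 64.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_json_schema_format(schema, name=None, description=None, strict=False):
    """Return a TextResponseFormatJsonSchema-shaped dict."""
    if not isinstance(schema, dict):
        raise TypeError("schema must be a JSON Schema object (dict)")
    fmt = {"type": "json_schema", "schema": schema, "strict": strict}
    if name is not None:
        if not _NAME_PATTERN.fullmatch(name):
            raise ValueError("name must match [A-Za-z0-9_-] and be 1-64 characters")
        fmt["name"] = name
    if description is not None:
        fmt["description"] = description
    return fmt
```

The returned object is intended for the `format` property under `text` in a response request. Note that with `strict=True` only a subset of JSON Schema is supported, which this sketch does not check.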

Tool

This component can be one of the following:

FileSearchTool
FunctionTool
ComputerTool

ToolChoiceFunction

Use this option to force the model to call a specific function.

Name	Type	Description	Required	Default
name	string	The name of the function to call.	Yes
type	enum	For function calling, the type is always `function`. Possible values: `function`	Yes

ToolChoiceOptions

Controls which (if any) tool is called by the model.

none means the model will not call any tool and instead generates a message.

auto means the model can pick between generating a message or calling one or more tools.

required means the model must call one or more tools.

Property	Value
Description	Controls which (if any) tool is called by the model. `none` means the model will not call any tool and instead generates a message. `auto` means the model can pick between generating a message or calling one or more tools. `required` means the model must call one or more tools.
Type	string
Values	`none` `auto` `required`

ToolChoiceTypes

Indicates that the model should use a built-in tool to generate a response.

Name	Type	Description	Required	Default
type	enum	The type of hosted tool the model should use. Possible values: `file_search`, `computer_use_preview`	Yes
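The three shapes accepted by `tool_choice` (`ToolChoiceOptions`, `ToolChoiceTypes`, and `ToolChoiceFunction`) can be produced from one small helper, as sketched below. `make_tool_choice` is a hypothetical name, and the values it accepts are the ones listed in the tables above.

```python
TOOL_CHOICE_OPTIONS = ("none", "auto", "required")
HOSTED_TOOL_TYPES = ("file_search", "computer_use_preview")


def make_tool_choice(choice, function_name=None):
    """Return a value suitable for the tool_choice parameter."""
    if choice in TOOL_CHOICE_OPTIONS:
        # ToolChoiceOptions is a bare string.
        return choice
    if choice in HOSTED_TOOL_TYPES:
        # ToolChoiceTypes selects a built-in tool by type.
        return {"type": choice}
    if choice == "function":
        # ToolChoiceFunction forces a call to one named function.
        if not function_name:
            raise ValueError("function_name is required when choice is 'function'")
        return {"type": "function", "name": function_name}
    raise ValueError(f"unsupported tool choice: {choice!r}")
```

For instance, `make_tool_choice("function", "get_weather")` yields `{"type": "function", "name": "get_weather"}`, where `get_weather` stands in for a function declared in the request's `tools` array.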

Type

An action to type in text.

Name	Type	Description	Required	Default
text	string	The text to type.	Yes
type	enum	Specifies the event type. For a type action, this property is always set to `type`. Possible values: `type`	Yes

UpdateVectorStoreFileAttributesRequest

Name	Type	Description	Required	Default
attributes	VectorStoreFileAttributes	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.	Yes

UrlCitation

A citation for a web resource used to generate a model response.

Name	Type	Description	Required
end_index	integer	The index of the last character of the URL citation in the message.	Yes
start_index	integer	The index of the first character of the URL citation in the message.	Yes
title	string	The title of the web resource.	Yes
type	enum	The type of the URL citation. Always `url_citation`. Possible values: `url_citation`	Yes
url	string	The URL of the web resource.	Yes

VectorStoreFileAttributes

Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.

No properties defined for this component.
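The limits described for `VectorStoreFileAttributes` can be enforced before a request is sent. The sketch below reads the description as allowing at most 16 pairs, which is an assumption about the intended meaning. `validate_attributes` is a hypothetical helper.

```python
MAX_PAIRS = 16
MAX_KEY_LENGTH = 64
MAX_STRING_VALUE_LENGTH = 512


def validate_attributes(attributes: dict) -> dict:
    """Return attributes unchanged if they satisfy the documented limits."""
    # Assumption: "16 key-value pairs" is an upper bound, not an exact count.
    if len(attributes) > MAX_PAIRS:
        raise ValueError(f"at most {MAX_PAIRS} key-value pairs are allowed")
    for key, value in attributes.items():
        if not isinstance(key, str) or len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"key {key!r} must be a string of at most {MAX_KEY_LENGTH} characters")
        if isinstance(value, str):
            if len(value) > MAX_STRING_VALUE_LENGTH:
                raise ValueError(f"value for {key!r} exceeds {MAX_STRING_VALUE_LENGTH} characters")
        elif not isinstance(value, (bool, int, float)):
            raise ValueError(f"value for {key!r} must be a string, boolean, or number")
    return attributes
```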

VectorStoreFileContentResponse

Represents the parsed content of a vector store file.

Name	Type	Description	Required
data	array	Parsed content of the file.	Yes
has_more	boolean	Indicates if there are more content pages to fetch.	Yes
next_page	string	The token for the next page, if any.	Yes
object	enum	The object type, which is always `vector_store.file_content.page`. Possible values: `vector_store.file_content.page`	Yes

VectorStoreSearchRequest

Name	Type	Description	Required	Default
filters	ComparisonFilter or CompoundFilter	A filter to apply based on file attributes.	No
max_num_results	integer	The maximum number of results to return. This number should be between 1 and 50 inclusive.	No	10
query	string or array	A query string for a search.	Yes
ranking_options	object	Ranking options for search.	No
└─ ranker	enum	Possible values: `auto`, `default-2024-11-15`	No
└─ score_threshold	number		No	0
rewrite_query	boolean	Whether to rewrite the natural language query for vector search.	No	False
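A request body matching the table above could be assembled as sketched below. `build_vector_store_search_request` is a hypothetical helper. It applies the documented defaults and the 1 to 50 range for `max_num_results`, and it assumes that an array-valued `query` is a list of strings.

```python
RANKERS = ("auto", "default-2024-11-15")


def build_vector_store_search_request(query, max_num_results=10, filters=None,
                                      ranker=None, score_threshold=None,
                                      rewrite_query=False):
    """Return a VectorStoreSearchRequest-shaped dict."""
    is_string_list = (isinstance(query, list)
                      and all(isinstance(q, str) for q in query))
    # Assumption: an array query is a list of strings.
    if not isinstance(query, str) and not is_string_list:
        raise ValueError("query must be a string or a list of strings")
    if (isinstance(max_num_results, bool)
            or not isinstance(max_num_results, int)
            or not 1 <= max_num_results <= 50):
        raise ValueError("max_num_results must be between 1 and 50 inclusive")
    if ranker is not None and ranker not in RANKERS:
        raise ValueError(f"ranker must be one of {RANKERS}")
    body = {
        "query": query,
        "max_num_results": max_num_results,
        "rewrite_query": rewrite_query,
    }
    if filters is not None:
        body["filters"] = filters
    ranking_options = {}
    if ranker is not None:
        ranking_options["ranker"] = ranker
    if score_threshold is not None:
        ranking_options["score_threshold"] = score_threshold
    if ranking_options:
        body["ranking_options"] = ranking_options
    return body
```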

VectorStoreSearchResultContentObject

Name	Type	Description	Required	Default
text	string	The text content returned from search.	Yes
type	enum	The type of content. Possible values: `text`	Yes

VectorStoreSearchResultItem

Name	Type	Description	Required
attributes	VectorStoreFileAttributes	Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters, booleans, or numbers.	Yes
content	array	Content chunks from the file.	Yes
file_id	string	The ID of the vector store file.	Yes
filename	string	The name of the vector store file.	Yes
score	number	The similarity score for the result.	Yes

VectorStoreSearchResultsPage

Name	Type	Description	Required
data	array	The list of search result items.	Yes
has_more	boolean	Indicates if there are more results to fetch.	Yes
next_page	string	The token for the next page, if any.	Yes
object	enum	The object type, which is always `vector_store.search_results.page`. Possible values: `vector_store.search_results.page`	Yes
search_query	array		Yes
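A results page nests text chunks inside each result item. The sketch below flattens one `VectorStoreSearchResultsPage` into (filename, score, text) tuples and optionally drops low-scoring items. `collect_result_text` and its `min_score` parameter are hypothetical, and the page is modeled as a plain dict.

```python
def collect_result_text(page: dict, min_score: float = 0.0):
    """Return a list of (filename, score, text) for items in one page."""
    results = []
    for item in page["data"]:
        if item["score"] < min_score:
            continue
        # Join the text of every content chunk of type "text".
        text = "".join(
            chunk["text"] for chunk in item["content"]
            if chunk.get("type") == "text"
        )
        results.append((item["filename"], item["score"], text))
    return results
```

When `has_more` is true, the same function can be applied to each following page retrieved with the `next_page` token.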

VoiceIdsShared

No properties defined for this component.

Wait

A wait action.

Name	Type	Description	Required	Default
type	enum	Specifies the event type. For a wait action, this property is always set to `wait`. Possible values: `wait`	Yes

ReasoningEffort

o-series models only

Constrains effort on reasoning for reasoning models. Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Property	Value
Description	o-series models only. Constrains effort on reasoning for reasoning models. Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Type	string
Default	medium
Values	`low` `medium` `high`

errorEvent

Emitted when an error occurs. This can happen due to an internal server error or a timeout.

Name	Type	Description	Required	Default
data	error		Yes
event	string		Yes

event Enum: ErrorEventEnum

Value	Description
error

doneEvent

Occurs when a stream ends.

Name	Type	Description	Required	Default
data	string		Yes
event	string		Yes

event Enum: DoneEventEnum

Value	Description
done

data Enum: DoneEventDataEnum

Value	Description
[DONE]
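The `errorEvent` and `doneEvent` shapes above pair an `event` name with a `data` payload. The sketch below reads such pairs from a stream of text lines and stops at the done event. It assumes the stream uses standard server-sent events framing (`event:` and `data:` lines, with a blank line ending each event), which this section does not spell out. `iter_sse_events` is a hypothetical helper.

```python
def iter_sse_events(lines):
    """Yield {"event": ..., "data": ...} dicts until the done event.

    lines is any iterable of text lines, for example a decoded HTTP
    response body read line by line.
    """
    event_name, data_lines = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if event_name is not None or data_lines:
                data = "\n".join(data_lines)
                if event_name == "done" and data == "[DONE]":
                    return
                yield {"event": event_name, "data": data}
            event_name, data_lines = None, []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    # An event that is not terminated by a blank line is discarded.
```

The `data` value is returned as a raw string. For an `errorEvent`, a caller would typically decode it further to obtain the error object.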

Next steps

Learn about Models, and fine-tuning with the REST API. Learn more about the underlying models that power Azure OpenAI.
