LLM tool
The large language model (LLM) tool in prompt flow enables you to take advantage of widely used large language models such as OpenAI or Azure OpenAI Service, or any language model supported by the Azure AI model inference API, for natural language processing.
Prompt flow provides a few different large language model APIs:
- Completion: OpenAI's completion models generate text based on provided prompts.
- Chat: OpenAI's chat models and the Azure AI chat models facilitate interactive conversations with text-based inputs and responses.
Note
We removed the embedding option from the LLM tool API setting. You can use an embedding API with the embedding tool.
Only key-based authentication is supported for the Azure OpenAI connection.
Don't use non-ASCII characters in the resource group name of the Azure OpenAI resource; prompt flow doesn't support them.
Prerequisites
Create OpenAI resources:
OpenAI:
- Sign up for an account on the OpenAI website.
- Sign in and find your personal API key.
Azure OpenAI:
- Create Azure OpenAI resources with these instructions.
Models deployed to serverless API endpoints
- Create an endpoint with the model from the catalog that you're interested in, and deploy it with a serverless API endpoint.
- To use models deployed to serverless API endpoints supported by the Azure AI model inference API, such as Mistral, Cohere, Meta Llama, or the Microsoft family of models (among others), you need to create a connection in your project to your endpoint.
Connections
Set up connections to provisioned resources in prompt flow.
| Type | Name | API key | API type | API version |
|---|---|---|---|---|
| OpenAI | Required | Required | - | - |
| Azure OpenAI - API key | Required | Required | Required | Required |
| Azure OpenAI - Microsoft Entra ID | Required | - | - | Required |
| Serverless model | Required | Required | - | - |
Tip
- To use the Microsoft Entra ID auth type for an Azure OpenAI connection, you need to assign either the Cognitive Services OpenAI User or Cognitive Services OpenAI Contributor role to the user or user-assigned managed identity.
- Learn more about how to specify a user identity to submit a flow run.
- Learn more about how to configure Azure OpenAI Service with managed identities.
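If you prefer to create connections programmatically, the open-source prompt flow SDK exposes connection entities and a client for managing them. The following is a minimal sketch, assuming the promptflow Python package and a local PFClient; the connection name, endpoint, key, and API version are placeholders.

```python
# Minimal sketch: create an Azure OpenAI connection with the open-source promptflow SDK.
# Assumes `pip install promptflow`. All values below are placeholders; use your own resource details.
from promptflow import PFClient
from promptflow.entities import AzureOpenAIConnection

connection = AzureOpenAIConnection(
    name="my_azure_open_ai_connection",                    # name the LLM tool will reference
    api_key="<your-api-key>",                              # key-based authentication
    api_base="https://<your-resource>.openai.azure.com/",  # your Azure OpenAI endpoint
    api_type="azure",
    api_version="<api-version>",                           # a supported Azure OpenAI API version
)

pf = PFClient()
pf.connections.create_or_update(connection)  # persist the connection for local flows
print(pf.connections.get("my_azure_open_ai_connection"))
```

The LLM tool then refers to this connection by its name when you configure a node.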
Inputs
The following sections describe the inputs that each API accepts.
Text completion
| Name | Type | Description | Required |
|---|---|---|---|
| prompt | string | Text prompt for the language model. | Yes |
| model, deployment_name | string | Language model to use. | Yes |
| max_tokens | integer | Maximum number of tokens to generate in the completion. Default is 16. | No |
| temperature | float | Randomness of the generated text. Default is 1. | No |
| stop | list | Stopping sequence for the generated text. Default is null. | No |
| suffix | string | Text appended to the end of the completion. | No |
| top_p | float | Probability of using the top choice from the generated tokens. Default is 1. | No |
| logprobs | integer | Number of log probabilities to generate. Default is null. | No |
| echo | boolean | Value that indicates whether to echo back the prompt in the response. Default is false. | No |
| presence_penalty | float | Value that controls the model's behavior for repeating phrases. Default is 0. | No |
| frequency_penalty | float | Value that controls the model's behavior for generating rare phrases. Default is 0. | No |
| best_of | integer | Number of best completions to generate. Default is 1. | No |
| logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
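The LLM tool passes these parameters through to the underlying completion API. As a reference for what the parameters mean (not a description of the tool's internals), a direct call with the openai Python package that uses the defaults from the table might look like the following sketch; the model name and prompt are placeholders.

```python
# Reference sketch only: how the text completion parameters map onto a direct
# OpenAI completions call. The model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",   # placeholder completion model
    prompt="Write a tagline for an ice cream shop.",
    max_tokens=16,          # default from the table above
    temperature=1,
    stop=None,
    suffix=None,
    top_p=1,
    logprobs=None,
    echo=False,
    presence_penalty=0,
    frequency_penalty=0,
    best_of=1,
    logit_bias={},
)
print(response.choices[0].text)  # text of one predicted completion
```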
Chat
| Name | Type | Description | Required |
|---|---|---|---|
| prompt | string | Text prompt that the language model uses for a response. | Yes |
| model, deployment_name | string | Language model to use. This parameter is not required if the model is deployed to a serverless API endpoint. | Yes* |
| max_tokens | integer | Maximum number of tokens to generate in the response. Default is inf. | No |
| temperature | float | Randomness of the generated text. Default is 1. | No |
| stop | list | Stopping sequence for the generated text. Default is null. | No |
| top_p | float | Probability of using the top choice from the generated tokens. Default is 1. | No |
| presence_penalty | float | Value that controls the model's behavior for repeating phrases. Default is 0. | No |
| frequency_penalty | float | Value that controls the model's behavior for generating rare phrases. Default is 0. | No |
| logit_bias | dictionary | Logit bias for the language model. Default is an empty dictionary. | No |
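The chat parameters correspond to the underlying chat completion API in the same way. The following is another reference sketch with the openai Python package, again with a placeholder model and messages; models deployed to serverless API endpoints are reached through the connection's endpoint instead.

```python
# Reference sketch only: how the chat parameters map onto a direct
# OpenAI chat completions call. The model name and messages are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",    # placeholder chat model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Suggest a name for a new pet store."},
    ],
    max_tokens=256,         # the tool's default is unlimited (inf)
    temperature=1,
    stop=None,
    top_p=1,
    presence_penalty=0,
    frequency_penalty=0,
    logit_bias={},
)
print(response.choices[0].message.content)  # text of one response in the conversation
```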
Outputs
| API | Return type | Description |
|---|---|---|
| Completion | string | The text of one predicted completion. |
| Chat | string | The text of one response in the conversation. |
Use the LLM tool
- Set up and select the connections to OpenAI resources or to a serverless API endpoint.
- Configure the large language model API and its parameters.
- Prepare the prompt with guidance, as illustrated in the sketch that follows.
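For the chat API, the prompt is a Jinja2 template in which role lines (for example, system: and user:) separate the messages and {{placeholders}} are filled from the node's inputs. The following standalone sketch renders such a template with the jinja2 package; the template text and the question value are illustrative only.

```python
# Illustrative sketch: rendering a chat-style Jinja2 prompt template like the
# ones used by the LLM tool. Role lines (system:/user:) separate the messages,
# and {{question}} is a placeholder filled from the flow's inputs.
from jinja2 import Template

chat_template = """system:
You are a helpful assistant that answers questions about prompt flow.

user:
{{question}}
"""

rendered = Template(chat_template).render(
    question="How do I connect the LLM tool to Azure OpenAI?"
)
print(rendered)
```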