LLM tool

The large language model (LLM) tool in prompt flow lets you use widely used large language models, such as those from OpenAI or Azure OpenAI Service, or any language model supported by the Azure AI model inference API, for natural language processing.

Prompt flow provides a few different large language model APIs:

  • Completion: OpenAI's completion models generate text based on provided prompts.
  • Chat: OpenAI's chat models and the Azure AI chat models facilitate interactive conversations with text-based inputs and responses.

Note

We removed the embedding option from the LLM tool API setting. To use an embedding API, use the embedding tool. Only key-based authentication is supported for the Azure OpenAI connection. Don't use non-ASCII characters in the resource group name of the Azure OpenAI resource; prompt flow doesn't support this case.

Prerequisites

Create OpenAI resources:

Connections

Set up connections to provisioned resources in prompt flow.

| Type | Name | API key | API type | API version |
|---|---|---|---|---|
| OpenAI | Required | Required | - | - |
| Azure OpenAI - API key | Required | Required | Required | Required |
| Azure OpenAI - Microsoft Entra ID | Required | - | - | Required |
| Serverless model | Required | Required | - | - |
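
As a minimal sketch of how the requirements in the table above could be checked before a connection is created, the following Python snippet encodes each row as a set of required fields. The field names (`name`, `api_key`, `api_type`, `api_version`) and the helper `missing_fields` are hypothetical names chosen for illustration; they are not part of the prompt flow SDK.

```python
# Required fields per connection type, transcribed from the connections table.
# The snake_case field names are illustrative assumptions, not SDK identifiers.
REQUIRED_FIELDS = {
    "OpenAI": {"name", "api_key"},
    "Azure OpenAI - API key": {"name", "api_key", "api_type", "api_version"},
    "Azure OpenAI - Microsoft Entra ID": {"name", "api_version"},
    "Serverless model": {"name", "api_key"},
}


def missing_fields(connection_type, settings):
    """Return the required fields that are absent or empty, sorted by name."""
    required = REQUIRED_FIELDS[connection_type]
    return sorted(field for field in required if not settings.get(field))


# Example: an Azure OpenAI key-based connection with only two fields filled in.
print(missing_fields("Azure OpenAI - API key", {"name": "my-aoai", "api_key": "<key>"}))
```

A check like this only mirrors the table; the service itself remains the authority on which fields a connection accepts.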

Inputs

The following sections show various inputs.

Text completion

| Name | Type | Description | Required |
|---|---|---|---|
| prompt | string | Text prompt for the language model. | Yes |
| model, deployment_name | string | Language model to use. | Yes |
| max_tokens | integer | Maximum number of tokens to generate in the completion. Default is 16. | No |
| temperature | float | Sampling temperature. Higher values make the generated text more random. Default is 1. | No |
| stop | list | Sequences at which the model stops generating text. Default is null. | No |
| suffix | string | Text that comes after the completion of inserted text. | No |
| top_p | float | Nucleus sampling threshold. The model samples only from the most likely tokens whose cumulative probability reaches top_p. Default is 1. | No |
| logprobs | integer | Number of most likely tokens for which to return log probabilities. Default is null. | No |
| echo | boolean | Value that indicates whether to echo back the prompt in the response. Default is false. | No |
| presence_penalty | float | Penalty applied to tokens that already appear in the text, which discourages repeating topics. Default is 0. | No |
| frequency_penalty | float | Penalty applied to tokens in proportion to how often they already appear in the text, which discourages verbatim repetition. Default is 0. | No |
| best_of | integer | Number of candidate completions to generate, from which the best one is returned. Default is 1. | No |
| logit_bias | dictionary | Map of token IDs to bias values that raise or lower the likelihood of those tokens. Default is an empty dictionary. | No |
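
To make the defaults in the text completion table concrete, here is a minimal sketch that merges caller overrides onto those defaults. The helper `build_completion_inputs` and the constant `COMPLETION_DEFAULTS` are hypothetical names for illustration only; the LLM tool applies its defaults internally, and this is not its implementation.

```python
import copy

# Defaults transcribed from the text completion inputs table.
# `suffix` has no stated default, so it is accepted only as an override.
COMPLETION_DEFAULTS = {
    "max_tokens": 16,
    "temperature": 1.0,
    "stop": None,
    "top_p": 1.0,
    "logprobs": None,
    "echo": False,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "best_of": 1,
    "logit_bias": {},
}


def build_completion_inputs(prompt, deployment_name, **overrides):
    """Combine the two required inputs with defaults and caller overrides."""
    unknown = set(overrides) - set(COMPLETION_DEFAULTS) - {"suffix"}
    if unknown:
        raise ValueError(f"Unknown inputs: {sorted(unknown)}")
    inputs = {"prompt": prompt, "deployment_name": deployment_name}
    # Deep-copy so that mutable defaults such as logit_bias are never shared.
    inputs.update(copy.deepcopy(COMPLETION_DEFAULTS))
    inputs.update(overrides)
    return inputs


inputs = build_completion_inputs("Summarize: ...", "my-deployment", max_tokens=64)
print(inputs["max_tokens"], inputs["temperature"])
```

Rejecting unknown names early is a design choice of this sketch: it surfaces typos such as `max_token` before any request is made.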

Chat

| Name | Type | Description | Required |
|---|---|---|---|
| prompt | string | Text prompt that the language model uses for a response. | Yes |
| model, deployment_name | string | Language model to use. This parameter is not required if the model is deployed to a serverless API endpoint. | Yes* |
| max_tokens | integer | Maximum number of tokens to generate in the response. Default is inf. | No |
| temperature | float | Sampling temperature. Higher values make the generated text more random. Default is 1. | No |
| stop | list | Sequences at which the model stops generating text. Default is null. | No |
| top_p | float | Nucleus sampling threshold. The model samples only from the most likely tokens whose cumulative probability reaches top_p. Default is 1. | No |
| presence_penalty | float | Penalty applied to tokens that already appear in the text, which discourages repeating topics. Default is 0. | No |
| frequency_penalty | float | Penalty applied to tokens in proportion to how often they already appear in the text, which discourages verbatim repetition. Default is 0. | No |
| logit_bias | dictionary | Map of token IDs to bias values that raise or lower the likelihood of those tokens. Default is an empty dictionary. | No |
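
The effect of `temperature` and `top_p` is easier to see on a toy example. The sketch below applies temperature scaling followed by nucleus (top-p) filtering to a small set of made-up token scores. It illustrates the standard technique under simplifying assumptions; the function name `next_token_distribution` and the example logits are hypothetical, and the hosted models' actual sampling code is not described in this article.

```python
import math


def next_token_distribution(logits, temperature=1.0, top_p=1.0):
    """Apply temperature scaling, then keep only the top_p probability mass."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    # Temperature: divide logits before the softmax; values < 1 sharpen it.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    probs = {tok: w / total for tok, w in weights.items()}
    # top_p: keep the most likely tokens until their cumulative mass reaches top_p.
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(kept.values())
    return {tok: p / norm for tok, p in kept.items()}


toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0}  # made-up scores
print(next_token_distribution(toy_logits))             # all three tokens remain
print(next_token_distribution(toy_logits, top_p=0.5))  # only the top token remains
```

Lowering `temperature` concentrates probability on the highest-scoring token, while lowering `top_p` removes the least likely tokens from consideration entirely.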

Outputs

| API | Return type | Description |
|---|---|---|
| Completion | string | Text of one predicted completion |
| Chat | string | Text of one response of conversation |

Use the LLM tool

  1. Set up and select the connections to OpenAI resources or to a serverless API endpoint.
  2. Configure the large language model API and its parameters.
  3. Prepare the prompt with guidance.