Fine Tunes - Create
Creates a job that fine-tunes a specified model from a given training file. The response includes details of the enqueued job, including job status and hyperparameters. The name of the fine-tuned model is added to the response once training completes.
POST {endpoint}/openai/fine-tunes?api-version=2022-12-01
URI Parameters

Name | In | Required | Type | Description
---|---|---|---|---
endpoint | path | True | string (url) | Supported Cognitive Services endpoints (protocol and hostname, for example: https://aoairesource.openai.azure.com. Replace "aoairesource" with your Azure OpenAI account name).
api-version | query | True | string | The requested API version.
Request Header

Name | Required | Type | Description
---|---|---|---
api-key | True | string | Provide your Cognitive Services Azure OpenAI account key here.
Request Body

Name | Required | Type | Description
---|---|---|---
model | True | string | The identifier (model-id) of the base model used for this fine-tune.
training_file | True | string | The file identity (file-id) that is used for training this fine-tuned model.
batch_size | | integer | The batch size to use for training. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets. The default value as well as the maximum value for this property are specific to a base model.
classification_betas | | number[] | The classification beta values. If this is provided, we calculate F-beta scores at the specified beta values. The F-beta score is a generalization of the F-1 score. This is only used for binary classification. With a beta of 1 (i.e., the F-1 score), precision and recall are given the same weight. A larger beta puts more weight on recall and less on precision. A smaller beta puts more weight on precision and less on recall.
classification_n_classes | | integer | The number of classes in a classification task. This parameter is required for multiclass classification.
classification_positive_class | | string | The positive class in binary classification. This parameter is needed to generate precision, recall, and F-1 metrics when doing binary classification.
compute_classification_metrics | | boolean | A value indicating whether to compute classification metrics. If set, we calculate classification-specific metrics such as accuracy and F-1 score using the validation set at the end of every epoch. These metrics can be viewed in the results file. To compute classification metrics, you must provide a validation_file. Additionally, you must specify classification_n_classes for multiclass classification or classification_positive_class for binary classification.
learning_rate_multiplier | | number | The learning rate multiplier to use for training. The fine-tuning learning rate is the original learning rate used for pre-training multiplied by this value. Larger learning rates tend to perform better with larger batch sizes. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results.
n_epochs | | integer | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
prompt_loss_weight | | number | The weight to use for loss on the prompt tokens. This controls how much the model tries to learn to generate the prompt (as compared to the completion, which always has a weight of 1.0), and can add a stabilizing effect to training when completions are short. If prompts are extremely long (relative to completions), it may make sense to reduce this weight to avoid over-prioritizing learning the prompt.
suffix | | string | The suffix used to identify the fine-tuned model. The suffix can contain up to 40 characters (a-z, A-Z, 0-9, - and _) that will be added to your fine-tuned model name.
validation_file | | string | The file identity (file-id) that is used to evaluate the fine-tuned model during training.
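As a sketch, the required fields and a few optional hyperparameters from the table above can be assembled into a request body like this. The identifiers and the suffix are placeholders for illustration, not real resources:

```python
import json

# Minimal request body sketch; identifiers below are placeholders.
body = {
    # Required fields
    "model": "curie",
    "training_file": "file-181a1cbdcdcf4677ada87f63a0928099",
    # Optional hyperparameters; service defaults apply when omitted
    "n_epochs": 4,
    "learning_rate_multiplier": 0.1,
    "suffix": "my-model",
}

payload = json.dumps(body)
```

Any optional property left out of the body falls back to its model-specific default, so the smallest valid body contains only `model` and `training_file`.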
Responses

Name | Type | Description
---|---|---
201 Created | | The fine tune has been successfully created. Headers: Location (string).
Other Status Codes | | An error occurred.
Security

api-key

Provide your Cognitive Services Azure OpenAI account key here.

Type: apiKey
In: header
Examples

- Creating a fine tune job for classification.
- Creating a fine tune job.
Creating a fine tune job for classification.
Sample Request
POST https://aoairesource.openai.azure.com/openai/fine-tunes?api-version=2022-12-01

```json
{
  "compute_classification_metrics": true,
  "classification_n_classes": 4,
  "model": "curie",
  "training_file": "file-181a1cbdcdcf4677ada87f63a0928099"
}
```
Sample Response
location: https://aoairesource.openai.azure.com/openai/fine-tunes/ft-72a2792ef7d24ba7b82c7fe4a37e379f

```json
{
  "hyperparams": {
    "compute_classification_metrics": true,
    "classification_n_classes": 4,
    "batch_size": 32,
    "learning_rate_multiplier": 1,
    "n_epochs": 2,
    "prompt_loss_weight": 0.1
  },
  "model": "curie",
  "training_files": [
    {
      "statistics": {
        "tokens": 42,
        "examples": 23
      },
      "bytes": 140,
      "purpose": "fine-tune",
      "filename": "puppy.jsonl",
      "id": "file-181a1cbdcdcf4677ada87f63a0928099",
      "status": "succeeded",
      "created_at": 1646126127,
      "updated_at": 1646127311,
      "object": "file"
    }
  ],
  "id": "ft-72a2792ef7d24ba7b82c7fe4a37e379f",
  "status": "notRunning",
  "created_at": 1646126127,
  "updated_at": 1646127311,
  "object": "fine-tune"
}
```
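The classification sample above can be reproduced from Python with only the standard library. A rough sketch, assuming a placeholder resource name and key:

```python
import json
import urllib.request

# Placeholders: substitute your own resource name and API key.
endpoint = "https://aoairesource.openai.azure.com"
api_key = "YOUR-API-KEY"

url = f"{endpoint}/openai/fine-tunes?api-version=2022-12-01"
body = {
    "compute_classification_metrics": True,
    "classification_n_classes": 4,
    "model": "curie",
    "training_file": "file-181a1cbdcdcf4677ada87f63a0928099",
}
request = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"api-key": api_key, "Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) submits the job; on success the service
# answers 201 Created with the enqueued job in the body and the job URL in
# the Location header.
```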
Creating a fine tune job.
Sample Request
POST https://aoairesource.openai.azure.com/openai/fine-tunes?api-version=2022-12-01

```json
{
  "model": "curie",
  "training_file": "file-181a1cbdcdcf4677ada87f63a0928099"
}
```
Sample Response
location: https://aoairesource.openai.azure.com/openai/fine-tunes/ft-72a2792ef7d24ba7b82c7fe4a37e379f

```json
{
  "hyperparams": {
    "batch_size": 32,
    "learning_rate_multiplier": 1,
    "n_epochs": 2,
    "prompt_loss_weight": 0.1
  },
  "model": "curie",
  "training_files": [
    {
      "statistics": {
        "tokens": 42,
        "examples": 23
      },
      "bytes": 140,
      "purpose": "fine-tune",
      "filename": "puppy.jsonl",
      "id": "file-181a1cbdcdcf4677ada87f63a0928099",
      "status": "succeeded",
      "created_at": 1646126127,
      "updated_at": 1646127311,
      "object": "file"
    }
  ],
  "id": "ft-72a2792ef7d24ba7b82c7fe4a37e379f",
  "status": "notRunning",
  "created_at": 1646126127,
  "updated_at": 1646127311,
  "object": "fine-tune"
}
```
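The returned status starts at notRunning and moves through the State values defined below. A small polling sketch; the per-job retrieval endpoint is assumed from the same API family and is not defined in this section:

```python
import time

# Terminal values of the State enumeration defined under Definitions.
TERMINAL_STATES = {"succeeded", "failed", "canceled", "deleted"}

def wait_for_fine_tune(fetch_job, interval_seconds=0.0):
    """Poll a fine-tune job until it reaches a terminal state.

    fetch_job is any callable returning the fine-tune JSON as a dict,
    e.g. a GET on {endpoint}/openai/fine-tunes/{fine-tune-id}
    (retrieval endpoint assumed, not defined in this section).
    """
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATES:
            return job
        time.sleep(interval_seconds)
```

Once the returned status is succeeded, the fine_tuned_model property of the job carries the model-id to deploy for inferencing.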
Definitions

- Error
- ErrorCode
- ErrorResponse
- Event
- File
- FileStatistics
- FineTune
- FineTuneCreation
- HyperParameters
- InnerError
- InnerErrorCode
- LogLevel
- Purpose
- State
- TypeDiscriminator
Error

Name | Type | Description
---|---|---
code | ErrorCode | 
details | Error[] | The error details if available.
innererror | InnerError | 
message | string | The message of this error.
target | string | The location where the error happened if available.
ErrorCode

Name | Type | Description
---|---|---
conflict | string | The requested operation conflicts with the current resource state.
fileImportFailed | string | Import of file failed.
forbidden | string | The operation is forbidden for the current user/api key.
internalFailure | string | Internal error. Please retry.
invalidPayload | string | The request data is invalid for this operation.
itemDoesAlreadyExist | string | The item already exists.
jsonlValidationFailed | string | Validation of jsonl data failed.
notFound | string | The resource is not found.
quotaExceeded | string | Quota exceeded.
serviceUnavailable | string | The service is currently not available.
unexpectedEntityState | string | The operation cannot be executed in the current resource's state.
ErrorResponse

Name | Type | Description
---|---|---
error | Error | 
Event

Name | Type | Description
---|---|---
created_at | integer | A timestamp when this event was created (in unix epochs).
level | LogLevel | 
message | string | The message describing the event. This can be a change of state, e.g., enqueued, started, failed or completed, or other events like uploaded results.
object | TypeDiscriminator | 
File

Name | Type | Description
---|---|---
bytes | integer | The size of this file when available (can be null). File sizes larger than 2^53-1 are not supported to ensure compatibility with JavaScript integers.
created_at | integer | A timestamp when this job or item was created (in unix epochs).
error | Error | 
filename | string | The name of the file.
id | string | The identity of this item.
object | TypeDiscriminator | 
purpose | Purpose | 
statistics | FileStatistics | 
status | State | 
updated_at | integer | A timestamp when this job or item was modified last (in unix epochs).
FileStatistics

Name | Type | Description
---|---|---
examples | integer | The number of contained training examples in files of kind "fine-tune" once validation of file content is complete.
tokens | integer | The number of tokens used in prompts and completions for files of kind "fine-tune" once validation of file content is complete.
FineTune

Name | Type | Description
---|---|---
created_at | integer | A timestamp when this job or item was created (in unix epochs).
error | Error | 
events | Event[] | The events that show the progress of the fine-tune run, including queued, running and completed.
fine_tuned_model | string | The identifier (model-id) of the resulting fine-tuned model. This property is only populated for successfully completed fine-tune runs. Use this identifier to create a deployment for inferencing.
hyperparams | HyperParameters | 
id | string | The identity of this item.
model | string | The identifier (model-id) of the base model used for the fine-tune.
object | TypeDiscriminator | 
organisation_id | string | The organisation id of this fine-tune job. Unused on Azure OpenAI; compatibility for OpenAI only.
result_files | File[] | The result file identities (file-id) containing training and evaluation metrics in csv format. The file is only available for successfully completed fine-tune runs.
status | State | 
suffix | string | The suffix used to identify the fine-tuned model.
training_files | File[] | The file identities (file-id) that are used for training the fine-tuned model.
updated_at | integer | A timestamp when this job or item was modified last (in unix epochs).
user_id | string | The user id of this fine-tune job. Unused on Azure OpenAI; compatibility for OpenAI only.
validation_files | File[] | The file identities (file-id) that are used to evaluate the fine-tuned model during training.
FineTuneCreation

Name | Type | Description
---|---|---
batch_size | integer | The batch size to use for training. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets. The default value as well as the maximum value for this property are specific to a base model.
classification_betas | number[] | The classification beta values. If this is provided, we calculate F-beta scores at the specified beta values. The F-beta score is a generalization of the F-1 score. This is only used for binary classification. With a beta of 1 (i.e., the F-1 score), precision and recall are given the same weight. A larger beta puts more weight on recall and less on precision. A smaller beta puts more weight on precision and less on recall.
classification_n_classes | integer | The number of classes in a classification task. This parameter is required for multiclass classification.
classification_positive_class | string | The positive class in binary classification. This parameter is needed to generate precision, recall, and F-1 metrics when doing binary classification.
compute_classification_metrics | boolean | A value indicating whether to compute classification metrics. If set, we calculate classification-specific metrics such as accuracy and F-1 score using the validation set at the end of every epoch. These metrics can be viewed in the results file. To compute classification metrics, you must provide a validation_file. Additionally, you must specify classification_n_classes for multiclass classification or classification_positive_class for binary classification.
learning_rate_multiplier | number | The learning rate multiplier to use for training. The fine-tuning learning rate is the original learning rate used for pre-training multiplied by this value. Larger learning rates tend to perform better with larger batch sizes. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results.
model | string | The identifier (model-id) of the base model used for this fine-tune.
n_epochs | integer | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
prompt_loss_weight | number | The weight to use for loss on the prompt tokens. This controls how much the model tries to learn to generate the prompt (as compared to the completion, which always has a weight of 1.0), and can add a stabilizing effect to training when completions are short. If prompts are extremely long (relative to completions), it may make sense to reduce this weight to avoid over-prioritizing learning the prompt.
suffix | string | The suffix used to identify the fine-tuned model. The suffix can contain up to 40 characters (a-z, A-Z, 0-9, - and _) that will be added to your fine-tuned model name.
training_file | string | The file identity (file-id) that is used for training this fine-tuned model.
validation_file | string | The file identity (file-id) that is used to evaluate the fine-tuned model during training.
HyperParameters

Name | Type | Description
---|---|---
batch_size | integer | The batch size to use for training. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets. The default value as well as the maximum value for this property are specific to a base model.
classification_betas | number[] | The classification beta values. If this is provided, we calculate F-beta scores at the specified beta values. The F-beta score is a generalization of the F-1 score. This is only used for binary classification. With a beta of 1 (i.e., the F-1 score), precision and recall are given the same weight. A larger beta puts more weight on recall and less on precision. A smaller beta puts more weight on precision and less on recall.
classification_n_classes | integer | The number of classes in a classification task. This parameter is required for multiclass classification.
classification_positive_class | string | The positive class in binary classification. This parameter is needed to generate precision, recall, and F-1 metrics when doing binary classification.
compute_classification_metrics | boolean | A value indicating whether to compute classification metrics. If set, we calculate classification-specific metrics such as accuracy and F-1 score using the validation set at the end of every epoch. These metrics can be viewed in the results file. To compute classification metrics, you must provide a validation_file. Additionally, you must specify classification_n_classes for multiclass classification or classification_positive_class for binary classification.
learning_rate_multiplier | number | The learning rate multiplier to use for training. The fine-tuning learning rate is the original learning rate used for pre-training multiplied by this value. Larger learning rates tend to perform better with larger batch sizes. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results.
n_epochs | integer | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
prompt_loss_weight | number | The weight to use for loss on the prompt tokens. This controls how much the model tries to learn to generate the prompt (as compared to the completion, which always has a weight of 1.0), and can add a stabilizing effect to training when completions are short. If prompts are extremely long (relative to completions), it may make sense to reduce this weight to avoid over-prioritizing learning the prompt.
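To make the classification_betas description concrete: the F-beta score follows the standard formula F_beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall). A small sketch of that formula (not part of the API itself):

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: beta=1 weights precision and recall equally;
    beta>1 favors recall, beta<1 favors precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With beta=1 this reduces to the familiar harmonic mean of precision and recall; when recall is lower than precision, raising beta above 1 pulls the score down toward recall.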
InnerError

Name | Type | Description
---|---|---
code | InnerErrorCode | 
innererror | InnerError | 
InnerErrorCode

Name | Type | Description
---|---|---
invalidPayload | string | The request data is invalid for this operation.
LogLevel

Name | Type | Description
---|---|---
error | string | This message represents a non-recoverable issue.
info | string | This event is for information only.
warning | string | This event represents a mitigated issue.
Purpose

Name | Type | Description
---|---|---
fine-tune | string | This file contains training data for a fine tune job.
fine-tune-results | string | This file contains the results of a fine tune job.
State

Name | Type | Description
---|---|---
canceled | string | The operation has been canceled and is incomplete.
deleted | string | The entity has been deleted but may still be referenced by other entities predating the deletion.
failed | string | The operation has completed processing with a failure and cannot be further consumed.
notRunning | string | The operation has been created but is not yet queued for processing.
running | string | The operation has started to be processed.
succeeded | string | The operation has been processed successfully and is ready for consumption.
TypeDiscriminator

Name | Type | Description
---|---|---
deployment | string | This object represents a deployment.
file | string | This object represents a file.
fine-tune | string | This object represents a fine tune job.
fine-tune-event | string | This object represents an event of a fine tune job.
list | string | This object represents a list of other objects.
model | string | This object represents a model (can be a base model or a fine tune job result).