Speech to text REST API
Speech to text REST API is used for Batch transcription and Custom Speech.
Important
Speech to text REST API v3.1 is generally available. Version 3.0 of the Speech to text REST API will be retired. For more information, see the Migrate code from v3.0 to v3.1 of the REST API guide.
Use the Speech to text REST API for:
- Custom Speech: With Custom Speech, you can upload your own data, test and train a custom model, compare accuracy between models, and deploy a model to a custom endpoint. Copy models to other subscriptions if you want colleagues to have access to a model that you built, or if you want to deploy a model to more than one region.
- Batch transcription: Transcribe audio files as a batch from multiple URLs or an Azure container.
Speech to text REST API includes such features as:
- Get logs for each endpoint if logs have been requested for that endpoint.
- Request the manifest of the models that you create, to set up on-premises containers.
- Upload data from Azure storage accounts by using a shared access signature (SAS) URI.
- Bring your own storage. Use your own storage accounts for logs, transcription files, and other data.
- Register webhooks to receive notifications from the operations that support them.
Datasets
Datasets are applicable for Custom Speech. You can use datasets to train and test the performance of different models. For example, you can compare the performance of a model trained with a specific dataset to the performance of a model trained with a different dataset.
See Upload training and testing datasets for examples of how to upload datasets. This table includes all the operations that you can perform on datasets.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/datasets | GET | Datasets_List | GetDatasets |
/datasets | POST | Datasets_Create | CreateDataset |
/datasets/{id} | DELETE | Datasets_Delete | DeleteDataset |
/datasets/{id} | GET | Datasets_Get | GetDataset |
/datasets/{id} | PATCH | Datasets_Update | UpdateDataset |
/datasets/{id}/blocks:commit | POST | Datasets_CommitBlocks | Not applicable |
/datasets/{id}/blocks | GET | Datasets_GetBlocks | Not applicable |
/datasets/{id}/blocks | PUT | Datasets_UploadBlock | Not applicable |
/datasets/{id}/files | GET | Datasets_ListFiles | GetDatasetFiles |
/datasets/{id}/files/{fileId} | GET | Datasets_GetFile | GetDatasetFile |
/datasets/locales | GET | Datasets_ListSupportedLocales | GetSupportedLocalesForDatasets |
/datasets/upload | POST | Datasets_Upload | UploadDatasetFromForm |
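As a rough illustration of how a row in an operations table maps to an HTTP call, the sketch below builds (but does not send) requests for Datasets_List and Datasets_Delete. The host pattern, the `Ocp-Apim-Subscription-Key` header, the region, and the dataset ID are assumptions made for the example; they are not stated in this article, so check them against your own Speech resource.

```python
from urllib.request import Request

# Assumed host pattern for a regional Speech resource (not given in this article).
BASE_URL = "https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1"


def build_request(region: str, key: str, path: str, method: str = "GET") -> Request:
    """Build, without sending, a request for one path/method pair from the table."""
    url = BASE_URL.format(region=region) + path
    # Assumed authentication header; the key value here is a placeholder.
    return Request(url, method=method, headers={"Ocp-Apim-Subscription-Key": key})


# Datasets_List: GET /datasets
list_datasets = build_request("westus", "YOUR_KEY", "/datasets")

# Datasets_Delete: DELETE /datasets/{id}, with a hypothetical dataset ID
delete_dataset = build_request("westus", "YOUR_KEY", "/datasets/my-dataset-id", "DELETE")
```

Passing either object to `urllib.request.urlopen` would send it. The same helper should work for the other tables on this page, because only the path and method change.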
Endpoints
Endpoints are applicable for Custom Speech. You must deploy a custom endpoint to use a Custom Speech model.
See Deploy a model for examples of how to manage deployment endpoints. This table includes all the operations that you can perform on endpoints.
Evaluations
Evaluations are applicable for Custom Speech. You can use evaluations to compare the performance of different models. For example, you can compare the performance of a model trained with a specific dataset to the performance of a model trained with a different dataset.
See Test recognition quality and Test accuracy for examples of how to test and evaluate Custom Speech models. This table includes all the operations that you can perform on evaluations.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/evaluations | GET | Evaluations_List | GetEvaluations |
/evaluations | POST | Evaluations_Create | CreateEvaluation |
/evaluations/{id} | DELETE | Evaluations_Delete | DeleteEvaluation |
/evaluations/{id} | GET | Evaluations_Get | GetEvaluation |
/evaluations/{id} | PATCH | Evaluations_Update | UpdateEvaluation |
/evaluations/{id}/files | GET | Evaluations_ListFiles | GetEvaluationFiles |
/evaluations/{id}/files/{fileId} | GET | Evaluations_GetFile | GetEvaluationFile |
/evaluations/locales | GET | Evaluations_ListSupportedLocales | GetSupportedLocalesForEvaluations |
Health status
Health status provides insights about the overall health of the service and sub-components.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/healthstatus | GET | HealthStatus_Get | GetHealthStatus |
Models
Models are applicable for Custom Speech and Batch Transcription. You can use models to transcribe audio files. For example, you can use a model trained with a specific dataset to transcribe audio files.
See Train a model and Custom Speech model lifecycle for examples of how to train and manage Custom Speech models. This table includes all the operations that you can perform on models.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/models | GET | Models_ListCustomModels | GetModels |
/models | POST | Models_Create | CreateModel |
/models/{id}:copyto ¹ | POST | Models_CopyTo | CopyModelToSubscription |
/models/{id} | DELETE | Models_Delete | DeleteModel |
/models/{id} | GET | Models_GetCustomModel | GetModel |
/models/{id} | PATCH | Models_Update | UpdateModel |
/models/{id}/files | GET | Models_ListFiles | Not applicable |
/models/{id}/files/{fileId} | GET | Models_GetFile | Not applicable |
/models/{id}/manifest | GET | Models_GetCustomModelManifest | GetModelManifest |
/models/base | GET | Models_ListBaseModels | GetBaseModels |
/models/base/{id} | GET | Models_GetBaseModel | GetBaseModel |
/models/base/{id}/manifest | GET | Models_GetBaseModelManifest | GetBaseModelManifest |
/models/locales | GET | Models_ListSupportedLocales | GetSupportedLocalesForModels |
¹ The /models/{id}/copyto operation (includes '/') in version 3.0 is replaced by the /models/{id}:copyto operation (includes ':') in version 3.1.
Projects
Projects are applicable for Custom Speech. Custom Speech projects contain models, training and testing datasets, and deployment endpoints. Each project is specific to a locale. For example, you might create a project for English in the United States.
See Create a project for examples of how to create projects. This table includes all the operations that you can perform on projects.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/projects | GET | Projects_List | GetProjects |
/projects | POST | Projects_Create | CreateProject |
/projects/{id} | DELETE | Projects_Delete | DeleteProject |
/projects/{id} | GET | Projects_Get | GetProject |
/projects/{id} | PATCH | Projects_Update | UpdateProject |
/projects/{id}/datasets | GET | Projects_ListDatasets | GetDatasetsForProject |
/projects/{id}/endpoints | GET | Projects_ListEndpoints | GetEndpointsForProject |
/projects/{id}/evaluations | GET | Projects_ListEvaluations | GetEvaluationsForProject |
/projects/{id}/models | GET | Projects_ListModels | GetModelsForProject |
/projects/{id}/transcriptions | GET | Projects_ListTranscriptions | GetTranscriptionsForProject |
/projects/locales | GET | Projects_ListSupportedLocales | GetSupportedProjectLocales |
Transcriptions
Transcriptions are applicable for Batch Transcription. Batch transcription is used to transcribe a large amount of audio in storage. You should send multiple files per request or point to an Azure Blob Storage container with the audio files to transcribe.
See Create a transcription for examples of how to create a transcription from multiple audio files. This table includes all the operations that you can perform on transcriptions.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/transcriptions | GET | Transcriptions_List | GetTranscriptions |
/transcriptions | POST | Transcriptions_Create | CreateTranscription |
/transcriptions/{id} | DELETE | Transcriptions_Delete | DeleteTranscription |
/transcriptions/{id} | GET | Transcriptions_Get | GetTranscription |
/transcriptions/{id} | PATCH | Transcriptions_Update | UpdateTranscription |
/transcriptions/{id}/files | GET | Transcriptions_ListFiles | GetTranscriptionFiles |
/transcriptions/{id}/files/{fileId} | GET | Transcriptions_GetFile | GetTranscriptionFile |
/transcriptions/locales | GET | Transcriptions_ListSupportedLocales | GetSupportedLocalesForTranscriptions |
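The two ways of supplying audio described above (a list of file URLs, or one storage container) can be sketched as a small helper that prepares a JSON body for Transcriptions_Create. The field names `displayName`, `locale`, `contentUrls`, and `contentContainerUrl` are assumptions about the request schema that this article does not spell out, and the URLs are placeholders. Treat this as a sketch, not the definitive payload.

```python
import json


def transcription_body(display_name, locale, content_urls=None, container_url=None):
    """Prepare a request body that names exactly one audio source."""
    # Require exactly one of the two sources: individual URLs or a container.
    if bool(content_urls) == bool(container_url):
        raise ValueError("Provide exactly one of content_urls or container_url")
    body = {"displayName": display_name, "locale": locale}
    if content_urls:
        body["contentUrls"] = list(content_urls)  # assumed field name
    else:
        body["contentContainerUrl"] = container_url  # assumed field name
    return body


# Several files in one request, using placeholder URLs.
body = transcription_body(
    "My batch",
    "en-US",
    content_urls=["https://example.com/a.wav", "https://example.com/b.wav"],
)
payload = json.dumps(body)  # string to send as the POST /transcriptions body
```

Rejecting a body that names both sources, or neither, catches the mistake before any request is sent.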
Web hooks
Web hooks are applicable for Custom Speech and Batch Transcription. In particular, web hooks apply to datasets, endpoints, evaluations, models, and transcriptions. Web hooks can be used to receive notifications about creation, processing, completion, and deletion events.
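To make the resource and event combinations concrete, the sketch below enumerates one event flag per resource and lifecycle phase, then places a selection in a hypothetical WebHooks_Create body. The key naming (`transcriptionCompletion` and similar) and the `displayName`, `webUrl`, and `events` fields are assumptions for illustration. Confirm them against the API reference before relying on them.

```python
# Resources and lifecycle phases named in the paragraph above.
RESOURCES = ("dataset", "endpoint", "evaluation", "model", "transcription")
PHASES = ("Creation", "Processing", "Completion", "Deletion")


def webhook_events(resources=RESOURCES, phases=PHASES):
    """Return a flag per resource/phase pair, e.g. 'modelDeletion' (assumed naming)."""
    return {f"{resource}{phase}": True for resource in resources for phase in phases}


def webhook_body(display_name, web_url, events):
    """Assemble a hypothetical request body for registering a web hook."""
    return {"displayName": display_name, "webUrl": web_url, "events": events}


# Subscribe only to completed transcriptions; the callback URL is a placeholder.
body = webhook_body(
    "Transcription done",
    "https://example.com/hook",
    webhook_events(resources=("transcription",), phases=("Completion",)),
)
```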
This table includes all the web hook operations that are available with the Speech to text REST API.
Path | Method | Version 3.1 | Version 3.0 |
---|---|---|---|
/webhooks | GET | WebHooks_List | GetHooks |
/webhooks | POST | WebHooks_Create | CreateHook |
/webhooks/{id}:ping ¹ | POST | WebHooks_Ping | PingHook |
/webhooks/{id}:test ² | POST | WebHooks_Test | TestHook |
/webhooks/{id} | DELETE | WebHooks_Delete | DeleteHook |
/webhooks/{id} | GET | WebHooks_Get | GetHook |
/webhooks/{id} | PATCH | WebHooks_Update | UpdateHook |
¹ The /webhooks/{id}/ping operation (includes '/') in version 3.0 is replaced by the /webhooks/{id}:ping operation (includes ':') in version 3.1.
² The /webhooks/{id}/test operation (includes '/') in version 3.0 is replaced by the /webhooks/{id}:test operation (includes ':') in version 3.1.
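When migrating client code, the ping and test path change can be applied mechanically. The helper below is a minimal sketch with a hypothetical name, `to_v31_path`. It rewrites only the two web hook paths covered by the footnotes and leaves every other path untouched.

```python
import re

# Matches /webhooks/{id}/ping and /webhooks/{id}/test (version 3.0 style).
_WEBHOOK_ACTION = re.compile(r"^(/webhooks/[^/]+)/(ping|test)$")


def to_v31_path(path: str) -> str:
    """Rewrite a v3.0 web hook action path to its v3.1 form with ':'."""
    return _WEBHOOK_ACTION.sub(r"\1:\2", path)


ping_path = to_v31_path("/webhooks/abc123/ping")  # "/webhooks/abc123:ping"
same_path = to_v31_path("/webhooks/abc123")       # no action segment, so unchanged
```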