Transcriptions - Create

Creates a new transcription.

POST {endpoint}/speechtotext/v3.2/transcriptions

URI Parameters

| Name | In | Required | Type | Description |
| --- | --- | --- | --- | --- |
| endpoint | path | True | string | Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com). |

Request Body

| Name | Required | Type | Description |
| --- | --- | --- | --- |
| displayName | True | string | The display name of the object. |
| locale | True | string | The locale of the contained data. If language identification is used, this locale is used to transcribe speech for which no language could be detected. |
| contentContainerUrl | | string | A URL for an Azure blob container that contains the audio files. A container may hold at most 5 GB of data and at most 10,000 blobs; a single blob may be at most 2.5 GB. The container SAS must include 'r' (read) and 'l' (list) permissions. This property is not returned in a response. |
| contentUrls | | string[] | A list of content URLs for the audio files to transcribe. Up to 1,000 URLs are allowed. This property is not returned in a response. |
| customProperties | | object | The custom properties of this entity. The maximum allowed key length is 64 characters, the maximum allowed value length is 256 characters, and at most 10 entries are allowed. |
| dataset | | EntityReference | EntityReference |
| description | | string | The description of the object. |
| model | | EntityReference | EntityReference |
| project | | EntityReference | EntityReference |
| properties | | TranscriptionProperties | TranscriptionProperties |
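The customProperties limits above (keys of at most 64 characters, values of at most 256 characters, at most 10 entries) can be checked client-side before submitting a request. A minimal sketch; the helper name is ours, not part of the API:

```python
def validate_custom_properties(props: dict) -> None:
    """Client-side check of the documented customProperties limits."""
    if len(props) > 10:
        raise ValueError("customProperties allows at most 10 entries")
    for key, value in props.items():
        if len(key) > 64:
            raise ValueError(f"key too long (max 64 chars): {key!r}")
        if len(str(value)) > 256:
            raise ValueError(f"value too long (max 256 chars) for key {key!r}")
```

Failing fast locally avoids a round trip that would end in an InvalidParameterValue error.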

Responses

| Name | Type | Description |
| --- | --- | --- |
| 201 Created | Transcription | The response contains information about the entity as payload and its location as header. Headers: Location (string). |
| Other Status Codes | Error | An error occurred. |

Security

Ocp-Apim-Subscription-Key

Provide your Cognitive Services account key here.

Type: apiKey
In: header

Authorization

Provide an access token from the JWT returned by the STS of this region. Make sure to add the management scope to the token by adding the following query string to the STS URL: ?scope=speechservicesmanagement

Type: apiKey
In: header
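As a sketch, a request using subscription-key authentication might be assembled as follows. The endpoint, key, and helper name are placeholders of ours; any HTTP client can send the result:

```python
import json

def build_create_request(endpoint: str, key: str, body: dict):
    """Assemble URL, headers, and JSON body for the create-transcription call."""
    url = f"{endpoint}/speechtotext/v3.2/transcriptions"
    headers = {
        # Alternatively, send "Authorization: Bearer <token>" with an STS token.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(body)

url, headers, payload = build_create_request(
    "https://westus.api.cognitive.microsoft.com",  # placeholder region endpoint
    "<your-key>",
    {
        "contentUrls": ["https://contoso.com/mystoragelocation"],
        "locale": "en-US",
        "displayName": "My transcription",
    },
)
# POST with any HTTP client, e.g. requests.post(url, headers=headers, data=payload)
```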

Examples

Create a transcription for URIs
Create a transcription from blob container
Create a transcription with basic two-speaker diarization
Create a transcription with language identification
Create a transcription with multispeaker diarization

Create a transcription for URIs

Sample request

POST {endpoint}/speechtotext/v3.2/transcriptions

{
  "contentUrls": [
    "https://contoso.com/mystoragelocation",
    "https://contoso.com/myotherstoragelocation"
  ],
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked"
  },
  "locale": "en-US",
  "displayName": "Transcription using default model for en-US"
}

Sample response

{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683",
  "model": {
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
  },
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT42S"
  },
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "status": "Succeeded",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "locale": "en-US",
  "displayName": "Transcription using adapted model en-US",
  "customProperties": {
    "key": "value"
  }
}
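Because transcription is a long-running operation, clients typically poll the `self` URL from the 201 response (or the Location header) until the status becomes terminal. A hedged sketch; `get_json` stands in for any authenticated GET that returns the parsed response body:

```python
import time

TERMINAL = {"Succeeded", "Failed"}  # Status enum: NotStarted, Running, Succeeded, Failed

def is_terminal(status: str) -> bool:
    """Return True once the long-running operation has finished."""
    return status in TERMINAL

def poll(get_json, url, interval=5.0, timeout=3600.0):
    """Poll the transcription entity until it reaches a terminal status.

    `get_json` is any callable that GETs `url` with auth headers and
    returns the parsed JSON body (e.g. using the requests package).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        doc = get_json(url)
        if is_terminal(doc["status"]):
            return doc
        time.sleep(interval)
    raise TimeoutError(f"transcription at {url} did not finish in time")
```

Once the status is Succeeded, the result files are listed under the `links.files` URL.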

Create a transcription from blob container

Sample request

POST {endpoint}/speechtotext/v3.2/transcriptions

{
  "contentContainerUrl": "https://customspeech-usw.blob.core.windows.net/artifacts/audiofiles/",
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked"
  },
  "locale": "en-US",
  "displayName": "Transcription of storage container using default model for en-US"
}

Sample response

Location: https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683
{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683",
  "model": {
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
  },
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT42S"
  },
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "status": "Succeeded",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "locale": "en-US",
  "displayName": "Transcription using adapted model en-US",
  "customProperties": {
    "key": "value"
  }
}
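The container SAS in `contentContainerUrl` must carry read and list permissions. The `sp` (signed permissions) query parameter of a SAS URL can be checked up front with the standard library; the helper name is ours:

```python
from urllib.parse import urlparse, parse_qs

def sas_has_read_and_list(container_url: str) -> bool:
    """Check that a SAS URL's sp (signed permissions) includes 'r' and 'l'."""
    query = parse_qs(urlparse(container_url).query)
    permissions = query.get("sp", [""])[0]
    return "r" in permissions and "l" in permissions
```

A URL failing this check would lead to an InaccessibleCustomerStorage or InvalidPermissions error from the service.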

Create a transcription with basic two-speaker diarization

Sample request

POST {endpoint}/speechtotext/v3.2/transcriptions

{
  "contentUrls": [
    "https://contoso.com/mystoragelocation"
  ],
  "properties": {
    "diarizationEnabled": true,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked"
  },
  "locale": "en-US",
  "displayName": "Transcription using basic two-speaker diarization"
}

Sample response

{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683",
  "model": {
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
  },
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files"
  },
  "properties": {
    "diarizationEnabled": true,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT42S"
  },
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "status": "Succeeded",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "locale": "en-US",
  "displayName": "Transcription using basic two-speaker diarization",
  "customProperties": {
    "key": "value"
  }
}

Create a transcription with language identification

Sample request

POST {endpoint}/speechtotext/v3.2/transcriptions

{
  "contentUrls": [
    "https://contoso.com/mystoragelocation"
  ],
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "languageIdentification": {
      "mode": "Single",
      "candidateLocales": [
        "fr-FR",
        "nl-NL",
        "el-GR"
      ],
      "speechModelMapping": {
        "nl-NL": {
          "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
        }
      }
    }
  },
  "locale": "fr-FR",
  "displayName": "Transcription using language identification with three candidate languages, 'fr-FR' as fallback locale and a custom model for transcribing utterances that were classified as 'nl-NL' locale."
}

Sample response

{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683",
  "model": {
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
  },
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT42S",
    "languageIdentification": {
      "mode": "Single",
      "candidateLocales": [
        "fr-FR",
        "nl-NL",
        "el-GR"
      ],
      "speechModelMapping": {
        "nl-NL": {
          "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
        }
      }
    }
  },
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "status": "Succeeded",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "locale": "fr-FR",
  "displayName": "Transcription using language identification with three candidate languages, 'fr-FR' as fallback locale and a custom model for transcribing utterances that were classified as 'nl-NL' locale.",
  "customProperties": {
    "key": "value"
  }
}

Create a transcription with multispeaker diarization

Sample request

POST {endpoint}/speechtotext/v3.2/transcriptions

{
  "contentUrls": [
    "https://contoso.com/mystoragelocation"
  ],
  "properties": {
    "diarizationEnabled": true,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "diarization": {
      "speakers": {
        "minCount": 3,
        "maxCount": 5
      }
    }
  },
  "locale": "en-US",
  "displayName": "Transcription using diarization for audio that is known to contain speech from 3-5 speakers"
}

Sample response

{
  "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683",
  "model": {
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/models/827712a5-f942-4997-91c3-7c6cde35600b"
  },
  "links": {
    "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions/ba7ea6f5-3065-40b7-b49a-a90f48584683/files"
  },
  "properties": {
    "diarizationEnabled": true,
    "wordLevelTimestampsEnabled": false,
    "displayFormWordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked",
    "duration": "PT42S",
    "diarization": {
      "speakers": {
        "minCount": 3,
        "maxCount": 5
      }
    }
  },
  "lastActionDateTime": "2019-01-07T11:36:07Z",
  "status": "Succeeded",
  "createdDateTime": "2019-01-07T11:34:12Z",
  "locale": "en-US",
  "displayName": "Transcription using diarization for audio that is known to contain speech from 3-5 speakers",
  "customProperties": {
    "key": "value"
  }
}

Definitions

DetailedErrorCode
DiarizationProperties
DiarizationSpeakersProperties
EntityError
EntityReference
Error
ErrorCode
InnerError
LanguageIdentificationMode
LanguageIdentificationProperties
ProfanityFilterMode
PunctuationMode
Status
Transcription
TranscriptionLinks
TranscriptionProperties

DetailedErrorCode

| Name | Type | Description |
| --- | --- | --- |
| DataImportFailed | string | Data import failed. |
| DeleteNotAllowed | string | Delete not allowed. |
| DeployNotAllowed | string | Deploy not allowed. |
| DeployingFailedModel | string | Deploying failed model. |
| EmptyRequest | string | Empty request. |
| EndpointCannotBeDefault | string | Endpoint cannot be default. |
| EndpointNotUpdatable | string | Endpoint not updatable. |
| EndpointWithoutLogging | string | Endpoint without logging. |
| ExceededNumberOfRecordingsUris | string | Exceeded number of recordings URIs. |
| FailedDataset | string | Failed dataset. |
| Forbidden | string | Forbidden. |
| InUseViolation | string | In use violation. |
| InaccessibleCustomerStorage | string | Inaccessible customer storage. |
| InvalidAdaptationMapping | string | Invalid adaptation mapping. |
| InvalidBaseModel | string | Invalid base model. |
| InvalidCallbackUri | string | Invalid callback URI. |
| InvalidChannels | string | Invalid channels. |
| InvalidCollection | string | Invalid collection. |
| InvalidDataset | string | Invalid dataset. |
| InvalidDocument | string | Invalid document. |
| InvalidDocumentBatch | string | Invalid document batch. |
| InvalidLocale | string | Invalid locale. |
| InvalidLogDate | string | Invalid log date. |
| InvalidLogEndTime | string | Invalid log end time. |
| InvalidLogId | string | Invalid log ID. |
| InvalidLogStartTime | string | Invalid log start time. |
| InvalidModel | string | Invalid model. |
| InvalidModelUri | string | Invalid model URI. |
| InvalidParameter | string | Invalid parameter. |
| InvalidParameterValue | string | Invalid parameter value. |
| InvalidPayload | string | Invalid payload. |
| InvalidPermissions | string | Invalid permissions. |
| InvalidPrerequisite | string | Invalid prerequisite. |
| InvalidProductId | string | Invalid product ID. |
| InvalidProject | string | Invalid project. |
| InvalidProjectKind | string | Invalid project kind. |
| InvalidRecordingsUri | string | Invalid recordings URI. |
| InvalidRequestBodyFormat | string | Invalid request body format. |
| InvalidSasValidityDuration | string | Invalid SAS validity duration. |
| InvalidSkipTokenForLogs | string | Invalid skip token for logs. |
| InvalidSourceAzureResourceId | string | Invalid source Azure resource ID. |
| InvalidSubscription | string | Invalid subscription. |
| InvalidTest | string | Invalid test. |
| InvalidTimeToLive | string | Invalid time to live. |
| InvalidTopForLogs | string | Invalid top for logs. |
| InvalidTranscription | string | Invalid transcription. |
| InvalidWebHookEventKind | string | Invalid web hook event kind. |
| MissingInputRecords | string | Missing input records. |
| ModelCopyAuthorizationExpired | string | Expired model copy authorization. |
| ModelDeploymentNotCompleteState | string | Model deployment not in complete state. |
| ModelDeprecated | string | Model deprecated. |
| ModelExists | string | Model exists. |
| ModelMismatch | string | Model mismatch. |
| ModelNotDeployable | string | Model not deployable. |
| ModelVersionIncorrect | string | Model version incorrect. |
| NoUtf8WithBom | string | No UTF-8 with BOM. |
| OnlyOneOfUrlsOrContainerOrDataset | string | Only one of URLs, container, or dataset is allowed. |
| ProjectGenderMismatch | string | Project gender mismatch. |
| QuotaViolation | string | Quota violation. |
| SingleDefaultEndpoint | string | Single default endpoint. |
| SkuLimitsExist | string | SKU limits exist. |
| SubscriptionNotFound | string | Subscription not found. |
| UnexpectedError | string | Unexpected error. |
| UnsupportedClassBasedAdaptation | string | Unsupported class-based adaptation. |
| UnsupportedDelta | string | Unsupported delta. |
| UnsupportedDynamicConfiguration | string | Unsupported dynamic configuration. |
| UnsupportedFilter | string | Unsupported filter. |
| UnsupportedLanguageCode | string | Unsupported language code. |
| UnsupportedOrderBy | string | Unsupported order by. |
| UnsupportedPagination | string | Unsupported pagination. |
| UnsupportedTimeRange | string | Unsupported time range. |

DiarizationProperties

| Name | Type | Description |
| --- | --- | --- |
| speakers | DiarizationSpeakersProperties | DiarizationSpeakersProperties |

DiarizationSpeakersProperties

| Name | Type | Description |
| --- | --- | --- |
| maxCount | integer | The maximum number of speakers for diarization. Must be less than 36 and greater than or equal to the minCount property. |
| minCount | integer | A hint for the minimum number of speakers for diarization. Must be less than or equal to the maxCount property. |
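The constraints above can be validated before a request is sent. A minimal sketch; the helper name is ours:

```python
def validate_speakers(min_count: int, max_count: int) -> None:
    """Enforce the documented DiarizationSpeakersProperties constraints."""
    if not 1 <= min_count <= max_count:
        raise ValueError("minCount must be >= 1 and <= maxCount")
    if max_count >= 36:
        raise ValueError("maxCount must be less than 36")
```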

EntityError

| Name | Type | Description |
| --- | --- | --- |
| code | string | The code of this error. |
| message | string | The message for this error. |

EntityReference

| Name | Type | Description |
| --- | --- | --- |
| self | string | The location of the referenced entity. |

Error

| Name | Type | Description |
| --- | --- | --- |
| code | ErrorCode | High-level error codes. |
| details | Error[] | Additional supportive details regarding the error and/or expected policies. |
| innerError | InnerError | New inner error format, conforming to the Cognitive Services API Guidelines (https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow). It contains the required properties code and message, and the optional properties target, details (key-value pairs), and innerError (which can be nested). |
| message | string | High-level error message. |
| target | string | The source of the error. For example, it would be "documents" or "document id" in the case of an invalid document. |

ErrorCode

| Name | Type | Description |
| --- | --- | --- |
| Conflict | string | Representing the conflict error code. |
| Forbidden | string | Representing the forbidden error code. |
| InternalCommunicationFailed | string | Representing the internal communication failed error code. |
| InternalServerError | string | Representing the internal server error code. |
| InvalidArgument | string | Representing the invalid argument error code. |
| InvalidRequest | string | Representing the invalid request error code. |
| NotAllowed | string | Representing the not allowed error code. |
| NotFound | string | Representing the not found error code. |
| PipelineError | string | Representing the pipeline error code. |
| ServiceUnavailable | string | Representing the service unavailable error code. |
| TooManyRequests | string | Representing the too many requests error code. |
| Unauthorized | string | Representing the unauthorized error code. |
| UnprocessableEntity | string | Representing the unprocessable entity error code. |
| UnsupportedMediaType | string | Representing the unsupported media type error code. |

InnerError

| Name | Type | Description |
| --- | --- | --- |
| code | DetailedErrorCode | Detailed error code enum. |
| details | object | Additional supportive details regarding the error and/or expected policies. |
| innerError | InnerError | New inner error format, conforming to the Cognitive Services API Guidelines (https://microsoft.sharepoint.com/%3Aw%3A/t/CognitiveServicesPMO/EUoytcrjuJdKpeOKIK_QRC8BPtUYQpKBi8JsWyeDMRsWlQ?e=CPq8ow). It contains the required properties code and message, and the optional properties target, details (key-value pairs), and innerError (which can be nested). |
| message | string | High-level error message. |
| target | string | The source of the error. For example, it would be "documents" or "document id" in the case of an invalid document. |

LanguageIdentificationMode

| Name | Type | Description |
| --- | --- | --- |
| Continuous | string | Continuous language identification (default). |
| Single | string | Single language identification. |

LanguageIdentificationProperties

| Name | Type | Default value | Description |
| --- | --- | --- | --- |
| candidateLocales | string[] | | The candidate locales for language identification (for example ["en-US", "de-DE", "es-ES"]). A minimum of 2 and a maximum of 10 candidate locales, including the main locale for the transcription, are supported for continuous mode. For single language identification, the maximum number of candidate locales is unbounded. |
| mode | LanguageIdentificationMode | Continuous | The mode used for language identification. |
| speechModelMapping | object (map of string to EntityReference) | | An optional mapping of locales to speech model entities. If no model is given for a locale, the default base model is used. Keys must be locales contained in the candidate locales; values are entities for models of the respective locales. |

ProfanityFilterMode

| Name | Type | Description |
| --- | --- | --- |
| Masked | string | Mask profanity with asterisks except for the first letter, e.g., f***. |
| None | string | Disable profanity filtering. |
| Removed | string | Remove profanity. |
| Tags | string | Surround profanity with "profanity" XML tags. |

PunctuationMode

| Name | Type | Description |
| --- | --- | --- |
| Automatic | string | Automatic punctuation. |
| Dictated | string | Dictated punctuation marks only, i.e., explicit punctuation. |
| DictatedAndAutomatic | string | Dictated punctuation marks or automatic punctuation. |
| None | string | No punctuation. |

Status

| Name | Type | Description |
| --- | --- | --- |
| Failed | string | The long-running operation has failed. |
| NotStarted | string | The long-running operation has not yet started. |
| Running | string | The long-running operation is currently processing. |
| Succeeded | string | The long-running operation has successfully completed. |

Transcription

| Name | Type | Description |
| --- | --- | --- |
| contentContainerUrl | string | A URL for an Azure blob container that contains the audio files. A container may hold at most 5 GB of data and at most 10,000 blobs; a single blob may be at most 2.5 GB. The container SAS must include 'r' (read) and 'l' (list) permissions. This property is not returned in a response. |
| contentUrls | string[] | A list of content URLs for the audio files to transcribe. Up to 1,000 URLs are allowed. This property is not returned in a response. |
| createdDateTime | string | The timestamp when the object was created, encoded in the ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations). |
| customProperties | object | The custom properties of this entity. The maximum allowed key length is 64 characters, the maximum allowed value length is 256 characters, and at most 10 entries are allowed. |
| dataset | EntityReference | EntityReference |
| description | string | The description of the object. |
| displayName | string | The display name of the object. |
| lastActionDateTime | string | The timestamp when the current status was entered, encoded in the ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ", see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations). |
| links | TranscriptionLinks | TranscriptionLinks |
| locale | string | The locale of the contained data. If language identification is used, this locale is used to transcribe speech for which no language could be detected. |
| model | EntityReference | EntityReference |
| project | EntityReference | EntityReference |
| properties | TranscriptionProperties | TranscriptionProperties |
| self | string | The location of this entity. |
| status | Status | Describes the current state of the operation. |

TranscriptionLinks

| Name | Type | Description |
| --- | --- | --- |
| files | string | The location to get all files of this entity. See operation "Transcriptions_ListFiles" for more details. |

TranscriptionProperties

| Name | Type | Description |
| --- | --- | --- |
| channels | integer[] | A collection of the requested channel numbers. By default, channels 0 and 1 are considered. |
| destinationContainerUrl | string | The requested destination container. Remarks: When a destination container is used in combination with a timeToLive, the metadata of a transcription is deleted normally, but the data stored in the destination container, including transcription results, remains untouched, because no delete permissions are required for this container. To support automatic cleanup, either configure blob lifetimes on the container, or use "Bring your own Storage (BYOS)" instead of destinationContainerUrl, where blobs can be cleaned up. |
| diarization | DiarizationProperties | DiarizationProperties |
| diarizationEnabled | boolean | A value indicating whether diarization (speaker identification) is requested. The default value is false. If set to true and the improved diarization system is configured by specifying DiarizationProperties, the improved diarization system provides diarization for a configurable range of speakers. If set to true without specifying DiarizationProperties, the basic diarization system distinguishes up to two speakers; no extra charges are applied for basic diarization. The basic diarization system is deprecated and will be removed in the next major version of the API, together with this diarizationEnabled setting. |
| displayFormWordLevelTimestampsEnabled | boolean | A value indicating whether word-level timestamps for the display form are requested. The default value is false. |
| duration | string | The duration of the transcription, encoded as an ISO 8601 duration ("PnYnMnDTnHnMnS", see https://en.wikipedia.org/wiki/ISO_8601#Durations). |
| email | string | The email address to send a notification to when the operation completes. The value is removed after the email has been sent successfully. |
| error | EntityError | EntityError |
| languageIdentification | LanguageIdentificationProperties | LanguageIdentificationProperties |
| profanityFilterMode | ProfanityFilterMode | Mode of profanity filtering. |
| punctuationMode | PunctuationMode | The mode used for punctuation. |
| timeToLive | string | How long the transcription is kept in the system after it has completed. Once the transcription reaches the time to live after completion (successful or failed), it is automatically deleted. Not setting this value, or setting it to 0, disables automatic deletion. The longest supported duration is 31 days. Encoded as an ISO 8601 duration ("PnYnMnDTnHnMnS", see https://en.wikipedia.org/wiki/ISO_8601#Durations). |
| wordLevelTimestampsEnabled | boolean | A value indicating whether word-level timestamps are requested. The default value is false. |
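The duration and timeToLive values are ISO 8601 durations such as "PT42S" or "P31D". A small stdlib parser sketch covering the day/time subset the service emits (years and months are omitted since they have no fixed length):

```python
import re

# Matches durations like "PT42S", "P31D", or "PT1H2M3S".
_DURATION = re.compile(
    r"^P(?:(?P<d>\d+)D)?"
    r"(?:T(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?)?$"
)

def duration_seconds(value: str) -> float:
    """Convert an ISO 8601 duration such as 'PT42S' or 'P31D' to seconds."""
    m = _DURATION.match(value)
    if not m:
        raise ValueError(f"unsupported ISO 8601 duration: {value!r}")
    d, h, mi, s = (m.group(g) for g in ("d", "h", "m", "s"))
    if not any((d, h, mi, s)):
        raise ValueError(f"empty ISO 8601 duration: {value!r}")
    return (int(d or 0) * 86400 + int(h or 0) * 3600
            + int(mi or 0) * 60 + float(s or 0))
```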