How to get speech to text session ID and transcription ID
If you use speech to text and need to open a support case, you're often asked to provide a Session ID or Transcription ID of the problematic transcriptions to debug the issue. This article explains how to get these IDs.
Note
- Session ID is used in real-time speech to text and speech translation.
- Transcription ID is used in batch transcription.
Getting Session ID
Real-time speech to text and speech translation use either the Speech SDK or the REST API for short audio.
To get the Session ID when using the Speech SDK, you need to:
- Enable application logging.
- Find the Session ID inside the log.
If you use the Speech SDK for JavaScript, get the Session ID as described in Get Session ID using JavaScript.
If you use the Speech CLI, you can also get the Session ID interactively. See details in Get Session ID using Speech CLI.
With the speech to text REST API for short audio, you need to inject the session information into the requests. See details in Provide Session ID using REST API for short audio.
Enable logging in the Speech SDK
Enable logging for your application as described in this article.
Get Session ID from the log
Open the log file that your application produced and look for the SessionId: string. The number that follows it is the Session ID you need. In the following log excerpt, 0b734c41faf8430380d493127bd44631 is the Session ID.
[874193]: 218ms SPX_DBG_TRACE_VERBOSE: audio_stream_session.cpp:1238 [0000023981752A40]CSpxAudioStreamSession::FireSessionStartedEvent: Firing SessionStarted event: SessionId: 0b734c41faf8430380d493127bd44631
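For large log files, the search can be scripted. The following is a minimal sketch in Python, assuming the Session ID always appears as 32 hexadecimal characters after the SessionId: string, as in the excerpt above. The function name find_session_id is hypothetical and isn't part of the Speech SDK.

```python
import re
from typing import Optional

# Assumption: the Session ID is 32 hexadecimal characters that follow
# "SessionId:", as in the log excerpt above.
SESSION_ID_PATTERN = re.compile(r"SessionId:\s*([0-9a-fA-F]{32})")


def find_session_id(log_text: str) -> Optional[str]:
    """Return the first Session ID found in the log text, or None."""
    match = SESSION_ID_PATTERN.search(log_text)
    return match.group(1) if match else None


log_line = (
    "[874193]: 218ms SPX_DBG_TRACE_VERBOSE: audio_stream_session.cpp:1238 "
    "[0000023981752A40]CSpxAudioStreamSession::FireSessionStartedEvent: "
    "Firing SessionStarted event: SessionId: 0b734c41faf8430380d493127bd44631"
)
print(find_session_id(log_line))
```

To scan a whole file, read its contents into a string and pass that string to the same function.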
Get Session ID using JavaScript
If you use the Speech SDK for JavaScript, you get the Session ID from the sessionStarted event of the Recognizer class.
See an example of getting the Session ID using JavaScript in this sample. Look for recognizer.sessionStarted = onSessionStarted; and then for function onSessionStarted.
Get Session ID using Speech CLI
If you use the Speech CLI, the Session ID appears in the SESSION STARTED and SESSION STOPPED console messages.
You can also enable logging for your sessions and get the Session ID from the log file as described in this section. Run the appropriate Speech CLI command to get the information on using logs:
spx help recognize log
spx help translate log
Provide Session ID using REST API for short audio
Unlike the Speech SDK, the speech to text REST API for short audio doesn't automatically generate a Session ID. You need to generate it yourself and provide it within the REST request.
Generate a GUID inside your code or by using any standard tool. Use the GUID value without dashes or other dividers. As an example, we use 9f4ffa5113a846eba289aa98b28e766f.
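One possible way to generate such a value in code is sketched below in Python, using the standard library uuid module. The function name new_session_id is hypothetical.

```python
import uuid


def new_session_id() -> str:
    """Generate a random GUID as 32 hexadecimal characters, without dashes."""
    # uuid4() creates a random GUID; .hex renders it without dividers.
    return uuid.uuid4().hex


session_id = new_session_id()
print(session_id)
```

Each call returns a different random value, so store the result if you need to report it later in a support case.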
As a part of your REST request, use the X-ConnectionId=<GUID> expression. For our example, a sample request looks like this:
https://eastus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&X-ConnectionId=9f4ffa5113a846eba289aa98b28e766f
9f4ffa5113a846eba289aa98b28e766f is your Session ID.
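The request URL can also be assembled in code. The following Python sketch builds the same URL as the sample request from a region, a language, and a Session ID. The function name build_request_url and its parameters are hypothetical, and the sketch only constructs the URL; it doesn't send the request or add authentication headers.

```python
from urllib.parse import urlencode


def build_request_url(region: str, language: str, session_id: str) -> str:
    """Build a short-audio recognition URL that carries X-ConnectionId."""
    base = (
        f"https://{region}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1"
    )
    # The Session ID travels as the X-ConnectionId query parameter.
    query = urlencode({"language": language, "X-ConnectionId": session_id})
    return f"{base}?{query}"


url = build_request_url("eastus", "en-US", "9f4ffa5113a846eba289aa98b28e766f")
print(url)
```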
Warning
The value of the X-ConnectionId parameter must be a GUID without dashes or other dividers. All other formats aren't supported and are discarded by the service.
Example: if the request contains expressions like these:
- X-ConnectionId=9f4ffa51-13a8-46eb-a289-aa98b28e766f (GUID with dividers)
- X-ConnectionId=Request9f4ffa5113a846eba289aa98b28e766f (non-GUID)
- X-ConnectionId=5948f700d2a811ee (non-GUID)
then the value of X-ConnectionId isn't accepted by the system, and the session won't be found in the logs.
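A client-side check before sending the request can catch these mistakes early. The following Python sketch assumes that "GUID without dashes" means exactly 32 hexadecimal characters; the service may apply additional rules that aren't described here. The function name is_valid_connection_id is hypothetical.

```python
import re

# Assumption: a GUID without dividers is exactly 32 hexadecimal characters.
_CONNECTION_ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def is_valid_connection_id(value: str) -> bool:
    """Return True if the value has the shape of a GUID without dividers."""
    return _CONNECTION_ID_PATTERN.fullmatch(value) is not None


candidates = [
    "9f4ffa5113a846eba289aa98b28e766f",         # GUID without dividers
    "9f4ffa51-13a8-46eb-a289-aa98b28e766f",     # GUID with dividers
    "Request9f4ffa5113a846eba289aa98b28e766f",  # non-GUID
    "5948f700d2a811ee",                         # non-GUID
]
for candidate in candidates:
    print(candidate, is_valid_connection_id(candidate))
```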
Getting Transcription ID for Batch transcription
The Batch transcription API is a subset of the Speech to text REST API.
The required Transcription ID is the GUID value contained in the main self element of the response body returned by requests like Transcriptions_Create.
The following is an example response body of a Transcriptions_Create request. The GUID value 537216f8-0620-4a10-ae2d-00bdb423b36f found in the first self element is the Transcription ID.
{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/537216f8-0620-4a10-ae2d-00bdb423b36f",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base/824bd685-2d45-424d-bb65-c3fe99e32927"
  },
  "links": {
    "files": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/537216f8-0620-4a10-ae2d-00bdb423b36f/files"
  },
  "properties": {
    "diarizationEnabled": false,
    "wordLevelTimestampsEnabled": false,
    "channels": [
      0,
      1
    ],
    "punctuationMode": "DictatedAndAutomatic",
    "profanityFilterMode": "Masked"
  },
  "lastActionDateTime": "2021-11-19T14:09:51Z",
  "status": "NotStarted",
  "createdDateTime": "2021-11-19T14:09:51Z",
  "locale": "ru-RU",
  "displayName": "transcriptiontest"
}
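If you process the response in code, the Transcription ID can be read from the top-level self element. The following Python sketch assumes the ID is the last path segment of that URL, as in the example response. The function name transcription_id_from_response is hypothetical, and the sample body is shortened to the fields the sketch uses.

```python
import json
import uuid
from urllib.parse import urlparse


def transcription_id_from_response(body: str) -> str:
    """Extract the Transcription ID from a response body given as JSON text."""
    data = json.loads(body)
    # Use the top-level "self" element, not the one nested under "model".
    last_segment = urlparse(data["self"]).path.rstrip("/").rsplit("/", 1)[-1]
    # uuid.UUID raises ValueError if the segment isn't a well-formed GUID.
    return str(uuid.UUID(last_segment))


sample_body = """
{
  "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/537216f8-0620-4a10-ae2d-00bdb423b36f",
  "model": {
    "self": "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base/824bd685-2d45-424d-bb65-c3fe99e32927"
  },
  "status": "NotStarted"
}
"""
print(transcription_id_from_response(sample_body))
```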
Note
Use the same technique to determine other IDs required for debugging issues related to custom speech, such as the ID returned when you upload a dataset by using a Datasets_Create request.
Note
You can also see all existing transcriptions and their Transcription IDs for a given Speech resource by using a Transcriptions_Get request.