Azure AI Speech

0 answers

More details about Whisper model via Azure AI Speech

Hello, I'm trying to integrate the Whisper model via Azure AI Speech. Here are some things I already know: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/whisper-overview Whisper model via Azure OpenAI Service is available in the…

asked

YJ Kim 0

1 answer

Excessive time on Custom STT transcriptions

Hello 🙂 I've trained a custom STT model using Azure Speech Services. I'm currently testing it with REST API requests, as in the How to Recognize Speech (REST) docs. What I've noticed is that as the duration of the audio in seconds increases, so does the…

asked

Bruno Goncalves Vaz (P) 20

answered

Azar 21,645 MVP

0 answers

Speech-to-Text batch transcribe API in germanycentralwest doesn't work

Last Friday (May 31 2024) we started getting the following errors on all transcripts sent to the batch transcription API on our speech resource in…

asked

Matej the Mete 20

commented

Matej the Mete 20

0 answers

Pronunciation Assessment: Inconsistent Results

Hi, I'm experiencing very inconsistent results with the pronunciation assessment SDK for the same audio file when using different regions. I have tested the swedencentral and the westeurope regions. I tested them in different languages, and the results…

asked

Jordan C 0

commented

Jordan C 0

0 answers

How to deploy Live chat avatar based on the sample code in azure?

Hi, I followed this guide on how to set up Live Chat Avatar https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/yinhew/avatar/samples/js/browser/avatar and managed to set it up locally on my machine. Now, how exactly do I deploy it into my…

asked

Amir Basha 40

commented

santoshkc 6,710 Microsoft Vendor

2 answers

How to collect user voice in real-time from the browser and then send it to Azure Speech-to-Text via WebSocket?

I'm almost driven crazy by this problem. The audio stream I capture with MediaRecorder on Chrome only supports the webm format, while the Azure API only supports wav and ogg formats. And there is no complete example telling me how to create a support for…

asked

CodeKidz 35

answered

Kenneth Díaz González 0

0 answers

Linux App Service running FastAPI application using Azure Speech SDK doesn't produce recognition and translation results

I built a FastAPI application in a local development environment on Windows, using Python, the Azure Speech SDK, and the Azure Translation Service. The application transcribes videos and translates the text into another language as desired. It is working fine, and the…

asked

AliAzizeh 5

edited a comment

Kenneth Díaz González 0

1 answer

Azure Cognitive Speech to Text Duplicate Sentences returned on Channel 0 and Channel 1

We are developing a solution using Azure Cognitive Speech to Text service and have an issue with duplicate sentences being returned. We have some cases with dual channel audio which appear to transcribe correctly with speaker channels. We have stereo…

asked

James Nicolson 5

edited an answer

Radhika Jagtap 20

0 answers

How to use both a pronunciation file and a structured file in custom speech to text and Speech Studio?

I am using the Custom speech to text service in Microsoft's Speech Studio. I have created a project, and when uploading the data files there are different formats in which I can upload. For my project I am using multiple formats of data, and I want to use both…

asked

TR Ganesh 0

commented

dupammi 7,955 Microsoft Vendor

1 answer

Is batch transcription available in Microsoft Azure speech to text cognitive services using the Java SDK or Java REST API?

We have embedded speech (microphone) speech to text cognitive service support in Java, but I want to implement batch transcription using Microsoft Azure cognitive services in Java. Do we have Java SDK or Java REST API support for batch…

asked

Ganesh P 0

answered

Ganesh P 0

0 answers

Real time diarization, for true!

Hi, I've decided to join the Azure AI program because of this demo: https://speech.microsoft.com/portal/speechtotexttool In this demo, I can activate the microphone, set Diarization to True, and that's it! Now, when I've discovered by documentation that I…

asked

Marco Cocco 5

commented

Marco Cocco 5

0 answers

Pause and Resume Azure AI Continuous Speech to Text Recognition

Hi, I'm trying to figure out how to pause the speech recognition API while it is in continuous mode. Pretty much the same situation described…

asked

Marco Cocco 5

commented

Marco Cocco 5

0 answers

Can't preview a sound on Speech Studio

It happens on East US, S0

asked

Quill Zhou 0

commented

Quill Zhou 0

1 answer

How to prepare plain text data for speech service custom model training

Hi, I'm trying to train my custom speech-to-text model to improve its accuracy in recognizing industry-specific jargon (computer science). Q1: For example, for some domain-specific terminology like 'LinkedList' or 'HashMap', is it better to format it as it is or…

asked

hexarrior 40

commented

santoshkc 6,710 Microsoft Vendor

1 answer

What are the HW or sound limitations for the echo cancellation algorithm in the Speech SDK?

Hi, I'm having some issues with echo cancellation on my device, and I'm trying to use the Speech SDK. When I analyzed the sounds that I record with the microphone, it seems that higher harmonics are present which are 24 dB lower than the primary…

asked

Faris Lemes 70

accepted

Faris Lemes 70

0 answers

How to control audio playback when text is converted to speech using Azure Speech Service?

Below is the code I am using to convert text to audio on a button click using Azure Speech Service, but I am unable to stop the audio that is playing. I would like to use the same button to stop the audio while it is playing. How to have the…

asked

Shivani V 0

commented

dupammi 7,955 Microsoft Vendor

0 answers

How to read English words aloud syllable by syllable with text-to-speech? The purpose is to make videos for memorizing English words.

It feels like these voices are meant to optimize the reading of complete sentences, but they can't read words in detail, syllable by syllable.

asked

sxmud 0

commented

dupammi 7,955 Microsoft Vendor

1 answer

Seeking Optimal Speech Transcription Service for Mixed Chinese and English Scenarios

Our speech recognition scenario mainly involves a mix of Chinese and English. Currently, we have chosen the Chinese language recognition type (as there is no specific type for mixed Chinese and English). Besides manually adding hotwords and conducting…

asked

hexarrior 40

accepted

hexarrior 40

1 answer

Improving Speech to Text Accuracy for Industry-Specific Terminology with Azure AI Service

Hi all, I want to improve the accuracy of reading industry-specific terminology (in Japanese) using Azure AI service's Speech to Text. The challenge is that these terms can have different meanings in general contexts versus industry-specific contexts. How…

asked

KT 150

commented

KT 150

1 answer

No voice when I click the "play" button to create speech from text

There is no voice when I click the "play" button to create speech from text, even though my laptop's sound is already turned on.

asked

Grace Xiong 0 Microsoft Employee

commented

santoshkc 6,710 Microsoft Vendor

1,531 questions with Azure AI Speech tags
