Why has my TTS suddenly become bad? Speed & punctuation aren't working properly.
This morning I tried to work on my TTS file using Brian's voice, but once I listened to the speech, the punctuation & speed weren't working properly. The voice also seems to have become monotone. I've tried with an already-finished project to see if…
Azure pronunciation assessment video input
Can I give Azure pronunciation assessment a video input?
Azure Real Time Speech To Text fails to take input from Blob URL
I have implemented Azure Real-Time Speech to Text using the Speech SDK in Python for pre-recorded audio files. It works fine when the input audio is on my machine, but fails when I give the input as the Blob URL containing the audio. Please help!
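A common cause (an assumption here, since the error isn't shown): `AudioConfig(filename=...)` in the Speech SDK expects a local file path, not an HTTP(S) URL, so passing a Blob URL fails. One workaround is to download the blob to a temporary file first. The sketch below assumes `SPEECH_KEY`/`SPEECH_REGION` environment variables; `blob_suffix` and `recognize_blob` are illustrative names, not SDK APIs.

```python
import os
import tempfile
import urllib.parse
import urllib.request


def blob_suffix(blob_url):
    """Return the audio file extension from a blob URL, ignoring any SAS query string."""
    path = urllib.parse.urlparse(blob_url).path
    return os.path.splitext(path)[1] or ".wav"


def recognize_blob(blob_url):
    """Download the blob to a temp file, then run one-shot recognition on it.

    Returns the recognized text, or None if SPEECH_KEY/SPEECH_REGION are unset
    (so the sketch stays runnable without credentials).
    """
    key = os.environ.get("SPEECH_KEY")
    region = os.environ.get("SPEECH_REGION")
    if not key or not region:
        return None

    # AudioConfig(filename=...) needs a local path, so fetch the blob first.
    # A SAS token in the URL's query string is passed through untouched.
    with tempfile.NamedTemporaryFile(suffix=blob_suffix(blob_url), delete=False) as tmp:
        local_path = tmp.name
    urllib.request.urlretrieve(blob_url, local_path)

    import azure.cognitiveservices.speech as speechsdk  # imported lazily

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=local_path)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                            audio_config=audio_config)
    result = recognizer.recognize_once_async().get()
    os.unlink(local_path)
    return result.text
```

An alternative, if downloading the whole file is undesirable, would be a `PushAudioInputStream` fed from chunked blob reads; the temp-file route above is the simplest to verify.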
Include custom audio files for keyword recognition training process
I am leveraging the Azure Keyword Recognition service; it works pretty well except for some false wakeups. We've collected a bunch of audio files that trigger false wakeups, and I was wondering whether there is some approach by which we can include these audio files in…
Azure Speech AI service Custom Commands Alternative
Hi, we are looking into using Azure AI services, especially the Speech service, to build a bot that performs certain tasks based on speech. For example, if we ask the bot to "Make a reservation for Instrument A from 9 AM to 10 AM", then the bot…
When will more avatars be available?
The Text to Speech Avatar has been in preview for about six months. Any idea when a full release will happen, and what will be in that release? Additional avatars? Adjustable clothing, hair, skin tone, …? Thanks
Abnormal timbre of the Azure AI Speech "zh-CN-XiaochenNeural" voice
Since early April, the timbre of the "Xiaochen" voice has been abnormal. At that time, attempts were made in regions such as East Asia, Southeast Asia, and East US, all of which showed the abnormality, except for the Japan…
How to get the spoken language in an audio file with the Azure Speech SDK in C#?
Hi, I need to detect the spoken language in an audio file. I have already read the documentation on language identification for the Speech service, but the SpeechRecognitionResult object doesn't contain the recognized language code. Is there a…
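In the Speech SDK the detected language is exposed through a wrapper, `AutoDetectSourceLanguageResult`, rather than directly on the recognition result. A sketch in Python (the C# API has the same shape: `AutoDetectSourceLanguageConfig` / `AutoDetectSourceLanguageResult`); the function name, candidate locales, and `SPEECH_KEY`/`SPEECH_REGION` variables are assumptions for illustration.

```python
import os


def detect_spoken_language(wav_path, candidates=("en-US", "fr-FR", "hi-IN")):
    """One-shot language identification on an audio file.

    Returns a BCP-47 code such as 'fr-FR', or None when SPEECH_KEY/SPEECH_REGION
    are not configured (so the sketch stays runnable without credentials).
    """
    key = os.environ.get("SPEECH_KEY")
    region = os.environ.get("SPEECH_REGION")
    if not key or not region:
        return None

    import azure.cognitiveservices.speech as speechsdk  # imported lazily

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    # At-start LID takes a small list of candidate locales per request.
    auto_detect = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
        languages=list(candidates))
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        auto_detect_source_language_config=auto_detect,
        audio_config=audio_config)

    result = recognizer.recognize_once_async().get()
    # The language code lives on AutoDetectSourceLanguageResult,
    # not directly on the SpeechRecognitionResult.
    return speechsdk.AutoDetectSourceLanguageResult(result).language
```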
Azure pronunciation assessment: async assessment
I'm using the Azure speech recognizer SDK to do pronunciation assessment of an audio file. The problem: when the speech is in French, the results are always low and not expressive. const language = await detectSingleSpeechLanguage(text) …
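One frequent cause of uniformly low scores (an assumption here, since the full setup isn't shown): the recognizer's language stays at its default while the audio is French, so the assessment grades French speech against the wrong locale. A Python sketch of pronunciation assessment with the locale pinned explicitly; the function name and `SPEECH_KEY`/`SPEECH_REGION` variables are illustrative, and `detectSingleSpeechLanguage` from the question is the asker's own helper, not used here.

```python
import os


def assess_pronunciation(wav_path, reference_text, locale="fr-FR"):
    """Pronunciation assessment with the recognizer locale matched to the audio.

    Returns the accuracy score (0-100), or None when SPEECH_KEY/SPEECH_REGION
    are not configured (so the sketch stays runnable without credentials).
    """
    key = os.environ.get("SPEECH_KEY")
    region = os.environ.get("SPEECH_REGION")
    if not key or not region:
        return None

    import azure.cognitiveservices.speech as speechsdk  # imported lazily

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    # If this stays at the default (en-US) while the audio is French,
    # scores tend to come out uniformly low.
    speech_config.speech_recognition_language = locale

    pa_config = speechsdk.PronunciationAssessmentConfig(
        reference_text=reference_text,
        grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
        granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme)

    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config,
                                            audio_config=audio_config)
    pa_config.apply_to(recognizer)

    result = recognizer.recognize_once_async().get()
    return speechsdk.PronunciationAssessmentResult(result).accuracy_score
```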
SSML: Using <lang xml:lang=""> within a multilingual voice sounds incorrect / unlike when used with the language-specific voice
I am developing a TTS application that pronounces "nonsense words" with specific language pronunciations. For example, I am using Polish-language voices to pronounce non-Polish words. If I use a Polish-specific voice, I hear what I expect…
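For comparison, the two setups the question contrasts look roughly like the SSML below. This is a hedged fragment, not a confirmed fix: the voice names are examples, and a multilingual voice rendering a `<lang>` span is not guaranteed to match a native language-specific voice's pronunciation.

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
  <!-- Multilingual voice: <lang> switches the speaking language,
       but the accent model may differ from a native Polish voice. -->
  <voice name="en-US-JennyMultilingualNeural">
    <lang xml:lang="pl-PL">przykładowe słowo</lang>
  </voice>
  <!-- Language-specific voice for comparison. -->
  <voice name="pl-PL-ZofiaNeural">przykładowe słowo</voice>
</speak>
```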
Retirement Announcement - Upgrade to Text-to-Speech Neural Voice on 31 August 2024
Text-to-Speech currently supports both standard and neural voices. However, since the neural voices provide more natural-sounding speech output, and thus a better end-user experience, we are retiring the standard voices on 31 August 2024, and they will…
Error while trying to train a 20240228 Whisper Large v2 baseline model
When trying to train a custom speech model using a dataset containing an audio file and its transcript, the model failed to train due to an internal error. Can anyone provide any insights on how to troubleshoot this issue?
Is there any way to dub audio while maintaining its original intonation, breaks, and speed?
I have a voice recording with a lot of deeper and higher tones and some breaks and "word emphasis" at specific moments, but when using the "Speech Translation" functionality, this audio loses all of its life (all this complexity),…
Not able to use Azure AI Speech Avatar in ReactJS
Hello, I am trying to implement a live chat avatar using ReactJS in my application. When implementing the sample code, I am getting the following console logs: "is TURN server active? yes", "Avatar started.", "Speech and avatar synthesized to video…
Azure Text to Speech F0 (Free) Tier Limits
Hi, I have the F0 (Free) tier. I send a request to the TTS service and get the blendshape data and voice. When I make requests, the first 4 get a response; the 5th one no longer returns a response. If I restart my server, I can make another 4 request…
Speech Recognition live transcription not detecting any language other than English
Hi, I am using the Speech Recognition resource in my application for live transcription. It works perfectly with English, but when I try to speak in Hindi it is not detected. I want to build my application for multiple languages used in…
zh-CN-XiaochenNeural Abnormal timbre
zh-CN-XiaochenNeural has an abnormal timbre. The same problem occurred in October last year: https://learn.microsoft.com/en-us/answers/questions/1431823/the-timbre-of-the-voice-of-zh-cn-xiaochenneural-ha How long will it take to recover…
Is it possible to stream Groq LLM responses into Azure TTS as they arrive?
Hi! I'm trying to build a real-time LLM conversation bot and need it to be as low-latency as possible. I have successfully set up TTS audio output streaming…
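One common low-latency pattern (a sketch, not the only approach): buffer the streamed token deltas into sentence-sized chunks and hand each completed sentence to the synthesizer while the LLM is still generating. The chunker below is plain Python; the commented wiring at the end assumes the azure-cognitiveservices-speech package and a `groq_token_stream` iterable of text deltas, both of which are placeholders.

```python
import re


def sentences_from_stream(token_stream):
    """Yield complete sentences as soon as they appear in a token stream.

    Buffers incoming text deltas and flushes on sentence-ending punctuation,
    so TTS can start on the first sentence before generation finishes.
    """
    buf = ""
    for token in token_stream:
        buf += token
        # Flush every complete sentence currently in the buffer.
        while True:
            m = re.search(r"[.!?](?:\s+|$)", buf)
            if not m:
                break
            sentence, buf = buf[:m.end()].strip(), buf[m.end():]
            if sentence:
                yield sentence
    tail = buf.strip()
    if tail:
        yield tail  # whatever is left when the stream ends


# Hypothetical wiring (requires azure-cognitiveservices-speech and credentials):
#
# import azure.cognitiveservices.speech as speechsdk
# speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
# synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
# for sentence in sentences_from_stream(groq_token_stream):
#     synthesizer.speak_text_async(sentence)  # queue per-sentence synthesis
```

Per-sentence synthesis trades some prosody continuity for latency; smaller chunks (clauses) cut latency further but sound choppier.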
How to have multiple mstts:audioduration in a single <speak>?
I'm trying to adjust the duration of individual phrases so that the synthesized voice matches the voice in the original audio. It works perfectly when done like this: <speak xmlns="http://www.w3.org/2001/10/synthesis"…
Do Text-to-Speech containers provide visemes and blendshapes like the API?
I'm currently using the Speech API and consuming the visemes and blendshapes that are returned. In an effort to reduce latency, I would like to run the Speech services locally via the Text-to-Speech container. Does the response of the container TTS…