Azure AI Speech

1 answer

Custom list phrase / vocabulary on batch transcriptions?

Hi, I need the ability to provide a custom list of phrases for every transcription depending on the customer who will be transcribing a file. Consequently, I need something like this …

asked

Rafael Castelo 6

commented

Christopher Parsons 0

1 answer

Is it possible to implement using NodeJS Microsoft SDK, real-time streaming and viseme events?

Hi all, I would like to know whether it is possible to implement a Microsoft SDK/NodeJS-based app for text-to-speech using real-time streaming (meaning that the server/client starts playback as soon as the first chunk is received) and having access to viseme…

asked

Stamatis Kourtis 20

commented

navba-MSFT 17,110 Microsoft Employee

0 answers

Endpoint with custom model returns different result to Speech Studio

I have created a custom model in Speech Studio that uses sample text and structured text. I have uploaded some test samples into Speech Studio and have tested the model against these samples. I then deployed the custom model as an endpoint and am…

asked

van Boheemen, Matthew 1

commented

van Boheemen, Matthew 1

1 answer

Detect and Select Microphone Input Device for the Azure Speech Recognition (Speech To Text) cloud service in Unity

Hello, After reading all the documentation and studying an example that used NAudio to detect and select audio input devices, I noticed that NAudio does not work properly in Unity. Also, I tried feeding a series of audio samples from Unity to Azure's…

asked

D4N005H 0

edited a comment

D4N005H 0

2 answers

How to get speaker identification in speech translation code (using MS Cognitive Services)?

I want to perform speaker identification in speech translation code (using MS Cognitive Services) in a way similar to the speech transcription code in the following (via accessing the SpeakerId property): …

asked

Mitch Clark 20

accepted

Mitch Clark 20

0 answers

How to gracefully handle error from Azure text to speech?

import azure.cognitiveservices.speech as speechsdk
import os
import random
import sentry_sdk
from app.common.constants import END_OF_STREAM
from app.common.utils import TimeIt, is_debug_mode, capture_exception
class AzureTTS:
    def __init__(self, …

asked

LeetGPT 60

commented

dupammi 6,315 Microsoft Vendor

1 answer

Reuse SpeechRecognizer and stream for multiple audio streams?

Hi team, is there a best practice for reusing the SpeechRecognizer when recognizing streamed user audio? In our application, we know when the user starts and stops talking, so we can signal the speech recognizer accordingly. The reason I wanted to reuse…

asked

LeetGPT 60

accepted

LeetGPT 60

0 answers

Is it possible to change speech recognition parameters in "Recognizing" or "Recognized" handlers?

Hi, I have callbacks for the Recognizing and Recognized handlers for speech recognition, and I also use keyword recognition and continuous recognition. Is it possible to update recognition parameters in those callbacks? Use case scenario is…

asked

Faris Lemes 20

commented

Faris Lemes 20

0 answers

FileNotFoundError: Could not find module 'C:\Users\ATIF ALTAF\OneDrive\Desktop\Adil\Check\.venv\lib\site-packages\azure\cognitiveservices\speech\Microsoft.CognitiveServices.Speech.core.dll' (or one of its dependencies). Try using the full path with constr

I'm trying to install the package with pip install azure-cognitiveservices-speech and use it like this: import azure.cognitiveservices.speech as speechsdk, but it gives me an error. FileNotFoundError: Could not find module 'C:\Users\ATIF…

asked

Atif Altaf 0

edited the question

VasaviLankipalle-MSFT 14,181

1 answer

How to use Azure Speech to text display text format features in Python?

Hi team, I am following this link for setting ITN, punctuation: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/display-text-format?pivots=programming-language-python However I couldn't find any related code snippet or samples in…

asked

LeetGPT 60

commented

dupammi 6,315 Microsoft Vendor

1 answer

Transcribe in real time during a Twilio phone call?

Hello, I'm able to make a call from Twilio. Once the call ends, I pass the .wav file to Azure Speech to Text, but transcribing the data seems to take a long time. Is there any way to transcribe during the phone call itself, or any other approach we can…

asked

Rakesh Indla 5

commented

Gobillion YC S21 0

1 answer

Request for Support in Developing a Neural TTS System in Uzbek Language

Dear Azure Speech Studio Support Team, I hope this message finds you well. I am writing to express my keen interest in developing a neural Text-to-Speech (TTS) system utilizing Azure Speech Studio, specifically tailored for the Uzbek language. My…

asked

Otabek Otamurodov 0

commented

santoshkc 4,100 Microsoft Vendor

1 answer

Batch text to speech: I remember the documentation saying that only some regions could use this API, but I can no longer find that restriction. Can all regions now call the batch text to speech API?

Batch text to speech: I remember reading in the documentation that only some regions could use this API, but I can no longer find the relevant restriction. Can all regions now call…

asked

佳鑫朱 0

commented

navba-MSFT 17,110 Microsoft Employee

1 answer

Persistent Issue with Azure Text-to-Speech: Missing Initial Words in Sentences

I'm encountering a recurring issue with Azure's Text-to-Speech service, where it consistently fails to include the first few words of every sentence in the generated voice output. This problem persists regardless of the specific text being synthesized.…

asked

Rukshan 0

answered

dupammi 6,315 Microsoft Vendor

1 answer

Can I use voice gallery to customize my own voice? How do I create it, how long is the production cycle, and how much does it cost?

Can I use voice gallery to customize my own voice? How do I create it, how long is the production cycle, and how much does it cost? Please show me how to make it; I want to create my own voice!

asked

#LIU CHANG# 0

edited a comment

dupammi 6,315 Microsoft Vendor

0 answers

Azure text to speech wordboundary event always returns zero for audio offset and duration

I have a callback connected to the wordboundary event, and it was working fine until a few days ago. Now the event always returns 0 for audio offset and duration, but the audio itself is fine. I'm using azure-cognitiveservices-speech 1.36.0. Problem…

asked

Matt Ma 0

commented

VasaviLankipalle-MSFT 14,181

0 answers

Illegal Invocation Error When Using Speech SDK in Cloudflare Workers Environment

I am encountering an Illegal invocation error when trying to use microsoft-cognitiveservices-speech-sdk within a Cloudflare Workers environment. The same code works as expected in a Node.js environment, but it fails when deployed to Cloudflare Workers. …

asked

SonBs 0

commented

VasaviLankipalle-MSFT 14,181

0 answers

Azure Neural TTS Web Player instance and plugin for React

Hello! After reading the post "Azure Neural TTS Web Player: let your website speak for itself", I recently sent an email to ttsplayer@microsoft.com requesting an Azure Neural TTS Web Player for my website. In the post, the author directed readers…

asked

Henry Chan 0

edited a comment

Henry Chan 0

1 answer

Pronouncing the words "Hi" and "Fine" incorrectly when using Multilingual voices

When I use Multilingual voices (like Emma Multilingual, Andrew Multilingual, Jenny Multilingual,...) in English for text-to-speech, the output is mispronouncing the single word "Hi" or "Fine". Please help me to fix it. Waiting for…

asked

Ngoc Thai Tran 0

commented

dupammi 6,315 Microsoft Vendor

2 answers

Why has my TTS suddenly become bad? Speed & punctuation aren't working properly.

This morning I tried to work on my TTS file using Brian's voice. But once I listened to the speech, the punctuation & speed weren't working properly. Also, it seems that his voice became monotone. I've tried with an already-finished project to see if…

asked

etienne Brassard 25

edited a comment

Kit Chan 5

1,392 questions with Azure AI Speech tags
