Azure AI Speech

0 answers

How to have the control over the audio playing when text is converted to speech using Azure Speech Service?

Below is the code I am using to convert text to audio on a button click using Azure Speech service, but I am unable to stop the audio once it is playing. I would like to use the same button to stop the audio while it is playing. How to have the…

asked

Shivani V 0

commented

dupammi 8,035 Microsoft Vendor

0 answers

How to read English words aloud in syllables by text-to-speech? The purpose is to make videos of memorizing English words.

It seems these voices are optimized for reading complete sentences, but they can't read individual words syllable by syllable.

asked

sxmud 0

commented

dupammi 8,035 Microsoft Vendor

1 answer

Can't preview a sound on Speech Studio

It happens on East US, S0

asked

Quill Zhou 25

answered

VasaviLankipalle-MSFT 15,946

1 answer

Seeking Optimal Speech Transcription Service for Mixed Chinese and English Scenarios

Our speech recognition scenario mainly involves a mix of Chinese and English. Currently, we have chosen the Chinese language recognition type (as there is no specific type for mixed Chinese and English). Besides manually adding hotwords and conducting…

asked

hexarrior 40

accepted

hexarrior 40

1 answer

Improving Speech to Text Accuracy for Industry-Specific Terminology with Azure AI Service

Hi all, I want to improve the accuracy of reading industry-specific terminology (in Japanese) using Azure AI service's Speech to Text. The challenge is that these terms can have different meanings in general contexts versus industry-specific contexts. How…

asked

KT 150

commented

KT 150

1 answer

No voice when I click the "play" button to create speech from text

There is no sound when I click the "play" button to create speech from text, even though my laptop's sound is already turned on.

asked

Grace Xiong 0 Microsoft Employee

commented

santoshkc 6,955 Microsoft Vendor

1 answer

How to fix an issue where my 3D Blendshapes do not align with the audio.

I'm trying to apply viseme 3D blend shapes to drive my 3D avatar. When the result is returned, the audio plays before the response's FrameIndex and BlendShape. I received event.animation and used it to set the weight for each blend shape name. However,…

asked

Ananchai Mankhong 0

commented

Ananchai Mankhong 0

1 answer

Can I use phonetic language to create perfect speech

Can I use International Phonetic Alphabet (IPA) transcription in Azure text to speech to produce near-perfect speech? If so, how?

asked

Geoff Surtees 0

edited a comment

Stefano Michieletto 0

0 answers

How can I use Whisper on Azure AI Speech

Hi, I recently switched from using the Whisper model via Azure OpenAI to using Azure AI Speech. However, I noticed that the quality of some transcriptions is worse on Azure AI Speech. The page below says that it is possible to use the Whisper model…

asked

Julian 0

commented

dupammi 8,035 Microsoft Vendor

1 answer

How to get sentence word timestamp results for real-time speech recognition?

I am using the Go SDK. This is my Go code: func (m *microsoft) Do(ctx context.Context, path string) (string, error) { defer os.Remove(path) accessKeyConfig := AccessKeyList[rand.Intn(len(AccessKeyList))] subscription := accessKeyConfig.Key region…

asked

莓草 0

commented

navba-MSFT 20,635 Microsoft Employee

1 answer

Microsoft: fix captioning by Speech Studio

The captioning functionality in the Speech Studio is an utter failure. This is typical output: I encourage Microsoft to implement the functionality that allows the user to specify the number of lines of text (typically one or two), and the maximum…

asked

Roy Jensen 40

commented

navba-MSFT 20,635 Microsoft Employee

1 answer

Create a basic voice-interactive dashboard

Hello Team, I need to create a basic voice-interactive dashboard using Azure Cognitive Services such as Speech service, CLU (Conversational Language Understanding), and Power BI. Please also suggest any other way to achieve this. It would be really helpful.

asked

Vijayakumar Elumalai 105

commented

YutongTie-MSFT 48,581

1 answer

Request to Increase Whisper Model Quota Limit

Hi Azure Community, I hope everyone is doing well. I am currently working on a project that requires a higher capacity of the Whisper model than my current Azure quota allows. I am seeking guidance on how to increase my Whisper model quota…

asked

narayanam Srinivasulu 0

commented

VasaviLankipalle-MSFT 15,946

1 answer

Azure AI Speech content filter

Hey everyone, I am using the Azure AI Speech API for real-time transcription of conversations. The problem I am facing is that the content filter recognizes words such as the German 'dick' as offensive. This might be true in English, however in German…

asked

Julian 0

commented

navba-MSFT 20,635 Microsoft Employee

1 answer

Ingesting webpage URL for the open AI web app in Azure

Hi there. In Azure OpenAI Studio, there is an option for defining a webpage URL when you add data for the app, but based on the requirements on the Microsoft website, it can only extract text from up to 20 sublinks, and I can only put one URL in it. …

asked

Jalali, Hadi 40

commented

Mansi Gusain 0

0 answers

Failed to get HTTP platform singleton instance. Error: 27

Hello! I'm working with the Azure Speech Services SDK via Python. The code worked well until I started getting blank responses. Basically, my request got cancelled; when I checked the reason, I got this: #…

asked

Vitalii Brydinskyi 0

commented

VasaviLankipalle-MSFT 15,946

0 answers

I use speech to text and want to transcribe the corresponding text, but it keeps timing out without successful recognition. Why is this happening?

This is my file; you can download it here: https://feedback.meitudata.com/public/file/yASWSTNPh2RE3Ncv.wav

asked

莓草 0

commented

VasaviLankipalle-MSFT 15,946

1 answer

Is each voice in the voice gallery based on a clone of one specific natural person or is it synthetic?

I would like to understand whether: (1) each voice in the voice gallery is based on a clone of one specific natural person, or (2) the voices are synthetic (similar to those from 11Labs Voice Design) and cannot be traced back to an individual person. Thank you!

asked

mpsb 0

commented

santoshkc 6,955 Microsoft Vendor

1 answer

Azure speech speaker differentiation

Hi, I would like to use Azure Speech to transcribe a meeting; however, I want it to differentiate between anonymous speakers, e.g. speaker A, speaker B. Is it possible to do that? Are there any samples and tutorials out there that I can just take and use?…

asked

jchoo 0

edited the question

AmaranS 3,865 Microsoft Vendor

0 answers

Is there a way to make speech service transcription faster (diarization with speakers differentiated)?

Currently, the processing time seems to be half the audio length for WAV files and a 1:1 ratio for MP4 with GStreamer. From this post, it seems half the time for a WAV file is the…

asked

kk 0

commented

santoshkc 6,955 Microsoft Vendor

1,555 questions with Azure AI Speech tags
