Troubleshooting AudioConfig and AudioProcessingOptions Compatibility Issues

longfan an 0 Reputation points
2023-11-17T09:20:32.4333333+00:00

When I attempted the following code, I noticed that AudioConfig and AudioProcessingOptions don't seem to work together. The code runs smoothly when I remove AudioProcessingOptions, but as soon as I add it back, the code stops functioning and freezes. I'm wondering if there's an issue with the way I'm using them. I've consulted the relevant documentation (insert documentation link), but couldn't find any discrepancies. Yet, my code doesn't operate as expected. I would greatly appreciate any guidance on this matter.

SpeechConfig speechConfig = SpeechConfig.fromSubscription(speechKey, speechRegion);
speechConfig.setSpeechRecognitionLanguage("en-US");
int operate = AudioProcessingConstants.AUDIO_INPUT_PROCESSING_NONE | AudioProcessingConstants.AUDIO_INPUT_PROCESSING_ENABLE_VOICE_ACTIVITY_DETECTION;
AudioProcessingOptions option = AudioProcessingOptions.create(operate, PresetMicrophoneArrayGeometry.Mono);
AudioConfig audioInput = AudioConfig.fromWavFileInput("/output.wav");
Semaphore stopRecognitionSemaphore = new Semaphore(0);
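
The posted snippet stops at the semaphore, so how it gets released is not shown. If it is released from a session-stopped or canceled callback (an assumption, following the pattern common in SDK samples), an unbounded wait would look exactly like a freeze whenever that callback never fires. The sketch below replaces the unbounded wait with a timed one so a hang becomes a visible timeout. `BoundedWaitSketch`, `awaitStop`, and the stand-in callback thread are hypothetical names for illustration; no Speech SDK calls are used.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

final class BoundedWaitSketch {
    // Waits until the semaphore is released or the timeout elapses.
    // Returns true if a permit was acquired, false on timeout or interruption.
    static boolean awaitStop(Semaphore stopSemaphore, long timeout, TimeUnit unit) {
        try {
            return stopSemaphore.tryAcquire(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        Semaphore stopRecognitionSemaphore = new Semaphore(0);

        // Stand-in for the recognizer callback (hypothetical): in real code the
        // release would happen inside a session-stopped or canceled handler.
        new Thread(stopRecognitionSemaphore::release).start();

        boolean stopped = awaitStop(stopRecognitionSemaphore, 30, TimeUnit.SECONDS);
        System.out.println(stopped
                ? "Recognition finished"
                : "Timed out waiting for the stop signal");
    }
}
```

If the timeout branch is reached with the real recognizer, that would suggest the stop or cancel event is never raised, which narrows down where to look.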

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure App Service
Azure App Service is a service used to create and deploy scalable, mission-critical web apps.

2 answers

  1. Azar 22,860 Reputation points MVP
    2023-11-17T10:37:05.05+00:00

    Hi longfan an

    The configuration you set for AudioProcessingOptions is correct. The AudioProcessingOptions.create method expects two parameters: flags and geometry. In your case, operate is the flags parameter, and PresetMicrophoneArrayGeometry.Mono is the geometry parameter. Make sure that these parameters are appropriate for your use case.
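
    To illustrate how the flags parameter is built, here is a minimal sketch of combining bit flags with bitwise OR. The constants are stand-ins defined locally, with values that are assumptions and may not match the SDK's AudioProcessingConstants. The only property relied on is that the "none" flag is zero, in which case OR-ing it with another flag changes nothing.

    ```java
    final class AudioFlagsSketch {
        // Stand-in values (assumptions, not taken from the SDK); check the
        // AudioProcessingConstants reference for the real definitions.
        static final int NONE = 0x00;
        static final int VOICE_ACTIVITY_DETECTION = 0x20;

        // Combines any number of flags with bitwise OR.
        static int combine(int... flags) {
            int result = 0;
            for (int flag : flags) {
                result |= flag;
            }
            return result;
        }

        // True when every bit of the given non-zero flag is set in flags.
        static boolean isSet(int flags, int flag) {
            return flag != 0 && (flags & flag) == flag;
        }

        public static void main(String[] args) {
            int operate = combine(NONE, VOICE_ACTIVITY_DETECTION);
            // With NONE equal to zero, the combination equals the single flag.
            System.out.println(operate == VOICE_ACTIVITY_DETECTION);
            System.out.println(isSet(operate, VOICE_ACTIVITY_DETECTION));
        }
    }
    ```

    Under that assumption, passing only the voice activity detection flag would be equivalent to the combined value in the question.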

    Also check that the path to your audio file (/output.wav) is correct and that the file exists with exactly that name. A wrong path may be one of the reasons the code freezes.
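
    The path check can be done before the file is handed to the SDK. Below is a minimal sketch using only java.nio.file; `WavPathCheck` and `describeProblem` are hypothetical names for illustration. Note that a leading slash makes /output.wav an absolute path at the filesystem root, not a path relative to the project directory.

    ```java
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    final class WavPathCheck {
        // Returns a short description of the problem, or null if the path
        // points to an existing, readable regular file.
        static String describeProblem(String fileName) {
            Path path = Paths.get(fileName).toAbsolutePath();
            if (!Files.exists(path)) {
                return "File not found: " + path;
            }
            if (!Files.isRegularFile(path)) {
                return "Not a regular file: " + path;
            }
            if (!Files.isReadable(path)) {
                return "File is not readable: " + path;
            }
            return null;
        }

        public static void main(String[] args) {
            // Defaults to the path used in the question.
            String fileName = args.length > 0 ? args[0] : "/output.wav";
            String problem = describeProblem(fileName);
            System.out.println(problem == null ? "Path looks usable: " + fileName : problem);
        }
    }
    ```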

    If you found this answer useful, kindly accept it. Thanks very much.


  2. longfan an 0 Reputation points
    2023-11-18T14:56:12.1133333+00:00

    My audio file does exist, and the format of the audio file is as follows:

    Metadata:
        encoder         : Lavf59.27.100
      Duration: 00:00:08.83, bitrate: 256 kb/s
      Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s
    

    No exception occurs when I omit the audio processing options. However, after adding the option for voice activity detection, the code does not execute normally. I noticed that the official demo uses a similar approach, so I am unsure where the issue is occurring.
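
    The format listing above describes 16-bit little-endian PCM at 16 kHz with one channel. The same properties can be confirmed from Java with the standard javax.sound.sampled API, as in the minimal sketch below. `WavFormatCheck` and its method names are hypothetical, and the target format (16 kHz, 16-bit, mono PCM) is an assumption about what the recognizer expects by default.

    ```java
    import java.io.File;
    import java.io.IOException;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.UnsupportedAudioFileException;

    final class WavFormatCheck {
        // True for signed 16-bit little-endian PCM, 16 kHz, one channel.
        static boolean isPcm16Bit16kMono(AudioFormat format) {
            return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                    && format.getSampleRate() == 16000f
                    && format.getSampleSizeInBits() == 16
                    && format.getChannels() == 1
                    && !format.isBigEndian();
        }

        // Reads only the file header; no audio device is opened.
        static AudioFormat readFormat(File wavFile)
                throws IOException, UnsupportedAudioFileException {
            return AudioSystem.getAudioFileFormat(wavFile).getFormat();
        }

        public static void main(String[] args) {
            // Defaults to the path used in the question.
            File wavFile = new File(args.length > 0 ? args[0] : "/output.wav");
            try {
                AudioFormat format = readFormat(wavFile);
                System.out.println(format + " -> matches: " + isPcm16Bit16kMono(format));
            } catch (IOException | UnsupportedAudioFileException e) {
                System.out.println("Could not read WAV header: " + e.getMessage());
            }
        }
    }
    ```

    If the check passes for the file in question, the audio format is probably not what causes the hang.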
