Speech to text: specifying the source language is not working; only English is returned for all other languages

sam 206 Reputation points
2021-07-13T10:12:12.887+00:00

Hi, I am using Speech to text and uploading a Hindi-language WAV file, but the response comes back as English text rather than Hindi. My code is below.

        var config = SpeechConfig.FromHost(new Uri("ws://**********.io:5000/"));
        var fileFullPath = await ReadFilePath(file);
        var sourceLanguageConfig = SourceLanguageConfig.FromLanguage("hi-IN");
        using (var audioConfig = AudioConfig.FromWavFileInput(fileFullPath))
        using (var recognizer = new SpeechRecognizer(config, sourceLanguageConfig, audioConfig))
        {
            var result = await recognizer.RecognizeOnceAsync();

            if (result.Reason == ResultReason.RecognizedSpeech)
            {
                Console.WriteLine($"We recognized: {result.Text}");
            }
            else if (result.Reason == ResultReason.NoMatch)
            {
                Console.WriteLine($"NOMATCH: Speech could not be recognized.");
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = CancellationDetails.FromResult(result);
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                if (cancellation.Reason == CancellationReason.Error)
                {
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                    Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
                    Console.WriteLine($"CANCELED: Did you update the subscription info?");
                }
            }
            var data = new Response()
            {
                Prediction = result.Text

            };
            return new JsonResult(data);
        }

2 answers

  1. Rohit Mungi 49,131 Reputation points Microsoft Employee Moderator
    2021-07-20T08:32:42.073+00:00

    @sam this is the method I am using to make the call. You should be able to print the output to a file using the suggestions in my earlier answer and the snippet below.
    For debugging, I would also suggest trying the same call with your actual Speech resource key and region in the config, in case the container endpoint is what is failing.

        public static async Task RecognitionWithLanguageAndDetailedOutputAsync()  
        {  
            // Creates an instance of a speech config with the specified subscription key and service region.
            // Replace with your own subscription key and service region (e.g., "westus") if using the Azure service API.
            var config = SpeechConfig.FromSubscription("<your_key>", "<your_region>");

            // Sets the recognition language in BCP-47 format, e.g., hi-IN or en-US.
            config.SpeechRecognitionLanguage = "hi-IN";
            config.OutputFormat = OutputFormat.Detailed;

            // Redirects all console output to out.txt (StreamWriter writes UTF-8 by default),
            // so the Hindi text does not depend on what the console can display.
            FileStream filestream = new FileStream("out.txt", FileMode.Create);
            var streamwriter = new StreamWriter(filestream);
            streamwriter.AutoFlush = true;
            Console.SetOut(streamwriter);
            Console.SetError(streamwriter);
    
    
            // Creates a speech recognizer for the language set on the config, using the microphone as audio input.
            // Detailed output format was requested on the config above.
            using (var recognizer = new SpeechRecognizer(config))
            {
                // Starts recognizing.
                Console.WriteLine("Say something ...");
    
                // Starts speech recognition, and returns after a single utterance is recognized. The end of a  
                // single utterance is determined by listening for silence at the end or until a maximum of 15  
                // seconds of audio is processed.  The task returns the recognition text as result.  
                // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single  
                // shot recognition like command or query.  
                // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.  
                var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);  
    
                // Checks result.  
                if (result.Reason == ResultReason.RecognizedSpeech)  
                {  
                    Console.WriteLine($"RECOGNIZED: Text={result.Text}");  
                    Console.WriteLine("  DETAILED RESULTS:");  
    
                    var detailedResults = result.Best();  
                    foreach (var item in detailedResults)
                    {
                        Console.WriteLine($"    Confidence: {item.Confidence}, Text: {item.Text}, LexicalForm: {item.LexicalForm}, NormalizedForm: {item.NormalizedForm}, MaskedNormalizedForm: {item.MaskedNormalizedForm}");
                    }  
                }  
                else if (result.Reason == ResultReason.NoMatch)  
                {  
                    Console.WriteLine($"NOMATCH: Speech could not be recognized.");  
                }  
                else if (result.Reason == ResultReason.Canceled)  
                {  
                    var cancellation = CancellationDetails.FromResult(result);  
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");  
    
                    if (cancellation.Reason == CancellationReason.Error)  
                    {  
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");  
                        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");  
                        Console.WriteLine($"CANCELED: Did you update the subscription info?");  
                    }  
                }  
            }  
        }
    



  2. Rohit Mungi 49,131 Reputation points Microsoft Employee Moderator
    2021-07-14T16:28:51.317+00:00

    @sam I think there are a couple of things you should check in this case.

    1. Set the language with config.SpeechRecognitionLanguage = "hi-IN"; instead of using a source language config. The recognizer is then created as SpeechRecognizer(config, audioConfig). Remove all references to the source language config.
    2. I think you are printing the result to the console. The recognized text is actually in Hindi, but the console can only display what the language pack installed on your machine supports, so you see only the parts it can show in English. If you print the result to a file instead, the text should be in Hindi. To verify this, I added the following at the beginning of the method to send all console text to a file.
      FileStream filestream = new FileStream("out.txt", FileMode.Create);  
      var streamwriter = new StreamWriter(filestream);  
      streamwriter.AutoFlush = true;  
      Console.SetOut(streamwriter);  
      Console.SetError(streamwriter);  
      

    The output file should be in your debug folder. Here is the sample output for a phrase I spoke.

    Say something ...
    RECOGNIZED: Text=धन्यवाद।
    DETAILED RESULTS:
    Confidence: 0.4711725, Text: धन्यवाद।, LexicalForm: धन्यवाद, NormalizedForm: धन्यवाद, MaskedNormalizedForm: धन्यवाद।
    Confidence: 0.4711725, Text: धन्यवाद सर, LexicalForm: धन्यवाद सर, NormalizedForm: धन्यवाद सर, MaskedNormalizedForm: धन्यवाद सर
    Confidence: 0.4711725, Text: धंन्यवाद, LexicalForm: धंन्यवाद, NormalizedForm: धंन्यवाद, MaskedNormalizedForm: धंन्यवाद
    Confidence: 0.4711725, Text: धन्यवाद दो, LexicalForm: धन्यवाद दो, NormalizedForm: धन्यवाद दो, MaskedNormalizedForm: धन्यवाद दो
    Confidence: 0.4711725, Text: धन्यवाद है, LexicalForm: धन्यवाद है, NormalizedForm: धन्यवाद है, MaskedNormalizedForm: धन्यवाद है

    Execution done. Your choice (0: Stop):
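
As a way to check the encoding point separately from the Speech SDK, here is a minimal sketch using only the .NET base class library. The class name HindiOutputSketch and both helper methods are hypothetical names made up for this illustration; they are not part of the Speech SDK or of the code above. It assumes the recognized text is already available as a string.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class for illustration only (not part of the Speech SDK).
public static class HindiOutputSketch
{
    // Writes the recognized text to a file as UTF-8 and reads it back,
    // so the Devanagari characters can be checked independently of the console.
    public static string RoundTripUtf8(string path, string recognizedText)
    {
        File.WriteAllText(path, recognizedText, Encoding.UTF8);
        return File.ReadAllText(path, Encoding.UTF8);
    }

    // Switches the console output encoding to UTF-8 before printing.
    // Whether the glyphs render still depends on the console font.
    public static void PrintUtf8(string recognizedText)
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(recognizedText);
    }
}
```

Calling HindiOutputSketch.RoundTripUtf8("out.txt", result.Text) and opening out.txt in an editor that understands UTF-8 is likely the more reliable check. PrintUtf8 only helps when the console font also has Devanagari glyphs.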


