Problem synthesizing speech from Chinese text

Anonymous
2023-07-12T04:35:26.2866667+00:00

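In VS2022, I am trying to call Azure TTS to synthesize speech from Chinese text. When text is pure English, the speech is synthesized and played back. But when text contains Chinese, or is entirely Chinese, the program fails and no speech is synthesized (see the screenshot below). What is the problem?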

[Screenshot: QQ图片20230712123337]

Azure AI Speech
C++

Accepted answer
  1. Minxin Yu 12,596 Reputation points Microsoft Vendor
    2023-07-19T02:32:18.8266667+00:00

    Hi, @Frank

    I modified the snippet so that Chinese text is read into a std::wstring.

    You can copy the snippet below and replace "key" with your own subscription key. To keep chcp from echoing its output to the console, use system("chcp 936 > nul").

    The key lines are:

    std::wstring text; setlocale(LC_ALL, "chs"); getline(wcin,text);

    #include "stdafx.h"
    #include <clocale>   // setlocale
    #include <cstdlib>   // system
    #include <iostream>
    #include <string>
    #include <Windows.h>
    #include <speechapi_cxx.h>
    #pragma execution_character_set("utf-8")
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void synthesizeSpeech()
    {
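        // Switch the console to code page 936 (Simplified Chinese, GBK); "> nul" hides the command's output.
        system("chcp 936 > nul");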
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("key", "eastasia");
    
        auto language = "zh-CN"; 
        config->SetSpeechSynthesisLanguage(language); 
        // Set the voice name, refer to https://aka.ms/speech/voices/neural for full list.
        config->SetSpeechSynthesisVoiceName("zh-CN-XiaoxiaoNeural");
    
        // Creates a speech synthesizer using the default speaker as audio output. It uses the zh-CN language and voice set on the config above.
        auto synthesizer = SpeechSynthesizer::FromConfig(config);
    
        // Receive a text from console input and synthesize it to speaker.
        cout << "Type some text that you want to speak..." << std::endl;
        cout << "> ";
        // Read the input as wide characters so the Chinese text is preserved.
        std::wstring text;
        setlocale(LC_ALL, "chs");
        getline(wcin, text);

        wcout << L"Synthesizing speech for text [" << text << L"]" << std::endl;
    
    
        auto result = synthesizer->SpeakTextAsync(text).get();
    
        // Checks result.
        if (result->Reason == ResultReason::SynthesizingAudioCompleted)
        {
            wcout << L"Speech synthesized to speaker for text [" << text << L"]" << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error)
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    
        // This is to give some time for the speaker to finish playing back the audio
        cout << "Press enter to exit..." << std::endl;
        cin.get();
    }
    
    int wmain()
    {
        try
        {
            synthesizeSpeech();
        }
        catch (const exception& e)
        {
            cout << e.what();
        }
        return 0;
    }
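
If the wide text read above ever has to be handed to an API that expects UTF-8 bytes in a std::string, it can be encoded by hand. The following is only a minimal sketch: `wideToUtf8` is a hypothetical helper name, not part of the Speech SDK, and it assumes wchar_t holds UTF-16 code units (as on Windows) or UTF-32 code points (as on most Unix-like systems).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the Speech SDK): encodes a wide string as UTF-8.
// Works for 16-bit wchar_t (UTF-16, surrogate pairs) and 32-bit wchar_t (UTF-32).
std::string wideToUtf8(const std::wstring& wide)
{
    std::string out;
    for (std::size_t i = 0; i < wide.size(); ++i)
    {
        std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);

        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size())
        {
            std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }

        // Anything that is still a surrogate (or out of range) is malformed input:
        // substitute U+FFFD, the Unicode replacement character.
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        {
            cp = 0xFFFD;
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, wideToUtf8(L"\u4E2D\u6587") yields the six bytes E4 B8 AD E6 96 87, the UTF-8 encoding of 中文.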
    

    Best regards,

    Minxin Yu



1 additional answer

  1. Dillon Silzer 57,631 Reputation points
    2023-07-12T05:59:56.2466667+00:00

    Hello Frank,

    Please try looking at the following:

    CPP Text-To-Speech quickstart doesnot work on Chinese

    https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/1603

    Here is a possible solution:

    First off, the root cause of the problem is that the string format coming from the console isn't capturing the non-ascii characters properly. The (quickest) solution for that would be to use the std::wstring overloads for SpeakTextAsync to ensure properly encoded strings are passed in. (Alternately, the std::string overload will work with a UTF-8 encoded string) Later on the Speech SDK internally hits an error due to the string encoding not being what was expected, and doesn't return an overly useful error. I've opened a bug in our internal tracking system to address that.
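
One way to see whether the console is delivering badly encoded text is to test the bytes for UTF-8 validity before they reach the std::string overload. The sketch below is an illustration only: `isValidUtf8` is a hypothetical helper, not part of the Speech SDK. It rejects truncated sequences, overlong forms, surrogates, and values above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the Speech SDK):
// returns true if `bytes` is a well-formed UTF-8 sequence.
bool isValidUtf8(const std::string& bytes)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long minCodePoint[] = { 0x0, 0x80, 0x800, 0x10000 };

    std::size_t i = 0;
    while (i < bytes.size())
    {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;   // number of continuation bytes expected
        unsigned long cp = 0;    // decoded code point

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;       // stray continuation byte or invalid lead byte

        // The sequence must not run past the end of the string.
        if (extra > bytes.size() - i - 1) return false;

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;   // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, UTF-16 surrogates and out-of-range values.
        if (cp < minCodePoint[extra]) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        if (cp > 0x10FFFF) return false;

        i += extra + 1;
    }
    return true;
}
```

With this check, the UTF-8 bytes E4 B8 AD E6 96 87 for 中文 pass, while the bytes D6 D0 CE C4 (the same two characters in code page 936) are rejected. That is the kind of mismatch the explanation above describes.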


    If this is helpful please accept answer.
