Add new TTS technology/project (Coqui / Piper TTS) to SAPI

ThorstenVoice 5 Reputation points
2023-11-29T21:42:44.57+00:00

Hello,

I originally asked for support elsewhere and was redirected to this forum.

I'd love to add locally running, Python-based TTS software (Coqui TTS and Piper TTS) to the Windows SAPI system. I experimented with adding new entries to the registry under "...Speech_OneCore/Voices/Tokens/" and with the GUID "{179F3D56-1B0B-42B2-A962-59B7EF59FE1B}".

I also experimented with PowerShell scripts that use SAPI. They listed the available voices, and I was able to generate spoken audio, but I did not find any information on how to add custom voices to SAPI.
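For reference, the kind of SAPI scripting described above can be sketched in Python as well. This is not the original PowerShell script (which is not shown here), only a minimal sketch that assumes the third-party pywin32 package is installed and that the standard "SAPI.SpVoice" and "SAPI.SpFileStream" COM automation objects are available. The function names are made up for illustration.

```python
import sys


def list_sapi_voices() -> list[str]:
    """Return descriptions of the installed SAPI 5 voices ([] when not on Windows)."""
    if sys.platform != "win32":
        return []
    import win32com.client  # assumption: pywin32 is installed (pip install pywin32)

    voice = win32com.client.Dispatch("SAPI.SpVoice")
    tokens = voice.GetVoices()
    return [tokens.Item(i).GetDescription() for i in range(tokens.Count)]


def speak_to_wav(text: str, path: str) -> None:
    """Render text with the default SAPI voice into a WAV file (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("SAPI is only available on Windows")
    import win32com.client  # assumption: pywin32 is installed

    stream = win32com.client.Dispatch("SAPI.SpFileStream")
    stream.Open(path, 3)  # 3 = SSFMCreateForWrite
    voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.AudioOutputStream = stream  # send audio to the file instead of the speakers
    voice.Speak(text)
    stream.Close()


if __name__ == "__main__":
    for description in list_sapi_voices():
        print(description)
```

A script like this only consumes voices that are already registered. It does not register a new engine, which is the open question in this post.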

I was unsure whether this is possible at all, but there seem to be solutions for Amazon Polly and ReadSpeaker, so it should be possible to add voices some other way.

In principle, I can generate spoken audio by running a Python-based subprocess that takes some arguments (such as the text) and returns wave data.
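A minimal sketch of that subprocess idea, under stated assumptions: the engine reads the text from stdin and writes a complete WAV file to stdout. The real Coqui TTS or Piper TTS command line is not given in this post, so `FAKE_ENGINE` below is a hypothetical stand-in that just emits silence, and `synthesize` is an illustrative helper name.

```python
import io
import subprocess
import sys
import wave

# Hypothetical stand-in for a real TTS command: reads text from stdin and
# writes 0.1 s of 22050 Hz mono 16-bit silence per input character to stdout.
_FAKE_ENGINE_SCRIPT = """
import io, sys, wave
text = sys.stdin.read()
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(22050)
    w.writeframes(bytes(2 * 2205 * max(1, len(text))))
sys.stdout.buffer.write(buf.getvalue())
"""
FAKE_ENGINE = [sys.executable, "-c", _FAKE_ENGINE_SCRIPT]


def synthesize(text: str, command: list[str], timeout: float = 60.0) -> bytes:
    """Run a TTS command, pass the text on stdin, and return the WAV bytes from stdout."""
    result = subprocess.run(
        command,
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the engine exits with an error
    )
    wav_bytes = result.stdout
    # Parse the header so a broken engine fails here rather than in the caller.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnframes() == 0:
            raise ValueError("engine returned an empty WAV stream")
    return wav_bytes


if __name__ == "__main__":
    audio = synthesize("Hello from a local TTS engine", FAKE_ENGINE)
    print(f"received {len(audio)} bytes of WAV data")
```

With a real engine, `FAKE_ENGINE` would be replaced by that engine's own command line. Whatever ends up bridging this to SAPI would call something like `synthesize` and hand the returned audio on.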

Can someone show me which way to go? Thank you.

Windows 10
Azure AI Speech

3 answers

  1. romungi-MSFT 48,401 Reputation points Microsoft Employee
    2023-11-30T11:43:18.38+00:00

    @ThorstenVoice Azure AI Speech primarily focuses on the Azure Speech service. As I understand your case, you are looking for voices from the Azure Speech service that you can use with your application. This can be done with the Azure Speech REST API, the SDK, or Speech Studio. If you want to integrate the voices into your application, you can use the SDK to list voices and synthesize text. There is a Learn course to get started, in which you create the Speech resource on Azure and test it in Speech Studio.

    In Speech Studio, you can test the voice output. Later, you can integrate the TTS voice list API to list the voices in your application and then use the TTS API to synthesize text. You can use the documentation to try the samples with the SDK you need.

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.

    1 person found this answer helpful.

  2. 999 Limerence 0 Reputation points
    2023-11-30T18:01:58.83+00:00

    Hi Thorsten, can you show it there? https://sonur.chimege.com/


  3. Mahmood Taghavi 0 Reputation points
    2023-12-15T11:17:03.14+00:00

    Hello Thorsten,

    Thank you for bringing up this interesting topic. I understand you would like this wonderful text-to-speech engine to be equipped with a Speech API version 5 (SAPI5) interface, to make it more convenient for Windows users.

    I remember sample code in the Microsoft Speech SDK that shows how to implement the SAPI5 interface for a sample speech engine.

    Please see the link below:

    https://www.microsoft.com/en-us/download/details.aspx?id=10121

    You need to download this file: SpeechSDK51.exe

    Also, you can find the updated documentation online:

    https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms720179(v=vs.85)

    Updated sample code may also be available in the "Windows SDK":

    https://developer.microsoft.com/en-us/windows/downloads/windows-sdk/

    Please note that, using the documentation I referred to, a developer with expertise in the C programming language could implement the requested feature!

    I hope this wonderful TTS engine supports the SAPI5 interface someday :)

