Get a speaker profile ID for the personal voice

To use personal voice in your application, you need to get a speaker profile ID. The speaker profile ID is used to generate synthesized audio with the text input provided.

You create a speaker profile ID based on the speaker's verbal consent statement and an audio prompt (a clean human voice sample between 5 and 90 seconds long). The user's voice characteristics are encoded in the speakerProfileId property that's used for text to speech. For more information, see Use personal voice in your application.

Note

The personal voice ID and speaker profile ID aren't the same. You can choose the personal voice ID, but the speaker profile ID is generated by the service. The personal voice ID is used to manage the personal voice. The speaker profile ID is used for text to speech.

Prompt audio format

The supported formats for prompt audio files are:

Format | Sample rate                      | Bit rate                               | Bit depth
mp3    | 16 kHz, 24 kHz, 44.1 kHz, 48 kHz | 128 kbps, 192 kbps, 256 kbps, 320 kbps | N/A
wav    | 16 kHz, 24 kHz, 44.1 kHz, 48 kHz | N/A                                    | 16-bit, 24-bit, 32-bit
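A local pre-check can catch an unsupported prompt file before you upload it. The following is a minimal sketch, not part of the service or its SDK: it uses Python's standard `wave` module to compare a WAV file against the wav row of the table and the 5 to 90 second prompt length. The function name `check_wav_prompt` is hypothetical, and the sketch doesn't cover mp3 files.

```python
import wave

# Limits copied from the table above (wav row) and the 5-90 second prompt length.
SUPPORTED_SAMPLE_RATES_HZ = {16000, 24000, 44100, 48000}
SUPPORTED_BIT_DEPTHS = {16, 24, 32}
MIN_SECONDS, MAX_SECONDS = 5.0, 90.0


def check_wav_prompt(path):
    """Return a list of problems found in a WAV prompt file (empty if none)."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8  # getsampwidth() returns bytes per sample
        duration = wav.getnframes() / sample_rate

    problems = []
    if sample_rate not in SUPPORTED_SAMPLE_RATES_HZ:
        problems.append(f"unsupported sample rate: {sample_rate} Hz")
    if bit_depth not in SUPPORTED_BIT_DEPTHS:
        problems.append(f"unsupported bit depth: {bit_depth}-bit")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.1f} s is outside 5-90 s")
    return problems
```

The `wave` module raises `wave.Error` for files it can't parse, so a file that fails to open should also be treated as unsuitable. This check says nothing about audio cleanliness, which the service evaluates separately.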

Add a voice training dataset

These steps continue from the Fine-tune a model wizard you opened in Create a personal voice project and continued in Add user consent.

  1. On the Training data pane of the wizard, select one of the following options:

    • Upload data to upload a prerecorded audio prompt.
    • Record data to record the audio prompt directly in the portal.

Upload a prerecorded audio prompt

  1. In the Upload data pane, drag and drop the audio file into the upload area, or select Browse for a file to select it. The file must be a clean human voice sample between 5 and 90 seconds.
  2. Select Upload.

Record an audio prompt in the portal

  1. In the Record data pane, read and follow the recording tips:

    • Avoid background noise: Record in a quiet environment to minimize background noise for better audio quality.
    • Stay relaxed: Speak naturally and at a comfortable pace. Avoid rushing or over-enunciating.
    • Use a quality microphone: Use a headset or external microphone for best results. Avoid built-in laptop microphones.
    • Review quality metrics: After recording, review the quality scores to ensure your audio meets the required standards.
  2. Press the microphone button to start recording 5–90 seconds of audio.
  3. Stop the recording, review it, and then select Next to submit.

Test your personal voice

After the training data is processed, you can try out your personal voice in the Playground:

  1. Select Fine-tuning from the left pane, and then select the AI Service tab.
  2. Select the personal voice fine-tuning task you submitted.
  3. Select Open in Playground in the upper right.
  4. Enter plain text or SSML to synthesize speech with the personal voice.

To integrate personal voice in your application by using the speakerProfileId, see Use personal voice in your application.

Using the custom voice API, you can either provide the audio files from a publicly accessible URL (PersonalVoices_Create) or upload them directly (PersonalVoices_Post).

Create personal voice from a file

In this scenario, the audio files must be available locally.

To create a personal voice and get the speaker profile ID, use the PersonalVoices_Post operation of the custom voice API. Construct the request body according to the following instructions:

  • Set the required projectId property. See create a project.
  • Set the required consentId property. See add user consent.
  • Set the required audiodata property. You can specify one or more audio files in the same request. The maximum file size is 30 MB.

Make an HTTP POST request using the URI as shown in the following PersonalVoices_Post example.

  • Replace YourResourceKey with your Speech resource key.
  • Replace YourResourceName with your Speech resource name.
  • Replace JessicaPersonalVoiceId with a personal voice ID of your choice. The case-sensitive ID is used in the personal voice's URI and can't be changed later.

curl -v -X POST -H "Ocp-Apim-Subscription-Key: YourResourceKey" -F 'projectId="ProjectId"' -F 'consentId="JessicaConsentId"' -F 'audiodata=@"D:\PersonalVoiceTest\CNVSample001.wav"' -F 'audiodata=@"D:\PersonalVoiceTest\CNVSample002.wav"' "https://YourResourceName.cognitiveservices.azure.com/customvoice/personalvoices/JessicaPersonalVoiceId?api-version=2026-01-01"
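If you prefer to issue the same request from code, the multipart form that curl builds with `-F` can be assembled by hand. The following is a rough sketch using only Python's standard library. The helper name `build_post_request` is hypothetical, and sending each file part as `application/octet-stream` is an assumption, not something the service documentation confirms.

```python
import urllib.request
import uuid
from pathlib import Path

API_VERSION = "2026-01-01"  # taken from the curl example above


def build_post_request(resource_name, resource_key, voice_id,
                       project_id, consent_id, audio_paths):
    """Build (but don't send) a multipart/form-data POST like the curl example."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields: projectId and consentId.
    for name, value in (("projectId", project_id), ("consentId", consent_id)):
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    # One audiodata part per local audio file.
    for path in map(Path, audio_paths):
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="audiodata"; '
            f'filename="{path.name}"\r\n'
            # Assumption: a generic binary content type is accepted.
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    url = (f"https://{resource_name}.cognitiveservices.azure.com/customvoice/"
           f"personalvoices/{voice_id}?api-version={API_VERSION}")
    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": resource_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Building the request separately from sending it makes the body easy to inspect before any audio leaves your machine.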

You should receive a response body in the following format:

{
  "id": "JessicaPersonalVoiceId",
  "speakerProfileId": "3059912f-a3dc-49e3-bdd0-02e449df1fe3",
  "projectId": "ProjectId",
  "consentId": "JessicaConsentId",
  "status": "NotStarted",
  "createdDateTime": "2024-09-01T05:30:00.000Z",
  "lastActionDateTime": "2024-09-02T10:15:30.000Z"
}

Use the speakerProfileId property to integrate personal voice in your text to speech application. For more information, see Use personal voice in your application.
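As a small illustration, the speaker profile ID can be read from the response with any JSON parser. This sketch parses the sample response body shown above; the function name `extract_ids` is hypothetical.

```python
import json

# Sample response body copied from the example above.
RESPONSE_BODY = """
{
  "id": "JessicaPersonalVoiceId",
  "speakerProfileId": "3059912f-a3dc-49e3-bdd0-02e449df1fe3",
  "projectId": "ProjectId",
  "consentId": "JessicaConsentId",
  "status": "NotStarted",
  "createdDateTime": "2024-09-01T05:30:00.000Z",
  "lastActionDateTime": "2024-09-02T10:15:30.000Z"
}
"""


def extract_ids(body):
    """Return (personal voice ID, speaker profile ID) from a response body."""
    voice = json.loads(body)
    # "id" is the ID you chose; "speakerProfileId" is generated by the service.
    return voice["id"], voice["speakerProfileId"]


personal_voice_id, speaker_profile_id = extract_ids(RESPONSE_BODY)
print(speaker_profile_id)
```

Keep the two values separate in your own storage: the first manages the personal voice, and the second is what text to speech requests use.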

The response header contains the Operation-Location property. Use this URI to get details about the PersonalVoices_Post operation. Here's an example of the response header:

Operation-Location: https://YourResourceName.cognitiveservices.azure.com/customvoice/operations/1321a2c0-9be4-471d-83bb-bc3be4f96a6f?api-version=2026-01-01
Operation-Id: 1321a2c0-9be4-471d-83bb-bc3be4f96a6f
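One way to use the Operation-Location URI is to poll it until the operation finishes. The following sketch rests on several assumptions that this article doesn't confirm: that a GET on the URI accepts the same `Ocp-Apim-Subscription-Key` header, that the returned JSON has a `status` field, and that `Succeeded` and `Failed` are its terminal values. The helper names `get_operation` and `wait_for_operation` are hypothetical.

```python
import json
import time
import urllib.request


def get_operation(operation_location, resource_key):
    """GET the Operation-Location URI and return the parsed JSON body."""
    request = urllib.request.Request(
        operation_location,
        headers={"Ocp-Apim-Subscription-Key": resource_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_operation(fetch, interval_seconds=5.0, max_attempts=60,
                       terminal_statuses=("Succeeded", "Failed")):
    """Call fetch() until its result reports a terminal status.

    fetch is any zero-argument callable returning a dict; the "status" key
    and the default terminal values are assumptions, not documented here.
    """
    for _ in range(max_attempts):
        operation = fetch()
        if operation.get("status") in terminal_statuses:
            return operation
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not reach a terminal status")
```

A typical call would be `wait_for_operation(lambda: get_operation(operation_location, "YourResourceKey"))`. Passing the fetch step as a callable keeps the polling loop independent of the HTTP details, so you can adjust it if the real response shape differs.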

Create personal voice from a URL

In this scenario, the audio files must already be stored in an Azure Blob Storage container.

To create a personal voice and get the speaker profile ID, use the PersonalVoices_Create operation of the custom voice API. Construct the request body according to the following instructions:

  • Set the required projectId property. See create a project.
  • Set the required consentId property. See add user consent.
  • Set the required audios property. Within the audios property, set the following properties:
    • Set the required containerUrl property to the URL of the Azure Blob Storage container that contains the audio files. Use shared access signatures (SAS) for a container with both read and list permissions.
    • Set the required extensions property to the extensions of the audio files.
    • Optionally, set the prefix property to set a prefix for the blob name.

Make an HTTP PUT request using the URI as shown in the following PersonalVoices_Create example.

  • Replace YourResourceKey with your Speech resource key.
  • Replace YourResourceName with your Speech resource name.
  • Replace JessicaPersonalVoiceId with a personal voice ID of your choice. The case-sensitive ID is used in the personal voice's URI and can't be changed later.

curl -v -X PUT -H "Ocp-Apim-Subscription-Key: YourResourceKey" -H "Content-Type: application/json" -d '{
  "projectId": "ProjectId",
  "consentId": "JessicaConsentId",
  "audios": {
    "containerUrl": "https://contoso.blob.core.windows.net/voicecontainer?mySasToken",
    "prefix": "jessica/", 
    "extensions": [
      ".wav"
    ]
  }
}' "https://YourResourceName.cognitiveservices.azure.com/customvoice/personalvoices/JessicaPersonalVoiceId?api-version=2026-01-01"

# Ensure the `containerUrl` has both read and list permissions. 
# Ensure the `.wav` files are located in the "jessica" folder within the container. The `prefix` matches all `.wav` files in the "jessica" folder. If there is no such folder, the prefix will match `.wav` files with names starting with "jessica". 
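The same PUT request can be assembled in code, which makes the optional `prefix` property easier to handle. The following is a minimal sketch using Python's standard library; the helper name `build_put_request` and its parameter names are hypothetical, while the URI, headers, and JSON properties mirror the curl example above.

```python
import json
import urllib.request

API_VERSION = "2026-01-01"  # taken from the curl example above


def build_put_request(resource_name, resource_key, voice_id, project_id,
                      consent_id, container_url, extensions, prefix=None):
    """Build (but don't send) the JSON PUT request shown in the curl example."""
    audios = {"containerUrl": container_url, "extensions": list(extensions)}
    if prefix is not None:
        audios["prefix"] = prefix  # optional: filter blobs by name prefix
    body = {"projectId": project_id, "consentId": consent_id, "audios": audios}

    url = (f"https://{resource_name}.cognitiveservices.azure.com/customvoice/"
           f"personalvoices/{voice_id}?api-version={API_VERSION}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Ocp-Apim-Subscription-Key": resource_key,
            "Content-Type": "application/json",
        },
    )


request = build_put_request(
    "YourResourceName", "YourResourceKey", "JessicaPersonalVoiceId",
    "ProjectId", "JessicaConsentId",
    "https://contoso.blob.core.windows.net/voicecontainer?mySasToken",
    [".wav"], prefix="jessica/",
)
```

Passing `request` to `urllib.request.urlopen` sends it. The container URL must still carry a SAS token with read and list permissions, as noted above.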

You should receive a response body in the following format:

{
  "id": "JessicaPersonalVoiceId",
  "speakerProfileId": "3059912f-a3dc-49e3-bdd0-02e449df1fe3",
  "projectId": "ProjectId",
  "consentId": "JessicaConsentId",
  "status": "NotStarted",
  "createdDateTime": "2024-09-01T05:30:00.000Z",
  "lastActionDateTime": "2024-09-02T10:15:30.000Z"
}

Use the speakerProfileId property to integrate personal voice in your text to speech application. For more information, see Use personal voice in your application.

The response header contains the Operation-Location property. Use this URI to get details about the PersonalVoices_Create operation. Here's an example of the response header:

Operation-Location: https://YourResourceName.cognitiveservices.azure.com/customvoice/operations/1321a2c0-9be4-471d-83bb-bc3be4f96a6f?api-version=2026-01-01
Operation-Id: 1321a2c0-9be4-471d-83bb-bc3be4f96a6f

Next steps