Hello @OtteMMarco-3696
Welcome to Microsoft Q&A, and thank you for posting your question here.
Regarding your question: you ran into an issue generating a valid WAV file from Azure Text to Speech output in Unity, and you asked how to generate the WAV file correctly.
Firstly,
Verify that the file path you pass to FromWavFileOutput is correct and writable from Unity. Also check the output format: based on your code snippet, you are setting it to Raw48Khz16BitMonoPcm. The Raw formats produce headerless PCM samples, so the saved file has no RIFF/WAV header and will not be recognized as a valid WAV file. If you want the header included, use Riff48Khz16BitMonoPcm instead. Finally, make sure the target folder exists and that Unity can write to the specified location.
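One quick way to tell whether the saved file really contains a WAV container is to look at its first 12 bytes. The sketch below is plain .NET with no Unity or Speech SDK calls; the names `WavHeaderCheck` and `LooksLikeRiffWave` are made up for illustration. It only checks the "RIFF" and "WAVE" magic values and does not validate the rest of the file.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeaderCheck
{
    // Returns true when the file starts with "RIFF", four size bytes, then "WAVE".
    public static bool LooksLikeRiffWave(string path)
    {
        var header = new byte[12];
        using (var fs = File.OpenRead(path))
        {
            int read = 0;
            while (read < header.Length)
            {
                int n = fs.Read(header, read, header.Length - read);
                if (n == 0) return false; // file is shorter than 12 bytes
                read += n;
            }
        }

        return Encoding.ASCII.GetString(header, 0, 4) == "RIFF"
            && Encoding.ASCII.GetString(header, 8, 4) == "WAVE";
    }
}
```

A file that fails this check has no WAV container at all, which is what you get when headerless PCM samples are written straight to disk.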
Secondly,
Instead of using FromWavFileOutput, you can try SaveToWaveFileAsync to save the generated audio directly to a WAV file. This method is available on the AudioDataStream class, which you create from the synthesis result.
You can modify your code to use SaveToWaveFileAsync as in the example below:
    var speechSynthesisResult = await speechSynthesizer.SpeakTextAsync("Your text here");
    // SaveToWaveFileAsync is defined on AudioDataStream, so wrap the result first.
    using (var audioDataStream = AudioDataStream.FromResult(speechSynthesisResult))
        await audioDataStream.SaveToWaveFileAsync(filePath);
This removes the need for FromWavFileOutput and may resolve issues caused by an invalid WAV file format.
Thirdly,
Ensure that you're correctly handling the audio data obtained from AudioDataStream.FromResult(result) and that the sample rate matches the configuration you've set (SampleRate * 120).
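If the bytes you read from the stream turn out to be headerless PCM, the header you put in front of them has to declare the same sample rate, channel count, and bit depth as the synthesis output, otherwise playback will sound pitched or garbled. As a rough sketch in plain .NET, assuming 16-bit mono PCM at 48 kHz (the defaults below are assumptions, adjust them to your configured format), the following wraps raw samples in a canonical 44-byte WAV header. `PcmWav.Wrap` is a hypothetical helper, not part of the Speech SDK.

```csharp
using System;
using System.IO;
using System.Text;

public static class PcmWav
{
    // Prepends a canonical 44-byte PCM WAV header to raw sample bytes.
    public static byte[] Wrap(byte[] pcm, int sampleRate = 48000, short channels = 1, short bitsPerSample = 16)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        using (var ms = new MemoryStream(44 + pcm.Length))
        using (var w = new BinaryWriter(ms, Encoding.ASCII))
        {
            // BinaryWriter always writes little-endian, as the WAV format requires.
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + pcm.Length);                 // RIFF chunk size = file size - 8
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                              // fmt chunk size for PCM
            w.Write((short)1);                        // audio format 1 = PCM
            w.Write(channels);
            w.Write(sampleRate);
            w.Write(byteRate);
            w.Write(blockAlign);
            w.Write(bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(pcm.Length);                      // data chunk size
            w.Write(pcm);
            w.Flush();
            return ms.ToArray();
        }
    }
}
```

As a sanity check on the arithmetic: one second of 48 kHz, 16-bit mono audio is 48000 × 2 = 96000 bytes, so the wrapped result for that input is 96044 bytes long.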
Finally,
Check that your import settings are correct, because Unity might be expecting specific encoding settings or metadata in the WAV file. You can review these settings by selecting the WAV file in the Unity Editor and inspecting its Import Settings. Also implement proper error handling in your code; a sample of how to do this can be found in the GitHub link provided by @VasaviLankipalle-MSFT
Additional resources are also listed on the right side of this page.
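To make the error-handling point concrete, here is a minimal sketch of the whole synthesis-to-file flow. It is not taken from the linked GitHub sample. It assumes the Microsoft.CognitiveServices.Speech package is available; `TtsToWav` and `SynthesizeToFileAsync` are hypothetical names, and the key, region, text, and path are placeholders supplied by the caller. It requests a Riff output format so that the saved audio carries a WAV header, and it reports cancellation details instead of silently writing an unusable file.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class TtsToWav
{
    // Returns true when a WAV file was written, false when synthesis did not complete.
    public static async Task<bool> SynthesizeToFileAsync(
        string subscriptionKey, string region, string text, string filePath)
    {
        var config = SpeechConfig.FromSubscription(subscriptionKey, region);
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Riff48Khz16BitMonoPcm);

        // A null audio config keeps the audio in the result instead of playing it.
        using (var synthesizer = new SpeechSynthesizer(config, null as AudioConfig))
        using (var result = await synthesizer.SpeakTextAsync(text))
        {
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                using (var stream = AudioDataStream.FromResult(result))
                {
                    await stream.SaveToWaveFileAsync(filePath);
                }
                return true;
            }

            if (result.Reason == ResultReason.Canceled)
            {
                var details = SpeechSynthesisCancellationDetails.FromResult(result);
                Console.Error.WriteLine(
                    $"Synthesis canceled: {details.Reason}, {details.ErrorCode}, {details.ErrorDetails}");
            }
            return false;
        }
    }
}
```

Inside Unity you would typically replace Console.Error.WriteLine with Debug.LogError so the message shows up in the Console window.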
This is an addition in response to your second post:
If you have looked at that GitHub sample and tried the suggestions above, and the problem persists, try the following.
After saving the WAV file with SaveToWaveFileAsync, call AssetDatabase.Refresh() so that Unity's asset database is updated with the new file. Unity needs this step to recognize the newly saved WAV file as an asset. Note that AssetDatabase belongs to the UnityEditor namespace, so this only works inside the Editor and not in a player build.
Instead of relying on AssetDatabase.LoadAssetAtPath, try loading the AudioClip directly from the file using UnityWebRequestMultimedia.GetAudioClip. This method loads audio files asynchronously and can be more reliable for files created at runtime. You can modify your code to load the AudioClip from the WAV file as in the example below:
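    // Requires: using System; using System.Collections; using UnityEngine; using UnityEngine.Networking;
    IEnumerator LoadAudioClip(string filePath, Action<AudioClip> callback)
    {
        // filePath is expected to be an absolute path to the saved WAV file.
        // The using block disposes the request once the clip has been extracted.
        using (UnityWebRequest www = UnityWebRequestMultimedia.GetAudioClip("file://" + filePath, AudioType.WAV))
        {
            yield return www.SendWebRequest();

            if (www.result == UnityWebRequest.Result.Success)
            {
                AudioClip audioClip = DownloadHandlerAudioClip.GetContent(www);
                callback?.Invoke(audioClip);
            }
            else
            {
                Debug.LogError("Failed to load audio clip: " + www.error);
                callback?.Invoke(null); // signal failure to the caller
            }
        }
    }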
Call this coroutine after saving the WAV file and pass the file path. Once the AudioClip is loaded, use the callback function to assign it to the AudioSource.
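As an illustration of that call pattern, the sketch below puts the loader into a small component and assigns the clip to an AudioSource from the callback. It assumes an AudioSource on the same GameObject; `TtsClipPlayer` and `PlaySavedFile` are hypothetical names, and the loader is repeated only so the component compiles on its own.

```csharp
using System;
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

[RequireComponent(typeof(AudioSource))]
public class TtsClipPlayer : MonoBehaviour
{
    AudioSource audioSource;

    void Awake()
    {
        audioSource = GetComponent<AudioSource>();
    }

    // Call this once the WAV file has been fully written to disk.
    public void PlaySavedFile(string absolutePath)
    {
        StartCoroutine(LoadAudioClip(absolutePath, clip =>
        {
            if (clip == null) return; // loading failed; the loader already logged the error
            audioSource.clip = clip;
            audioSource.Play();
        }));
    }

    // Same loader as in the answer above.
    IEnumerator LoadAudioClip(string filePath, Action<AudioClip> callback)
    {
        using (var www = UnityWebRequestMultimedia.GetAudioClip("file://" + filePath, AudioType.WAV))
        {
            yield return www.SendWebRequest();

            if (www.result == UnityWebRequest.Result.Success)
            {
                callback?.Invoke(DownloadHandlerAudioClip.GetContent(www));
            }
            else
            {
                Debug.LogError("Failed to load audio clip: " + www.error);
                callback?.Invoke(null);
            }
        }
    }
}
```

StartCoroutine has to be called on Unity's main thread, so if you trigger PlaySavedFile after awaiting the save, make sure that continuation is running on the main thread.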
By implementing these suggestions and ensuring proper file handling and coroutine execution, you should be able to reliably save the Azure Text to Speech output to a WAV file and load it as an AudioClip in Unity for playback.
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please remember to "Accept Answer" if the answer helped, so that others in the community facing similar issues can easily find the solution.
Best Regards,
Sina Salam