Language data import failed: Invalid or empty textproc out file!.

Kesavaraj V 0 Reputation points
2024-10-27T18:48:28.78+00:00

I am trying to train a custom speech to text model for Malayalam. When I try to upload the text data, I get the error below.

[Screenshot of the error message]

Can someone explain to me what the error is?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. RevelinoB 3,675 Reputation points
    2024-10-27T19:58:09.8533333+00:00

    Hi Kesavaraj,

    I've come across a similar issue before, and here’s what I’ve found usually causes it:

    The error message “Language data import failed: Invalid or empty textproc out file!” suggests that something is off with the file you're uploading. Here are a few things to check:

    Empty or Incorrect Text File: This might sound basic, but make sure the file isn’t empty. Even a minor formatting issue can cause problems, so double-check that the file actually contains the expected text data for Malayalam.

    File Encoding: For languages like Malayalam, the encoding can be crucial. If the text file isn’t saved in UTF-8, for example, it can throw errors during the import. Try re-saving it in UTF-8 and see if that helps.

    File Path or Name: Sometimes, the issue can come down to a simple naming or path problem. Avoid using special characters in the file name or path, as they might interfere with the upload.

    Platform Requirements: Custom speech has specific requirements for the plain-text data used to train Speech-to-Text models. Check the Azure AI Speech documentation on training datasets to make sure your file's format, encoding, and size match what the service expects.

    If all else fails, you could try re-uploading the file or looking into any documentation that might provide extra details. Let me know if this helps or if there’s anything specific you’re still stuck on!
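
The empty-file and encoding checks from the answer can be run locally before uploading. This is a minimal sketch, not an official tool: `check_text_dataset` and `resave_as_utf8_bom` are hypothetical helper names, and writing the file as UTF-8 with a byte-order mark is an assumption about what the service accepts, so confirm it against the current custom speech documentation.

```python
from pathlib import Path


def check_text_dataset(path):
    """Return a list of problems found in a plain-text training file."""
    problems = []
    raw = Path(path).read_bytes()

    # An empty (or whitespace-only) file is one likely cause of the import error.
    if not raw.strip():
        problems.append("file is empty")
        return problems

    # "utf-8-sig" accepts UTF-8 with or without a leading byte-order mark.
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 (byte offset {exc.start})")
        return problems

    if not any(line.strip() for line in text.splitlines()):
        problems.append("file contains only blank lines")
    return problems


def resave_as_utf8_bom(src, dst, source_encoding="utf-8"):
    """Re-save a text file as UTF-8 with a BOM (assumed target encoding)."""
    text = Path(src).read_text(encoding=source_encoding)
    text = text.lstrip("\ufeff")  # avoid writing a second BOM
    with open(dst, "w", encoding="utf-8-sig", newline="\n") as out:
        out.write(text)
```

If `check_text_dataset("malayalam.txt")` returns an empty list, the file is non-empty and decodes as UTF-8; if the file was saved in another encoding, pass that encoding as `source_encoding` to `resave_as_utf8_bom` and upload the re-saved copy. The file name here is only an example.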

