English to Chinese Document Translation Character Encoding Problem

T. Thor Leach 1 Reputation point
2021-10-29T17:06:00.71+00:00

I'm trying to use the Microsoft Document Translation API to translate from English into multiple languages. The current workflow involves uploading a txt file to a source container in my Azure Storage Account, requesting a translation of that document/blob into 4 languages (Spanish, Portuguese, French, and Chinese), downloading the translated documents/blobs, and then saving those documents to the server.
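As a rough illustration of the request step in this workflow, the body of a Document Translation batch request could be built as below. This is only a sketch: the storage account name, the container names, and the elided SAS query strings are hypothetical placeholders, and using one target container per language is an assumption, not something stated in the post.

```php
<?php
// Hypothetical SAS URLs; substitute the real source and target containers.
$sourceUrl  = 'https://example.blob.core.windows.net/source?sv=...';
$targetBase = 'https://example.blob.core.windows.net';

// Target language codes: Spanish, Portuguese, French, Simplified Chinese.
$languages = ['es', 'pt', 'fr', 'zh-Hans'];

$targets = [];
foreach ($languages as $lang) {
    $targets[] = [
        // Assumption: one target container per language, named target-<code>.
        'targetUrl' => $targetBase . '/target-' . strtolower($lang) . '?sv=...',
        'language'  => $lang,
    ];
}

// One input (the source container) fanned out to four targets.
$body = json_encode(
    ['inputs' => [['source' => ['sourceUrl' => $sourceUrl], 'targets' => $targets]]],
    JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT
);

echo $body, PHP_EOL;
```

The JSON string would then be sent as the body of the batch request, and the translated blobs downloaded from each target container once the job completes.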

The Spanish, Portuguese, and French documents all seem to work, although I have to do a conversion from Windows-1252 to UTF-8 to get rid of character encoding issues with a few of the Spanish and Portuguese characters: mb_convert_encoding($response, "Windows-1252", "UTF-8");. Unfortunately, I have been unable to find a way to convert the Chinese document into actual Chinese characters. If I write the document/blob directly to disk, I get output that looks like this: ï¼ˆå ™è¿°è€…ï¼‰å©´å„¿çš„å¤§è„‘ç”Ÿé•¿. If I run either the utf8_decode or the mb_convert_encoding function, I get something that looks like this: ????????????. If I run the utf8_encode function, the result looks like this: (堙述者)婴儿的å.
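The garbled sample above is consistent with bytes that are already valid UTF-8 being displayed as Windows-1252: its first three characters, ï¼ˆ, are what the three UTF-8 bytes of a full-width opening parenthesis look like under that code page. Note also that the argument order of mb_convert_encoding is (string, to_encoding, from_encoding), so the call quoted above converts from UTF-8 to Windows-1252, and Windows-1252 has no Chinese characters. The sketch below illustrates both points; the sample string and the output path are hypothetical stand-ins for the real downloaded blob, and whether the actual file is valid UTF-8 is an assumption that the first check is meant to test.

```php
<?php
// Hypothetical stand-in for the downloaded blob: the UTF-8 bytes of "（叙述者）".
$response = "\u{FF08}\u{53D9}\u{8FF0}\u{8005}\u{FF09}";

// If this is true, the bytes need no conversion before being saved.
$isUtf8 = mb_check_encoding($response, 'UTF-8');

// What a Windows-1252 viewer shows for the first character's three bytes
// (EF BC 88): each byte is misread as a separate character.
$asSeen = mb_convert_encoding("\u{FF08}", 'UTF-8', 'Windows-1252');

// Argument order is (string, to_encoding, from_encoding), so this converts
// UTF-8 *to* Windows-1252. Chinese has no Windows-1252 representation, so
// the characters are replaced by mbstring's substitute character.
$lossy = mb_convert_encoding($response, 'Windows-1252', 'UTF-8');

// Save the original bytes unchanged (hypothetical output path).
$path = sys_get_temp_dir() . '/translated.zh-Hans.txt';
file_put_contents($path, $response);

var_dump($isUtf8);
echo $asSeen, PHP_EOL, $lossy, PHP_EOL;
```

If the check passes for the real blob, writing $response to disk as-is and opening the file in an editor or browser set to UTF-8 should show the Chinese text, and the same may hold for the Spanish, Portuguese, and French files.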

Can anyone tell me what I'm doing wrong here? I'm able to use the Microsoft Text Translator to translate English to Chinese without a problem, but I just can't seem to get the Document Translation API to work...

Azure Translator
An Azure service to easily conduct machine translation with a simple REST API call.

1 answer

  1. YutongTie-MSFT 53,136 Reputation points
    2021-11-02T08:19:03.963+00:00

    @T. Thor Leach

    Sorry, we cannot reproduce this issue without your sample document. I would highly recommend raising a support ticket so that a support engineer can investigate it in more depth. Please let us know if you do not have a support plan; we can help you enable a free support ticket. Thanks.

    Regards,
    Yutong
